2.02-08 Never use SUM and LOOKUP together

SUM and LOOKUP used in the same expression generally cause large formula calculations and may cause intermediate relationship calculations especially if Time is a dimension or when the source and target structures are very different.  Splitting them into different modules and line items considerably reduces the size of the intermediate calculations

 

Exception:

2.02-08a The LOOKUP is a constant: Where the LOOKUP line item is a constant, the formula will operate as a SELECT, so the performance implications are negated

 

Best Practice article:
Formulas and their effect on model performance

Tagged:

Comments

  • Misbah
    edited January 16

    Rule 2.02-08 Never use SUM and LOOKUP togetherSUM and LOOKUP used in the same expression generally cause large formula calculations and may cause intermediate relationship calculations especially if Time is a dimension or when the source and target structures are very different.  Splitting them into different modules and line items considerably reduces the size of the intermediate calculations. It goes against PERFORMANCE element of PLANS if you wish to go against this rule.

    Note: This rule is also applicable to other Compound formulae like SELECT & LOOKUP and SUM & SELECT. Since most of the good model builders don’t use SELECT that often we will focus this rule only on SUM & LOOKUP

    Use Case: Let’s say we have a Source module with three dimensions i.e., Customer, Account and Time and we want to do the top down to pull the data into Target module with again three dimensions i.e., Customer, GL Accounts and Time

    Here is how it was done in Pre Planual Era:

    If we look at the dimensions of the modules we know that two of the dimensions don’t match whereas only one dimension match between the two modules

    Source Module: Here Time is in Quarters, hence only Customers is Green or matching, rest two are Red or Not matching

    Target Module: Here Time is in Months

    In order to pull the data from Source Module to Target Module we need to make sure that these two modules talk to each other either by LOOKUPs or Aggregation functions. Because of which there are other SYS modules that are in place which map the differentiating dimensions

    SYS GL Accounts Module: This module shares the link between Accounts list and GL Accounts list

    SYS Time Module: This module holds all the Time formats of the timescale

    Now that we have all the pre-requisites in place, let’s understand the dimensionality

    Source Module Dimensions

    Target Module Dimensions

    Comments

    Customer

    Customer

    Matching

    Time Quarters

    Time Months

    Not Matching but Mapping Available in SYS Time Mod

    Account

    GL Accounts

    Not Matching but Mapping Available in SYS GL Account Mod

     Target Module and Formula:

    What is wrong with this method? SUM & LOOUP being used in the same expression has shown performance degradations and should NEVER be used together. Not only is this the rule but it is one of ten guidelines (Zen) of the Planual as well.

    Here is how it should be done in Planual Way:

    Break up the formula into two line items – Use SUM in the first line in one module and LOOKUP in another line item in a different module

    Create Intermediate Target Module: Use “SUM”

    Create Final Target Module: Use “LOOKUP”

  • Hi, Hope you are doing well. I have a query regarding Usage of SUM & Lookup in same formula. I understand why we should not use SUM & LOOKUP functions in same expression of a formula (eg. X[Lookup: A, SUM:B]).

    But is it the case with using these 2 functions in different expression of same formula too? Eg. X[Lookup: A] + Y[SUM: B]

    Can you please confirm your observation on the same that how it impacts the performance? and should we avoid this kind of formula too?

  • @APKUMARI

    yes, you can have them split like that, the only thing to keep in mind is every time "A" changes (X[Lookup:A]) changes or "B" (Y[sum:B]) changes, the entire logic will get kicked off. Due to this, we recommend splitting them into different line items.

  • I would like to emphasize that this best practice is especially important in the new sparse engine. Combining [SUM & LOOKUP] or [SUM & SELECT] in the Polaris engine must be avoided. Why? In Polaris, we are less constrained by workspace » model size » cell count. The number of cells in sources and targets can and does go into billions, trillions, and higher. Each cell with a formula requires some computation effort. Polaris becomes constrained by calculation effort.

    Use the simple approach described above. "SUMs first, single LOOKUP to finish" and you'll have a much better model.

  • I have a scenario where I am extremelly constrained by cell count and like to avoid created more line items.

    My calculation is concerned with summing a value of a small number of child list items in a source line item so as to calculate the aggregate them so I an allocate the total over the remain child list items. Therefore, the sum and lookup will be using the same mapping.

    Can I get away with combining the SUM and LOOKUP under these circumstances?

  • @ChrisAHeathcote

    You can split it up, but it'll likely take longer to calculate than it would if it were split out.