SUM and LOOKUP | Model Optimization Team series
Hi! Welcome to the Model Optimization team's monthly blog post. This is an eight-part series, following the topics we see occurring most frequently in slow customer models. Our aim is to inform and guide how to build with best practice intentions for these model building topics.
Each month, two of our analysts will take you through how to spot and improve these formula structures! We hope these can help with your model performance.
All the best - Cianna Bramley (Model Optimization Manager)
……
This month's blog is brought to you by Kenneth Privaldos and Lip Chean Soh.
Last Month, we discussed the importance of Calculating Once and Reference and avoiding repeated calculations in a model. This month, we’re going to discuss how the combination of two functions can affect the performance of your model, the use of SUM and LOOKUP together.
In P.L.A.N.S., the standard of Anaplan modeling, this falls under Performance, or the use of correct structures and formulas to optimize the Hyperblock. Furthermore, it is specifically mentioned in the Zen of Anaplan —
“Sums and lookups are good, but never together.”
But why?
Why is this bad?
Referencing Planual 2.02-08, the usage of SUM and LOOKUP together can create a large number of formula calculations which can include slow intermediate ‘relationship’ calculations; as the engine creates virtual mapping modules to calculate these functions.
For a [SUM: X, LOOKUP: Y] format, the Anaplan engine would perform the SUM first before doing the LOOKUP. The format seems simple but here’s the caveat, the SUM is repeated for every LOOKUP. This results in poor performance and inefficient calculations.
Now that we know the “Why?” The next question is “How?”. How do we find this in our model?
Looking for SUM and LOOKUP together
There are 3 methods to find these in your model, and these are:
- Manual inspection
- MAPs
- Model optimization
For manual inspection, you can search the blueprint of the module and look for a line item combining the SUM and LOOKUP. Alternatively, you can download the line items and search for SUM and LOOKUP in your csv file.
You can also request MAPs (Model Analysis Pro-Active Solution), from your Business Partner, which can provide Planual infractions — however this is currently not available for all models (please ask your Business Partner to check).
Lastly, if you request a Model Optimization (Model Performance Analysis) with our team, we check Planual infractions such as SUM - LOOKUP as part of our analysis.
How do we fix these instances in models?
We’ll cover three examples of SUM and LOOKUP together and provide an optimized version of each line item. Generally, to fix the issue, we recommend splitting the SUM and LOOKUP function by using an Intermediate line item. Here’s the reason why:
- By using an intermediate line item, we calculate the functions separately, which stops the engine repeating the functions; like SUM calculating for each LOOKUP.
- The function calculating in the intermediate can be calculated over different dimensionality; in some cases, this can be a smaller cell count.
In most cases, the SUM and LOOKUP are related, and so the SUM would most likely be the intermediate calculation (as this must be calculated before the LOOKUP can be applied). However, if the SUM and LOOKUP act on different dimensions, the LOOKUP could be extracted first (aim to extract the function which gives you an intermediate line item with the smallest cell count to help save space).
Here are the steps in converting a SUM-LOOKUP:
- Create an intermediate line item that will perform the SUM First. This line item should not need aggregation (e.g. ‘Sum’ summary method).
- In the target line item, replace the source with the intermediate line item.
Example 1: SUM and LOOKUP with different mappings and dimensions
Below is an example of a line item that contains SUM and LOOKUP.
Formula Breakdown
Example 1 = 'DAT - Product Quantities'.Qty[SUM: 'SYS P1 Products'.Country, LOOKUP: 'SYS G1 Countries'.Standard Sizing]
Taking a closer look at the formula:
Target Line item = Source [SUM: Mapping 1, LOOKUP: Mapping 2]
To create the intermediate line item, we need to determine the correct dimensions for the SUM. This table contains the dimensions of the line item that we’re going to work with.
Line Items | Format | Applies to: |
---|---|---|
Source (DAT – Product Quantities.Qty) [MOU1] | Number | P1 Products, Products Sizes, Month(Time) |
SUM: Mapping 1 (SYS P1 Products.Country) | List: G1 Countries | P1 Products |
LOOKUP: Mapping 2 (SYS G1 Countries.Standard Sizing) | List: Product Sizes | G1 Countries |
Intermediate Line Item [MOU2] | Number | G1 Countries, Products Sizes, Month(Time) |
Table: Dimensions and format of Intermediate, Source, and Mapping line items in Example 1.
1.1 Create an intermediate module for calculating the SUM
When using the SUM function:
- Add the list used in formatting of the Mapping line item to the dimensions of the target line item.
- Remove the dimension of the mapping line item in the target line item dimensions.
Intermediate Line Item Dimension (SUM) = Source Dimensions – Mapping Dimensions + List Used in formatting (Mapping)
Intermediate Line Item Dimension = P1 Products, Products Sizes, Month(Time) - P1 Products + G1 Countries
Since we know the dimensions for the SUM function, we can now create another module that will contain these dimensions.
Inside the module, create an intermediate line item that will calculate the SUM function.
Intermediate SUM = 'DAT - Product Quantites'.Qty[SUM: 'SYS P1 Products'.Country]
1.2 Reference the intermediate line item in the target module
We can reference the intermediate line item. With this, we optimize the original formula by splitting the SUM and LOOKUP functions.
Example 1 = ‘Intermediate Module 1’.Intermediate SUM [LOOKUP: 'SYS G1 Countries'.Standard Sizing]
Example 2: SUM and LOOKUP with more than 2 mappings
For this example, we’re going to optimize a line item that contains two SUM functions and two LOOKUP functions. It’s similar to our previous example and our goal is to split the SUM and LOOKUP by using an intermediate line item.
Formula Breakdown
Example 2 = DAT Product Cost.Shipping Cost[SUM: 'SYS P1 Products'.Country, SUM: 'SYS P1 Products'.Brand, LOOKUP: 'SYS G1 Countries'.Standard Sizing, LOOKUP: 'SYS P1 Products'.Product Type]
Target Line item = Source [SUM: Mapping 1, SUM: Mapping 2, LOOKUP: Mapping 3, LOOKUP: Mapping 4]
Note: the SUMs appear before the LOOKUPs here, however this same suggestions would apply to other formats, e.g. [SUM: 1, LOOKUP: 3, SUM: 2, LOOKUP: 4] can have the 2 SUMs extracted to an intermediate line item together.
This table shows all the formats and dimensions of the line items we’re going to work with.
Line Items | Format | Applies to |
---|---|---|
Source | Number | P1 Products, Products Sizes, Product Type, Brand, Month(Time)[MOU1] |
SUM: Mapping 1 | List: G1 Countries | P1 Products |
SUM: Mapping 2 | List: Brand | P1 Products |
LOOKUP: Mapping 3 | List: Products Sizes | G1 Countries |
LOOKUP: Mapping 4 | List: Product Type | P1 Products |
Intermediate SUM | Number | G1 Countries, Products Sizes, Brand, Product Type, Month(Time). |
Table: Dimensions and format of Intermediate, Source, and Mapping line item in Example 2.
1.1 Create an intermediate module that will contain the SUM functions
To determine the dimensions, we can still apply this rule:
Intermediate Line Item Dimension (SUM) = Source Dimensions – Mapping Dimensions + List Used in formatting (Mapping)
The Dimensions for this line item are: G1 Countries, Products Sizes, Brand, Product Type, Month(Time).
Intermediate SUM = DAT Product Cost.Shipping Cost[SUM: 'SYS P1 Products'.Country, SUM: 'SYS P1 Products'.Brand]
1.2 Reference the intermediate line item in the target module and perform the LOOKUP function.
Example 2 = ‘Intermediate Module 3’. Intermediate SUM[LOOKUP: 'SYS G1 Countries'.Standard Sizing, LOOKUP: 'SYS P1 Products'.Product Type]
Example 3: SUM and LOOKUP with similar SUM mapping and source line items with the same Applies To dimensions
If we have two line items with the same Applies To dimensions using SUM and LOOKUP with similar mapping in the SUM function, we can extract the SUM and apply the rules for optimizing SUM and LOOKUP and Calculating Once and Reference.
Line Item 1:
Line Item 2:
Target Line Item 1:
Line Items | Format | Applies to: |
---|---|---|
Source DAT Product Cost.Shipping Cost | Number | P1 Products, Products Sizes, Product Type, Brand, Month(Time)[MOU1] |
SUM: Mapping 1 'SYS P1 Products'.Country | List: G1 Countries | P1 Products |
LOOKUP: Mapping 2 'SYS G1 Countries'.Standard Sizing | List: Products Sizes | G1 Countries |
Table: Dimensions and format of Source and Mapping line item in Example 3.
Target Line Item 2:
Line Items | Format | Applies to: |
---|---|---|
Source DAT Product Cost.Shipping Cost | Number | P1 Products, Products Sizes, Product Type, Brand, Month(Time) |
SUM: Mapping 1 'SYS P1 Products'.Country | List: G1 Countries | P1 Products |
LOOKUP: Mapping 2'SYS P1 Products'.Product Type | List: Product Type | P1 Products |
Table: Dimensions and format of Source and Mapping line item in Example 3.
1.1 Create an intermediate module for calculating the SUM function
Create an intermediate line item in another module for calculating the SUM function. The dimensions are: G1 Countries, Product Sizes, Brand, Product Type.
1.2 Reference the Intermediate line item
For target line item 1:
For target line item 2:
Special cases
While it’s generally recommended to avoid using SUM and LOOKUP together, there are a few exceptions for leaving it unchanged:
- LOOKUP as SELECT: When the LOOKUP function acts like a simple value selection. This happens when the LOOKUP value is a constant, or single cell. Hence, the SUM is only performed once. It can be treated like using SUM and SELECT together.
- Very large intermediate line items: Creating an intermediate line item can sometimes results in much larger cell counts due to the required dimensions for the SUM function. If the size of the intermediate line item created heavily impacts your model size, leave the formula unchanged. However it’s good to note that intermediate line items should not need ‘summary’ (aggregation), so this should help keep the cell count as low as possible.
Conclusion
Combining SUM and LOOKUP function together can cause slow models. To solve this, we recommend splitting the SUM and LOOKUP using an intermediate line item. While exceptions exist, it is generally recommended to avoid using SUM and LOOKUP together. This results in a faster calculation because the SUM is not repeated for every LOOKUP.
Questions? Leave a comment!
Comments
-
Thank You for the Insightful Information!
0 -
Thank you this is very helpful.
0 -
This is just as important in Polaris as well - in fact the performance impact can be much worse in Polaris…
0 -
I love that you mention those special cases because they actually happen quite often. And if you just follow the rule - don't combine SUM with LOOKUP and SELECT, you can get a pretty big model which Anaplan clients also don't like a lot 😉
1