Performance Comparison - Direct SUM vs Line Item Subset

luke_e · June 2023

Hi all,

Was looking to get some community feedback on two different SUM approaches and which one would be more efficient from a model engine point of view.

The scenario,

I'm looking to SUM line items into a central P&L by account. The source module lines do not have account as a list, so we'll be using something similar to [SUM: Mapping Module.Account] to map the line items to an account.

I will stress that this is more for examples where there's potentially a dozen or more lines to SUM.

The options,

Option 1. Directly SUM line item from source module to central P&L, e.g.

Line 1[SUM: Mapping Module.Account] + Line 2[SUM: Mapping Module.Account] + Line 3[SUM: Mapping Module.Account] + Line 4[SUM: Mapping Module.Account] .. etc

Option 2. Create an LIS with the relevant lines, set up a COLLECT() module and a mapping module using the LIS, and then SUM into the central P&L using a single function, e.g.

COLLECT Source[SUM: Collect Module Mapping.Account]
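To make the two options concrete, here is a minimal Python sketch. It is only an analogy and says nothing about how the Anaplan engine is implemented; the line names, account names, and values are all made up for illustration. It shows that both options give the same totals, with Option 1 needing one SUM-style term per line item and Option 2 a single aggregation over the collected lines.

```python
from collections import defaultdict

# Hypothetical line items (one period) and the account each maps to.
# Every name and number here is an illustrative assumption.
source_lines = {"Salaries": 120.0, "Bonuses": 30.0, "Rent": 45.0, "Utilities": 5.0}
line_to_account = {
    "Salaries": "Employee Costs",
    "Bonuses": "Employee Costs",
    "Rent": "Premises",
    "Utilities": "Premises",
}


def option_1(lines, mapping):
    """One SUM-style term per line item, added together in one long formula."""
    accounts = sorted(set(mapping.values()))
    totals = {account: 0.0 for account in accounts}
    sum_terms = 0
    for line, value in lines.items():
        # Each term is evaluated for every account in the target.
        term = {a: (value if mapping[line] == a else 0.0) for a in accounts}
        for a in accounts:
            totals[a] += term[a]
        sum_terms += 1
    return totals, sum_terms


def option_2(lines, mapping):
    """COLLECT-style: turn the lines into list members, then aggregate once."""
    collected = list(lines.items())
    totals = defaultdict(float)
    for line, value in collected:
        totals[mapping[line]] += value
    return dict(totals), 1


totals_1, terms_1 = option_1(source_lines, line_to_account)
totals_2, terms_2 = option_2(source_lines, line_to_account)
print(totals_1 == totals_2)   # both options agree on the result
print(terms_1, terms_2)       # number of SUM terms in each formula
```

Adding a fifth line in this sketch means another term in `option_1`, but only another entry in the mapping for `option_2`.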

The outcome,

Option 1 is the simplest approach, as the SUM is direct from source to target, but it will result in potentially 10+ SUM parameters in the logic because there are so many line items to include.

Option 2 is a bit more complicated to set up initially, requiring an LIS and two additional modules (one to use COLLECT() and one to map each line to an account), but results in a much cleaner calculation once it's included in the central P&L (as well as arguably better auditability and flexibility).

Whilst we can debate the different pillars of PLANS, I'm looking more to the P for this particular query. Looking forward to hearing your thoughts on the different approaches (or alternatives if appropriate).

Cheers,

L

rob_marshall · June 2023

@luke_e

Thank you for this post and question…Short answer is absolutely Option 2, as HyperBlock does less work and runs faster: there is only "one" SUM formula getting the data from the COLLECT() statement, whereas Option 1 has several SUM formulas.

Funny thing, I just went over this exact scenario in the ACE training in San Diego a couple of weeks ago in a before and after scenario.

in "your" Option 1, you would have something similar to this

where the vast majority of "detailed" line items have the same formula:

The much better way is your Option 2 where the same line items are now:

Remember, in the Collect() and Calc modules, keep your sums OFF so there will be fewer calculations and aggregations. The reporting module should have SUMs turned on for the aggregations.

The Calc module is defined as the following with the LISS in the rows.

Hope this helps,

Rob

BenjaminNiel · June 2023

Hi @luke_e,

I think you also need a LIS with the 1st option to build the mapping module, right? But, just that you would replace the COLLECT() by some SUM functions.

I would say that the 2nd option is more powerful (avoiding long, complex formulae is almost always better), more sustainable (the performance of option 1 would get even worse if you added line items to the source module) and indeed easier to audit. Have a look at this article on LIS; the "Transformation" section covers a case similar to the one you are facing.

OEG Best Practice: Line item subsets demystified

AnaplanOEG

Nov 13, 2019

Line item subsets are one of the most powerful and efficient features in Anaplan, yet one of the least understood. The COLLECT() function is probably the only “black box” function within Anaplan, as it is not immediately apparent what it is doing or where the source values are coming from. In the following article, I will cover how to understand line item subsets more easily, and also explain their many uses, some of which do not need COLLECT() at all.

For more information on creating line item subsets see line item subsets in Anapedia.

A line-item subset is a list of items drawn from one or more line items from one or more modules. Put simply, it converts line items into a list on which calculations can be performed. There are some restrictions:

Line item subsets can only contain numeric formatted line items.

Only one line item subset can be used as a dimension in a module.

Although line items can contain formulas, the items in a line item subset can only aggregate to a simple subtotal.

Styles on the line items are not transferred over to the line item subset.

Line item subsets can be used for many different areas of functionality. For the examples used, I have based them on the final model from the new Level 1 training. Download the model and follow the instructions to practice on the same structures.

These examples are deliberately simplified, but I hope you find these insightful and easy to transfer into your models to simplify the formulae and provide more flexibility to your users.

Table of Contents:

Calculations on calculations
Transformation
Multiple source modules
Filters
Dynamic cell access
Line items subsets with line item subsets
Version formula
Final thoughts
Additional resources
Calculations on calculations

This is the classic use of line item subsets. A source module contains line items, and subsequently, you need to perform additional calculations on these line items. While in some cases this can be managed through complex formulae, normally these workarounds break most of the best practice guidelines and should be avoided.

Use case example:

The source module contains forecast data with line items for the profit and loss lines in U.S. dollars. We need to convert these values into local currency based on the Country dimension.

The source modules are as follows:

The first step is to create the line item subset, and for this report, we only want summary values.

In the settings tab, choose Line Item Subsets and click insert.

We recommend prefixing the name with LIS:, followed by the name of the module and a simple description.

Clicking on the Line Item Subset header item (in settings) will display the Line Item Subsets screen.

Click on the newly created line item subset and the … and select the module(s) required; in this case, it is REP03.

Select which line items you wish to include in the line item subset.

Now that the line item subset has been created, it is available to be used in a module.

Create a module with the following dimensions:
LIS: REP03 Currency

G2 Country

Time (Years)

Add the following line items:
Base Currency

Exchange Rate

Local Currency

In the Base Currency line item, enter the formula: COLLECT()

Note that the values are the same as those in REP03 and the line items are now shown in list format (no formatting). Also note that these values are from the Forecast version, as the target module does not have versions, so the Current Version is used as the source automatically.

Add the following formulae to the remaining line items to complete the calculation.
Exchange Rate = 'DATA02 Exchange Rates'.Rate[LOOKUP: 'SYS03 Country Details'.Currency Code]

Local Currency = Base Currency / Exchange Rate

Note that the Exchange Rate line item should be set as a Subsidiary view (excluding the line item subset from the applies to) because we are showing it on the report for clarity. If this display was not required, the calculation could be combined with the Local Currency formula.
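The three line items above can be mimicked in a few lines of Python. This is a rough sketch under stated assumptions, not Anaplan syntax: the countries, currency codes, rates, and amounts are invented, the Time dimension is left out, and rates are assumed to be quoted as base currency per unit of local currency so that dividing makes sense.

```python
# Hypothetical stand-in for what COLLECT() exposes in Base Currency:
# (line item subset member, country) -> value in base currency.
base_currency = {
    ("Revenue", "UK"): 1000.0,
    ("Costs", "UK"): 400.0,
    ("Revenue", "Canada"): 2000.0,
    ("Costs", "Canada"): 800.0,
}

# Illustrative equivalents of 'SYS03 Country Details'.Currency Code
# and 'DATA02 Exchange Rates'.Rate (values are made up).
currency_code = {"UK": "GBP", "Canada": "CAD"}
rate_by_currency = {"GBP": 1.25, "CAD": 0.75}

# Exchange Rate varies by country only, like the subsidiary view that
# excludes the line item subset from its Applies To.
exchange_rate = {
    country: rate_by_currency[code] for country, code in currency_code.items()
}

# Local Currency = Base Currency / Exchange Rate, for every subset member.
local_currency = {
    (item, country): value / exchange_rate[country]
    for (item, country), value in base_currency.items()
}

for key, value in sorted(local_currency.items()):
    print(key, round(value, 2))
```

Because the subset members are keys in a dictionary rather than separate variables, one expression converts every line, which is the point of running calculations on a line item subset.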

Transformation

You can also use a line item subset to help with the transformation between source and target modules.

Use case example:

We want to summarize costs (from the reporting P&L) into Central and Locally controlled costs.

Create a list (Controllable Costs) containing two members.
Central

Local

Create a line item subset (as before) using just REP03 as the source module.

Create a staging module with the following dimensions:
LIS: REP03 Cost Reporting

G2 Country

Time (Years)

Add a line item (Data) and enter COLLECT() as the formula.

Set the Summary method to None; we do not need subtotals in this module.

Create a mapping module, dimensioned by LIS: REP03 Cost Reporting.
Add a line item (Mapping) formatted as the Controllable Costs list.

Map the lines as applicable.

Create a reporting module with the following dimensions.
Controllable Costs

G2 Country

Time (Years)

Add a line item called Costs.

Add the formula: 'REP07 Cost Reporting Staging'.Data[SUM: 'SYS14 Cost Mapping'.Mapping]

We use the SUM formula because the source dimension and the mapping dimension are the same. So, “If the source is the same, it’s a SUM.”

Multiple source modules

Line item subsets can contain line items from multiple modules. There is a caveat though; all modules must share at least one common dimension/hierarchy and/or have a Top Level set for non-matching dimensions.

Use case example:

Based on user-entered settings, we want to compare the values from two time periods for metrics from three different modules and calculate the absolute and % variances.

The source modules all share a common dimension:

REV03 Margin Calculation: G2 Countries, P2 Products, Month

EMP03 Employee Expenses by Country: G2 Countries, Month

OTH01 Non-Employee Expenses: G3 Location, E1 Departments, Month

Note: G3 Location has G2 Country as a parent

The module for the user parameters is:

And the metrics required are:

Margin

Salary

Bonus

Rent

Utilities

We could solve this problem without using a line item subset:

Create a list (Reporting Metrics) containing the list items above.

Create a module with the following dimensions.
Reporting Metrics

G2 Country

Users

The formula for Month 1 is:

IF ITEM(Reporting Metrics) = Reporting Metrics.Margin THEN 'REV03 Margin Calculation'.Margin[LOOKUP: 'SYS11 Time Variance Reporting'.'Month 1'] ELSE IF ITEM(Reporting Metrics) = Reporting Metrics.Salary THEN 'EMP03 Employee Expenses by Country'.Salary[LOOKUP: 'SYS11 Time Variance Reporting'.'Month 1'] ELSE IF ITEM(Reporting Metrics) = Reporting Metrics.Bonus THEN 'EMP03 Employee Expenses by Country'.Bonus[LOOKUP: 'SYS11 Time Variance Reporting'.'Month 1'] ELSE IF ITEM(Reporting Metrics) = Reporting Metrics.Rent THEN 'OTH01 Non Employee Expenses'.Rent[LOOKUP: 'SYS11 Time Variance Reporting'.'Month 1'] ELSE IF ITEM(Reporting Metrics) = Reporting Metrics.Utilities THEN 'OTH01 Non Employee Expenses'.Utilities[LOOKUP: 'SYS11 Time Variance Reporting'.'Month 1'] ELSE 0

I won’t repeat the formula for Month 2, as it is effectively the same, just referencing the Month 2 line item in the source.

You can see that, even for a small set of metrics, this is a large, complex formula that goes against best practices. So, let’s not do that.

Create the line item subset as before.

For multi-module line item subsets, it is best practice to use Multi> to represent the various modules.

Open the line item subset and choose the three modules.

Create a staging module (this is best practice following the DISCO principle), with the following dimensions.
LIS: Multi>Variance Reporting

G2 Country

Time (Months)

Add a line item (Data) and enter COLLECT() as the formula.

Set the Summary method to None; we do not need subtotals in this module.

Create a mapping module, dimensioned by Reporting Metrics.
Add a line item formatted LIS: Multi>Variance Reporting.

Map the lines accordingly.

In the reporting module from above, change the Month 1 and Month 2 line item formulae to:
'REP05 Variance Report Staging'.Data[LOOKUP: 'SYS11 Time Variance Reporting'.'Month 1', LOOKUP: 'SYS12a Reporting Metrics Mapping'.Mapping]

'REP05 Variance Report Staging'.Data[LOOKUP: 'SYS11 Time Variance Reporting'.'Month 2', LOOKUP: 'SYS12a Reporting Metrics Mapping'.Mapping]

Note, this time we are using LOOKUP rather than SUM because the source dimension doesn’t match the dimension of the mapping module.
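The SUM versus LOOKUP distinction comes down to which dimension the mapping is keyed by. The sketch below illustrates this in Python with invented names and values; it is an analogy for the two behaviours, not a description of the engine.

```python
# Hypothetical staging data, keyed by line item subset member.
staging = {"Salary": 100.0, "Bonus": 20.0, "Rent": 40.0}

# SUM case: the mapping is keyed by the SOURCE dimension (the subset
# members) and each entry points at a target list item.
cost_mapping = {"Salary": "Central", "Bonus": "Central", "Rent": "Local"}


def sum_by(source, mapping):
    """Aggregate source values into whichever target each one maps to."""
    result = {}
    for item, value in source.items():
        target = mapping[item]
        result[target] = result.get(target, 0.0) + value
    return result


# LOOKUP case: the mapping is keyed by the TARGET dimension (a separate
# metrics list) and each entry points back at one source member.
metric_mapping = {"Metric A": "Salary", "Metric B": "Rent"}


def lookup_by(source, mapping):
    """Fetch exactly one source value for each target item."""
    return {target: source[item] for target, item in mapping.items()}


print(sum_by(staging, cost_mapping))
print(lookup_by(staging, metric_mapping))
```

In `sum_by` several source members can land on the same target, so values are added; in `lookup_by` each target picks a single source member, so nothing is aggregated.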

I think you’ll agree that the formula is much easier to read and it is more efficient.

However, we can do even better; but note that there now are two ‘lookups’ in the formula. The more “transformations” there are in the formulae, the more work the engine needs to do. We can remove one of these by changing the target module dimensionality.

Copy the reporting module from above.
Remove the formulae for Month 1 and Month 2.

Replace Reporting Metrics with LIS: Multi>Variance Reporting as the dimension (applies to).

Add the following formulae for Month 1 and Month 2 respectively.
Month 1 = 'REP05 Variance Report Staging'.Data[LOOKUP: 'SYS11 Time Variance Reporting'.'Month 1']

Month 2 = 'REP05 Variance Report Staging'.Data[LOOKUP: 'SYS11 Time Variance Reporting'.'Month 2']

Note, only one lookup is needed in the formula.

Filters

Another use case that line item subsets can be used for is filtering. And this functionality has nothing to do with staging data or mapping modules. It is possible to filter line items and these can also be filtered based on other dimensions too.

Use case example:

Based on user-entered settings, for the reporting module (REP03) we want to show different line items for each year and version.

We already have set up the Years to Versions filter module

We now want to set up the user-driven parameters. To ensure that the users’ settings do not affect each other, we need to use the system generated Users’ list.

Create a line item subset based on REP03

Select all line items

Create a new module with the following dimensions:
LIS: REP03 Filters

Users

Versions

Add a single line item (Show?) formatted as a Boolean

Enter values as you wish

Note that Employee expenses and Other Costs are not available to check. This is because, in REP03, they are a simple aggregation and are shown as Parents of the other line items.

So, how do we resolve this? You can “trick” the model by turning these settings off.

The subtotals are now available to check in the filter module.

It is worth noting, be careful when doing this. If you are using the line item subsets as a dimension in a data entry module, the totals will not calculate correctly. See Final Thoughts for more details.

To set up the filter

In REP03, set the following filters

The module will now filter line items and years when the version page selector is changed.

Note the subtotals work correctly in this module because it is not data entry.

Dynamic cell access

Line item subsets can be used in conjunction with Dynamic Cell Access to provide very fine-grained control over data; again, without any mapping modules or COLLECT() statements

Use case example:

In the following module

The following rules apply:

Bonus % is set by the central team so it needs to be read only.

All metrics for Exec are not allowed to be edited.

Car Allowances are not applicable for Production.

Phone Allowances are not applicable for Production, Finance or HR, and the allowances for Sales should be read only.

To set up the access:

Create a line item subset based on EMP01

Select all line items

Create an Access Driver module with the following dimensions:
LIS: EMP01 DCA

G2 Country

E1 Departments

Add two Boolean formatted line items
Read?

Write?

Enter the values as below

Now in EMP01 assign the Read Access and Write Access drivers to the module
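As a rough illustration of the rules above, the following Python sketch derives the two Boolean drivers for a cell and the resulting access. Several things here are assumptions: the Salary line item is hypothetical, the Country dimension is left out, the drivers are computed by a function although the article enters them by hand, and the precedence (write, then read, then no access) is stated from my understanding of Dynamic Cell Access rather than from the article.

```python
def drivers(line_item, department):
    """Return hypothetical (Read?, Write?) values for one cell."""
    not_applicable = (
        (line_item == "Car Allowance" and department == "Production")
        or (
            line_item == "Phone Allowance"
            and department in {"Production", "Finance", "HR"}
        )
    )
    if not_applicable:
        return (False, False)

    read_only = (
        line_item == "Bonus %"          # set by the central team
        or department == "Exec"         # no Exec metrics may be edited
        or (line_item == "Phone Allowance" and department == "Sales")
    )
    return (True, False) if read_only else (False, True)


def access(read, write):
    """Assumed precedence: write wins, then read, otherwise no access."""
    if write:
        return "write"
    if read:
        return "read"
    return "none"


for item in ["Salary", "Bonus %", "Car Allowance", "Phone Allowance"]:
    for dept in ["Exec", "Sales", "Production"]:
        print(item, dept, access(*drivers(item, dept)))
```

The value of dimensioning the driver module by the line item subset is visible in the function signature: access can differ per line item as well as per department.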

The module now looks like this:

Line items subsets with line item subsets

I mentioned at the outset that you can lose formatting when using a line item subset. However, in some cases, it is possible to keep formatting along with calculations

Use case example:

Using the values from REP03, we want to classify Sales and Costs and then calculate a cost % to Sales. Yes, we could do this in the module itself as a separate line item, but we also want to be able to reclassify the source line items from a dashboard using mappings rather than change the blueprint formula. We also want to maintain formatting.

For this example, I have just changed the styles to illustrate the point

Create a line item subset based on REP03.

Create a staging module with the following dimensions:
LIS: REP03 Cost%

G2 Country

Time (Years)

Add a line item called Data, enter COLLECT() as the formula, and set the Summary method to None.

Create a second line item subset based on REP10 (the target module).

Create a mapping module dimensioned by the LIS: REP03 Cost%

Create a line item formatted as LIS: REP10

Map the lines accordingly

In the target module, set the following formula for both the Sales and Costs line items (Yes, it is the same formula for both!)
'REP09 LISS vs LISS - Staging'.Data[SUM: 'SYS20 Cost% Mapping'.Mapping]

Note the formatting is preserved.

Version formula

Finally, I want to mention a piece of functionality that is not well known but very powerful: Version Formula. Used with line item subsets and versions, Version Formula extends the ‘Formula Scope’ functionality. It is possible to control formulae using Formula Scope, but the options are limited.

Use case example:

Let’s assume that we have actuals data in one module, the budget data in another and we want to enable the forecast to be writeable. The current version (in the versions setting) is set to Forecast

For this example, there is only one line item in the target module, but this functionality allows the flexibility to set different rules per version for each line item

Create a line item subset based on the above and select the line item(s).

Now in the blueprint view of the target module click Edit>Add Version Formula.

Now choose the Version to which the formula applies.

You will now see a different formula bar at the top of the blueprint view.

Enter the following formula:
'DATA01 P&L Actuals & Budget'.Revenue

Repeat the above for Budget with the following formula:
'REV03 Margin Calculation'.Revenue
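Conceptually, a Version Formula behaves like a per-version choice of formula, with versions that have no formula left open for input. The Python sketch below is only an analogy with made-up numbers, and it assumes the first formula above applies to the Actual version, which the article does not state explicitly.

```python
# Hypothetical stand-ins for the two source line items.
data01_revenue = 500.0   # 'DATA01 P&L Actuals & Budget'.Revenue
rev03_revenue = 450.0    # 'REV03 Margin Calculation'.Revenue

# Values typed in by users for versions without a formula.
user_input = {"Forecast": 480.0}

# One entry per version; None means "no formula, so the cell is writeable".
version_formula = {
    "Actual": lambda: data01_revenue,   # assumed version for the first formula
    "Budget": lambda: rev03_revenue,
    "Forecast": None,
}


def revenue(version):
    """Return the calculated value, or the user's input if no formula is set."""
    formula = version_formula[version]
    if formula is None:
        return user_input.get(version, 0.0)
    return formula()


for version in version_formula:
    print(version, revenue(version))
```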

Note that now at the top, you can see that there is a Version Formula set.

Final thoughts

We mentioned the aggregation behavior and the ‘Is Summary’ setting earlier. Let me show you how this and the construction of the formulas affect the behavior of the line item subset

We will use the following module as an example. This module is only used to set up the line item subset, so no dimensions are needed.

Note that the subtotal formulae are simple aggregations.

This means the subtotal lines:

Calculate correctly when used as a dimension in a module.

Are not available for data entry.

The following module is dimensioned by the line item subset to highlight the two points above.

If we decide we don’t want the Employee costs in the line item subset, two things happen:

The indentation changes for the detailed cost lines because they are now not part of a parent hierarchy on display.

The Costs subtotal doesn’t calculate. This is because the Costs subtotal needs the intermediate subtotals to exist within the line item subset.

To mitigate the latter point there are two remedies.

Include the subtotals and hide them – The lines are still calculating and taking space.

If possible, adjust the formula structure.
Remove the subtotals formula.

Add in the Costs formula so that it uses the detailed items, with no intermediate totals.

Re-add the subtotal formulas.

Note the 'Parent' and 'Is Summary' settings, the Costs subtotal now calculates correctly.

If we change the formulae to be something other than simple addition, you will see that calculation is fine in the source module,

but not in the line item subset module.

Why is this?

Remember the 'Is Summary' setting we changed in the Filters section? When we adjusted the formula, 'Is Summary' became unchecked.

This means that the line item subset doesn’t treat the line as a calculation, hence the data entry 0 is shown instead.

If your costs need to be positive (as in this example), it is possible to calculate correctly using a ratio formula. This works for normal line items/lists as well as line item subsets. See Changing the sign for Aggregation for more details

Additional resources

https://www.youtube.com/watch?v=1Bo4f-ccS14

Author David Smith.

Hope this helps!

luke_e · June 2023

@BenjaminNiel Thanks for the response, I'll have a squiz at the link. I can see a lot of use cases for an LIS but a lot of our models were built out prior to LIS functionality being introduced, so it's just a case of sussing out the best ones.

In my option 1, we didn't need an LIS as we had a mapping module with each account as a line item (with a format of list:account), and we'd just do a SUM using the relevant lines.

@rob_marshall Your example was pretty much right on with what I'm dealing with, albeit mine is much worse as far as the number of [SUM: X] parameters in each line, hah.

I guess the other benefit with option 2 is that there's no calculation change required if additional line items became in scope for the central P&L; we'd just need to tick a box, deploy and map the account.

Appreciate the detailed response and glad to know the COLLECT() avenue is the better approach.

Cheers,

L

CalebLee8 · March 2024

@rob_marshall I have a similar situation, but I'm hitting a roadblock & wanted to get your thoughts.

I have a raw data module with 30k deals where I need to sum various values (Up-for-renewal (UFR), QTD Closed, Forecast $, etc) into a module with 3 dimensions (there would be 3 SUM parameters for each value being summed). So far, I have done the following:

Added the value LIs in the raw data module to a LISS
Created a COLLECT module to pull the values into the LISS
Created a CALC module with the 3 dimensions & the LISS applied to sum from the collect module.

However, I then need to get those values from the LISS back into Line Item format, so I can apply additional calculations on some of the values, as well as use them as data inserted into text cards (unable to select different LISS context for each data point on a card).

Currently, I'm just duplicating the module in my step 3 above but with Line Items instead of the LISS, then selecting the LISS item relevant for each item. For example UFR = CALC.Amount[select: LISS.UFR]. I see in the Calc module in your solution above you're able to reference the module with the LISUB directly into Line Items; how is that working?

Any feedback would be appreciated.

Thank you,

Caleb

rob_marshall · March 2024

@CalebLee8

In your SYS module that has UFR (the list members), create a line item formatted to the LISS (LISS Mapping)…this is where you can do the lookup, if I am understanding you correctly.

CALC.Amount[lookup: SYS List.LISS Mapping]

CalebLee8 · March 2024

@rob_marshall Not sure I understand. Here are some screenshots for clarity.

Here is the raw data module; UFR $ and Renewal $ Forecast are 2 of the 10 metrics I'm needing to sum into a module w/ the first 3 LIs as dimensions.

I then add UFR $ and Renewal Forecast $ to the LISS, & add a collect formula to pull those into the LISS.

Here is the module where I sum the collect formula into those 3 dimensions.

Here is the formula I use in that "Amount" field to sum from the COLLECT LI.

'DAT01 Retention Data'.'Collect - ARR Summary Metrics'[SUM: 'DAT01 Retention Data'.Quarter, SUM: 'DAT01 Retention Data'.'L1', SUM: 'DAT01 Retention Data'.Renewal Type]

I then need to transform the LISS back into line items, so I can apply additional calculations such as Forecasted RR & % UFR Closed, as shown in this screenshot.

Do I need to create a second LISS of the LIs in the CLC03 module, so I can look up the values from the Original LISS without having to select each LISS item?

Thank you,

Caleb

rob_marshall · March 2024

@CalebLee8

I think I understand what you are saying and the answer is yes…Create another LISS for the new line items and then create a SYS LISS2 module that has a mapping line item to the first line item subset. Then you can reference the summed data from CLC03 ARR Summary Consolidated.
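A small Python sketch of the idea, under loud assumptions: the amounts are invented, only the Quarter dimension is kept (L1 and Renewal Type are omitted for brevity), and the new line item names UFR and Forecast are hypothetical. It shows how one mapping from the second LISS to the first can replace a separate SELECT per line item.

```python
# Hypothetical output of the consolidated calc module:
# (quarter, original LISS member) -> summed amount.
calc_amount = {
    ("Q1", "UFR $"): 900.0,
    ("Q1", "Renewal $ Forecast"): 750.0,
    ("Q2", "UFR $"): 600.0,
    ("Q2", "Renewal $ Forecast"): 540.0,
}

# SYS-style mapping keyed by the second LISS (the new line items),
# each entry pointing at a member of the original LISS.
liss2_to_liss1 = {"UFR": "UFR $", "Forecast": "Renewal $ Forecast"}


def as_line_items(calc, mapping):
    """Look up every new line item through the mapping in one pass."""
    quarters = sorted({quarter for quarter, _ in calc})
    return {
        (quarter, new_item): calc[(quarter, source_item)]
        for quarter in quarters
        for new_item, source_item in mapping.items()
    }


line_items = as_line_items(calc_amount, liss2_to_liss1)
for key, value in sorted(line_items.items()):
    print(key, value)
```

Adding an eleventh metric then means one more mapping entry rather than one more hard-coded SELECT.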

Also, I would not have the Collect statement in the first module, have it in another called COL Retention Data as it will be cleaner for the next person.

rob_marshall · March 2024

@CalebLee8

You good?
