How to SUM using a dummy list
What is it?
This article explains how to SUM, SUMIF, and use other aggregation methods using a dummy list rather than Anaplan’s native summary functionality.
Why is it useful?
Anaplan's native summary logic is simple to use, elegant, and effective. However, sometimes it results in bloated modules that use up workspace unnecessarily. Using a dummy list, we can aggregate more efficiently, reducing the number of cells used (model size). It will have the most benefit in models with deep hierarchies, particularly where multiple deep hierarchies are used in combination. In certain situations, you will see gigabytes of savings.
How to do it?
In our example we are going to SUM from a data module, with GL Account, Country, and Year to one with just Country and Year without using native summation.
We create our dummy list with the following properties:
- Name: Dummy
- It should have one item ‘Dummy’ with the code ‘DUMMY’ (1.05-02 Always use a code - Anaplan Community)
- It needs to have a top-level item ‘All’ (we use this later on in step 5 to avoid the need for staging line items)
Create a system constant ‘Dummy’ in your constants module that references your list member (this step is not strictly necessary but allows us to follow the best practice of using a constants module rather than directly referencing lists 2.01-09 Use Lookup or Constants Modules - Anaplan Community). The dummy item has the following properties:
- Name: Dummy
- Format: Dummy
- Formula: Dummy.Dummy
- Applies to: None
- Summary behaviour: None
Add our dummy dimension to your system modules (2.01-08 Each major hierarchy should have a System ... - Anaplan Community). Referencing our constant from step 2.
Find some source data you want to sum. Here we are storing annual plan data for different Country / Account / Year combinations. We have summary behavior of ‘None’ applied to the line item.
Note: This example is on a small data set where we only see modest performance gains of 804 cells (about 6 kilobytes). You only need to apply this rule on data sets that are much larger (usually in the tens or hundreds of millions of cells and upwards) where there will be a material impact (100 million cells is around 0.8 gigabytes in a model).
|Cell count difference||Memory used per cell (numeric)||Model size decrease|
Do your calculation. In our example, we are going to get the FY plan total by country and year. First, we create our module as per normal, dimensioning it by Country & Time:
Now we do our special trick. In the applies to of our ‘Plan Total’ line item, we add the dummy list
- The 'applies to' here is just on the individual line item, not the module itself. When you are finished it should look like the screenshot below, with a different applies to for our line item than in the applies to of the module.
- Having an alternate 'applies to' to the module creates a subsidiary view, which goes against the standard guidance in the Planual (2.01-06 Avoid Using Subsidiary Views - Anaplan Community). We are doing this to: a) avoid having the dummy list in the dimension of the target module, and b) to avoid the need for technical staging line items (that add no value to the model as they will never be re-used).
Then we add our formula, where we SUM our source plan data using the dummy item added to the GL account system module:
'DAT02 - Plan Data'.Amount[SUM: 'SYS02 - GL Accounts'.Dummy]
We can see that the summary data is pulled through into our calculation module, and we see our ‘Plan total’ by Country.
- Model size: Avoiding native summation reduces model size, sometimes by gigabytes
- Model cohesiveness: Can enable the re-use of the same calculation rather than repeating it (see the SUMIF example below)
- Model readability: Summing using the dummy dimension isn’t as simple, or as easy to read as directly referencing the line item, particularly for inexperienced modellers or business users drilling down into a formula.
- SUMIFS: You can use the dummy dimension to get conditional totals rather than the whole list. This can mean you don’t have to create repeated line items for different slices of data and is particularly useful for when you can’t use SELECTs (on production lists)
- Combined hierarchies: In our example, we summed from Country / GL Account / Year to Country / Year. But we can take it a step further and get the annual totals for all countries by having a second dummy list ‘dummy 2’ and summing over both:
- Other aggregations: It doesn’t just work for summation, it can be used with other aggregation logic like MIN, MAX, ANY, FIRSTNONBLANK etc. Just be sure to apply the same aggregation logic in the target line item else you may not see what you expect!
Summing over a dummy dimension can be a simple way to achieve some significant reductions in model size. When doing so, be mindful of the potential drawbacks; only use this logic in places where you will see material gains.
Addendum on performance
Which performs better – Native Summation or the Dummy Sum? As is often the case in Anaplan, the answer is 'it depends'. For most cases, the performance difference probably won't be large enough to care about. That said, if you have a large calculation (over billions of cells), there are some competing effects to consider. I suggest that if performance is a concern, you test with both methods to assess any impact.