Calculate with the fewest cell | Model Optimization Team series
Hello! Welcome to the Model Optimization team's monthly blog post. This is an eight-part series, following the topics we see occurring most frequently in slow customer models. Our aim is to inform and guide how to build with best practice intentions for these model building topics.
Each month, two of our analysts will take you through how to spot and improve these formula structures! We hope these can help with your model performance.
All the best - Cianna Bramley (Model Optimization Manager)
……
This month's blog is brought to you by Mark Lawrence Del Mundo and Lip Chean Soh.
When developing line items and modules, a crucial question to consider is: “Which components of this line item expression truly influence the calculation?” By focusing on the necessary dimensions in your formulas, you can significantly enhance performance.
This principle is particularly relevant in the context of PLANS, the standard framework for Anaplan modeling. It specifically pertains to the P — Performance and N — Necessary subtopics. Utilizing only the necessary dimensions in your calculations not only streamlines the process but also optimizes the overall efficiency of your model.
The Problem With over-dimensioned line items
Performance impact
- Unnecessary calculations: When a line item contains dimensions that aren't needed for the core calculations, any formulas that reference these dimensions may perform redundant calculations. This not only wastes computational resources but can also slow down processing times. If the model has to evaluate more data than necessary, it can lead to longer opening times and recalculations when needed.
- Increased model size: Every additional dimension adds to the overall size of the data model. A larger model can consume more memory and require more storage, making it less efficient. This can impact not only the performance of the model itself but also the workspace allocation.
User experience challenges
- Complexity of context selectors: End-users often rely on context selectors to filter and navigate data. When a module is cluttered with unnecessary dimensions, users may face a bewildering array of options. This complexity can make it difficult for users to find the relevant context for their analyses, leading to frustration.
- Learning curve: For new users, an overly complicated model can create a steep learning curve. They may need additional training to understand which dimensions are necessary and which can be ignored. This may result in reduction of overall productivity.
Identifying over-dimensioned line items
Single necessary dimension
Here you can see on the regular view (grid view) that the result of item and name does not change with respect to the different items of Regions list (Americas, Europe, Asia Pacific).
Multiple necessary dimensions
Here you can see on the regular view (grid view) that the result of LineCode only changes in respect to items with Products list (Chocolates, Sours, Taffy, Fudge) and items in Regions list (Americas, Europe, Asia Pacific). All other items from other dimensions/list for this example Time and Users does not affect the value of the line item LineCode.
The Anaplan approach: Calculate with the fewest cell
Optimizing single necessary dimension
So, observing the first example via Blueprint view, we would see that this module has two dimensions, i.e. Products, Regions.
As mentioned earlier the values only change with respect to the items of the Products list. To optimize this module we will remove the dimension that does not affect the result of the line items. For this case we will remove Regions list in the dimension used.
Here is the optimized Blueprint view.
Here is the optimized grid view.
Additionally key lists in the model should each have their own system module with only the one list as a dimension, hence the SYS Prefix.
Optimizing multiple necessary dimension
Observing the second example via Blueprint view, we would see that this module has four dimensions — i.e. Users, Products, Regions, and Month.
For this example, the dimensions that truly affect the line items are not obvious. Hence what we can do is to list the dimensions used by the components of the line item and make sure that we only use the combined dimensions of these components.
For this case the formula is:
'SYS 002 Region'.'Name Concat1' & 'SYS 001 Products'.Name
The dimension used for 'SYS 002 Region'.'Name Concat1' is ‘Regions’.
The dimension used for 'SYS 001 Products'.Name is ‘Products’.
Thus the Dimensions used for this formula should only be Products and Regions.
To optimize this module we will need to remove the dimensions Users and Month (circled in red).
Below is the optimized Blueprint view.
And the optimized grid view.
Performance metrics comparison
Model calculation time head-to-head:
The bad dimension model's total calculation time is 3.56 ms while the good dimension model's total calculation time is 0.31 ms — an improvement of 91.29%.
Cell count head-to-head:
Not only does this improve calculation and model open times, this is one of the principles that also decreases size while improving performance. The bad dimension model's cell count is 912 cells while the good dimension model's cell count is 26 cells — a reduction of cell count by 97.15%.
Summary
Minimizing calculations by only including the necessary dimensions can lead to significant performance improvements. This best practice — calculate with the fewest cell — removes unnecessary complexity, enables faster calculations and a more responsive system, ultimately leading to better decision-making and improved outcomes.
Tips
While the ‘Calculate with the fewest cell’ infraction can happen with any formula, some short Anaplan keywords that model builders are prone to committing this infraction include:
CODE(ITEM(X))
NAME(ITEM(X))
PARENT(ITEM(X))
FINDITEM(X,”Y”)
Where X is a list, and Y is a hard-coded text string.
Questions? Leave a comment!
……
Check out the other articles in the series: