PLANS is the new standard for Anaplan modelling; “the way we model”. This will cover more than just the formulas and will include and evolve existing best practices around user experience and data hubs. The initial focus is to develop a set of rules on the structure and detailed design of Anaplan models. This set of rules will provide both a clear route to good model design for the individual Anaplanner, and common guidance on which Anaplanners and reviewers can rely when passing models amongst themselves.
In defining the standard, everything we do will consider or be based around:
Performance – Use the correct structures and formulae to optimize the Hyperblock
Logical – Build the models and formulae more logically – See D.I.S.C.O below
Auditable – Break up formulae for better understanding, performance and maintainability
Necessary – Don’t duplicate expressions, reference data once, no unnecessary calculations
Sustainable – Build with the future in mind, think about process cycles and updates
The standards will be based around three axes:
Performance - How do the structures and formulae impact the performance of the system?
Usability/Auditability - Is the user able to understand how to interact with the functionality?
Sustainability - Can the solution be easily maintained by model builders and support?
We will define techniques that balance these three areas to ensure the optimal design of Anaplan models and architecture.
As part of model and module design we recommend categorizing modules as follows:
Data – Data hubs, transactional modules, source data; reference everywhere
Inputs – Design for user entry, minimize the mix of calculations and output
System – Time management, filters, mappings etc.; reference everywhere
Calculations – Optimize for performance (turn summaries off, combine structures)
Outputs - Reporting modules, minimize data flows out
Thinking through the results of a modeling decision is a key part of ensuring good model performance—in other words, making sure the calculation engine isn’t overtaxed. This article highlights some ideas for how to lessen the load on the calculation engine.
Formulas should be simple; a formula that is nested or uses multiple combinations uses valuable processing time. Writing a long, involved formula makes the engine work hard. Seconds count when the user is staring at the screen. Simple is better. Breaking up formulas and using other options helps keep processing speeds fast.
You must keep a balance when using these techniques in your models, so the guidance is as follows:
Break up the most commonly changed formula
Break up the most complex formula
Break up any formula you can’t explain the purpose of in one sentence
Formulas with many calculated components
The structure of a formula can have a significant bearing on the amount of calculation that happens when inputs in the model are changed. Consider the following example of a calculation for the Total Profit in an application. There are five elements that make up the calculation: Product Sales, Service Sales, Cost of Goods Sold (COGS), Operating Expenditure (Op EX), and Rent and Utilities. Each element is calculated in a separate module. A reporting module pulls the results together into the Total Profit line item, which is calculated using a single formula that references all five components. What happens when one of the components of COGS changes? Since all the source components are included in the formula, when anything within any of the components changes, this formula is recalculated. If there are a significant number of component expressions, this can put a larger overhead on the calculation engine than is necessary.
There is a simple way to structure the module to lessen the demand on the calculation engine. You can separate the input lines in the reporting module by creating a line item for each of the components and adding the Total Profit formula as a separate line item. This way, changes to the source data only cause the relevant line item to recalculate.
For example, a change in the Product Sales calculation only affects the Product Sales and the Total Profit line items in the Reporting module; Services Sales, Op EX, COGS and Rent & Utilities are unchanged. Similarly, a change in COGS only affects COGS and Total Profit in the Reporting module.
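The effect of splitting the formula can be sketched with a toy cost model in Python. This is an illustration only and says nothing about how the Anaplan calculation engine is implemented: the cost figures, the COMPONENT_COST table, and both function names are assumptions invented for this example.

```python
# Toy cost model (all numbers are hypothetical, chosen only for illustration).
# Each component expression pulls data from another module and has some cost.
COMPONENT_COST = {
    "Product Sales": 10,
    "Service Sales": 10,
    "COGS": 10,
    "Op EX": 10,
    "Rent & Utilities": 10,
}
ADD_COST = 1  # assumed cost of adding one already-calculated line item


def monolithic_cost(changed: str) -> int:
    """All five expressions live in one Total Profit formula, so a change to
    any component re-evaluates every expression in that formula."""
    if changed not in COMPONENT_COST:
        raise KeyError(changed)
    return sum(COMPONENT_COST.values())


def split_cost(changed: str) -> int:
    """Each component has its own line item; only the changed one is
    re-evaluated, then Total Profit adds the five line items."""
    return COMPONENT_COST[changed] + ADD_COST * len(COMPONENT_COST)


for name in COMPONENT_COST:
    print(f"{name}: monolithic={monolithic_cost(name)}, split={split_cost(name)}")
```

Under these assumed costs, the split layout does less work per change because four of the five component expressions are left untouched. The more expensive the component expressions, the larger the gap.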
Keep the general guidelines in mind. It is not practical to have every downstream formula broken out into individual line items.
Plan to provide early exits from formulas
Conditional formulas (IF/THEN) present a challenge for the model builder in terms of what is the optimal construction for the formula, without making it overly complicated and difficult to read or understand. The basic principle is to avoid making the calculation engine do more work than necessary. Try to set up the formula to finish the calculations as soon as possible.
Always put first the condition that is most likely to occur. That way the calculation engine can quit the processing of the expression at the earliest opportunity.
Here is an example that evaluates Seasonal Marketing Promotions:
The summer promotion runs for three months and the winter promotion for two months.
There are more months when there is no promotion, so this formula is not optimal and will take longer to calculate.
This is better as the formula will exit after the first condition more frequently.
There is an even better way to do this. Following the principles from above, add another line item for no promotion.
And then the formula can become:
This is even better because the calculation for No Promo has already been calculated and Summer Promo occurs more frequently than Winter Promo.
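The benefit of ordering conditions by likelihood can be checked with a small Python sketch that counts how many conditions are evaluated over a year. It assumes a 12-month calendar with three summer promotion months, two winter promotion months, and seven months without a promotion. The function name and the way the chain is modelled are illustrative assumptions, not a description of the Anaplan engine.

```python
# Assumed calendar: 3 summer-promotion months, 2 winter-promotion months,
# and 7 months with no promotion (12 months in total).
MONTHS = ["summer"] * 3 + ["winter"] * 2 + ["none"] * 7


def condition_checks(months, tested_in_order):
    """Count condition evaluations for an IF / ELSE IF chain.

    `tested_in_order` lists the outcomes tested explicitly; the final ELSE
    branch needs no test of its own.
    """
    checks = 0
    for month in months:
        for outcome in tested_in_order:
            checks += 1
            if month == outcome:
                break  # early exit: the remaining conditions are skipped
    return checks


# Promotions tested first, "no promotion" left to the final ELSE.
promo_first = condition_checks(MONTHS, ["summer", "winter"])
# "No promotion" tested first, then summer, winter left to the final ELSE.
no_promo_first = condition_checks(MONTHS, ["none", "summer"])
print(promo_first, no_promo_first)
```

With these assumptions, testing the most frequent outcome first needs fewer condition evaluations over the year. The saving grows with the number of cells the formula applies to.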
It is not always clear which condition will occur more frequently than others, but here are a few more examples of how to optimize formulas:
FINDITEM formula
The FINDITEM element of a formula works its way through the whole list looking for the text item; if it does not find the referenced text, it returns blank. If the referenced text is blank, it also returns blank. Inserting a conditional expression at the beginning of the formula keeps the calculation engine from being overtaxed.
IF ISNOTBLANK(TEXT) THEN FINDITEM(LIST,TEXT) ELSE BLANK
IF ISBLANK(TEXT) THEN BLANK ELSE FINDITEM(LIST,TEXT)
Use the first expression if most of the referenced text contains data and the second expression if there are more blanks than data.
LAG, OFFSET, POST, etc.
In some situations there is no need to lag or offset data, for example when the lag or offset parameter is 0; the result is then the same as the value for the period in question. Adding a conditional at the beginning of the formula helps eliminate unnecessary calculations:
IF lag_parameter = 0 THEN Lineitem ELSE LAG(Lineitem, lag_parameter, 0)
IF lag_parameter <> 0 THEN LAG(Lineitem, lag_parameter, 0) ELSE Lineitem
Whether to use the first or the second formula depends on how often the lag parameter is 0.
Booleans
Avoid adding unnecessary clutter for line items formatted as Booleans. There is no need to include the TRUE or FALSE expression, as the condition itself evaluates to TRUE or FALSE. Instead of:
IF Sales > 0 THEN TRUE ELSE FALSE
use:
Sales > 0
Dimension Order affects Calculation Performance
Ensuring consistency in the order of dimensions will help improve performance of your models. This consistency is relevant for modules and individual line items. Why does the order matter? Anaplan creates and uses indexes to perform calculations. Each cell in a module where dimensions intersect is given an index number.
Here are two simple modules dimensioned by Customer and Product. In the first module, Product comes first and Customer second and in the second module, Customer is first and Product second.
In this model, there is a third module that calculates revenue as Prices * Volumes.
Anaplan assigns indexes to the intersections in the module. Here are the index values for the two modules. Note that some of the intersections are indexed the same for both modules: Customer 1 and Product 1, Customer 2 and Product 2 and Customer 3 and Product 3, and that the remainder of the cells have a different index number. Customer 1 and Product 2 is indexed with the value of 4 in the top module and the value of 2 in the bottom module.
The calculation is Revenue = Price * Volume.
To run the calculation, Anaplan performs the following operations by matching the index values from the two modules.
Since the index values are not aligned the processor scans the index values to find a match before performing the calculation.
When the dimensions in the module are reordered, these are the index values:
The index values for each of the modules are now aligned. Because line items with the same dimensional structure have an identical layout, the data is laid out linearly in memory, and the calculation process accesses memory in a completely linear and predictable way. Anaplan’s microprocessors and memory sub-systems are optimized to recognise this pattern of access and to pre-emptively fetch the required data.
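The index values described above can be reproduced with a short Python sketch. It assumes three customers and three products, 1-based indexes, and a layout in which the first dimension varies slowest. These assumptions are consistent with the index values quoted in the text, but the real internal layout is not documented here, and the function names are hypothetical.

```python
from itertools import product

CUSTOMERS = 3  # assumed list sizes, matching Customer 1-3 and Product 1-3
PRODUCTS = 3


def cell_index(position, sizes):
    """1-based cell index; the first dimension is assumed to vary slowest."""
    index = 0
    for pos, size in zip(position, sizes):
        index = index * size + (pos - 1)
    return index + 1


def product_first(customer, prod):
    """Index in the module dimensioned by Product, then Customer."""
    return cell_index((prod, customer), (PRODUCTS, CUSTOMERS))


def customer_first(customer, prod):
    """Index in the module dimensioned by Customer, then Product."""
    return cell_index((customer, prod), (CUSTOMERS, PRODUCTS))


# Intersections that happen to share the same index in both modules.
aligned = [
    (c, p)
    for c, p in product(range(1, CUSTOMERS + 1), range(1, PRODUCTS + 1))
    if product_first(c, p) == customer_first(c, p)
]
print(product_first(1, 2), customer_first(1, 2))
print(aligned)
```

Only the diagonal intersections line up when the dimension order differs. Once both modules use the same order, every intersection has the same index in both and no matching scan is needed.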
How does the dimension order become different between modules? When you build a module, Anaplan uses the order in which you drag the lists onto the Create Module dialog. The order also depends on where the lists are added: lists added to the pages area come first, then lists added to the rows area, and finally lists added to the columns area.
It is simple to re-order the lists and ensure consistency. Follow these steps:
1. On the Modules pane (Model Settings > Modules), look for lists that are out of order in the Applies To column. Click the Applies To row that you want to re-order, then click the ellipsis.
2. In the Select Lists dialog, click OK.
3. In the Confirm dialog, click OK.
The lists will be in the order that they appear in General Lists.
When you have completed checking the list order in the modules, click the Line Items tab and check the line items. Follow steps 1 through 3 to re-order the lists.
Subsets and Line Item Subsets
One word of caution about Subsets and Line Item subsets. In the example below, we have added a subset and a Line Item Subset to the module:
The Applies To is as follows:
Clicking on the ellipsis, the dimensions are re-ordered to:
The general lists are listed in order first, followed by subsets and then line item subsets. You can still re-order the dimensions by double-clicking in the Applies To column and manually copying or typing the dimensions in the correct order.
The calculation performance relates to the common lists between the source(s) and the target. The order of separate lists in one or other doesn’t have any bearing on the calculation speed.
If you have a multi-year model where the data range for different parts of the model varies (for example, history covering two years, current year forecast, and three planning years), then Time Ranges should be able to deliver significant gains in terms of model size and performance.
But, before you rush headlong into implementing Time Ranges across all of your models, let me share a few considerations to ensure you maximise the value of the feature and avoid any unwanted pitfalls.
As with all Anaplan models, there is no set naming convention, however we do advocate consistency and simplicity. As with lists and modules, short names are good. I like to describe the naming convention thus “as short as practical,” meaning you need to understand what it means, but don’t write an essay!
We recommend using the following convention:
FYyy-FYyy. For example, FY16-FY18, or FY18 for a single year
Time Ranges available are from 1981 to 2079, so the “19” or the “20” prefixes are not strictly necessary. Keeping the name as short as this has a couple of advantages:
Clear indication of the boundaries for the Time Range
It is short enough to see the name of the Time Range in the module and line items blueprint
The aggregations available for Time Ranges can differ for each Time Range and also differ from the main model calendar. If you take advantage of this and have aggregations that differ from the model calendar, you should add a suffix to the description. For example:
FY16-FY19 Q (to signify Quarter totals)
FY16-FY19 QHY (Quarter and Half Year totals)
FY16-FY19 HY (Half Year totals only)
Time Ranges are Static
Time Ranges can span from 1981 to 2079. As a result, they can exist entirely outside, within, or overlap the model calendar. This means that some additional manual maintenance is likely when the year changes. Let’s review a simple example:
Assume the model calendar is FY18 with 2 previous years and 2 future years; the model calendar spans FY16-FY20.
We have set up Time Ranges for historic data (FY16-FY17) and plan data (FY19-FY20)
We also have modules that use the model calendar to pull all of the history, forecast, and plan data together, as seen below:
At year end when we “roll over the model,” we amend the model calendar simply by amending the current year. What we have now is as follows:
You see that the history and plan Time Ranges are now out of sync with the model calendar.
How you change the history Time Range will depend on how much historic data you need or want to keep, but assuming you don’t need more than two years’ history, the Time Range should be renamed FY17-FY18 and the start period advanced to FY17 (from FY16).
Similarly, the plan Time Range should be renamed FY20-FY21 and advanced to FY20 (from FY19). FY18 is then available for the history to be populated and FY21 is available for plan data entry.
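The naming convention and the year-end rollover can be sketched as a small Python helper, for example to plan renames before making them in the model. This is a hypothetical helper that lives outside Anaplan and does not call any Anaplan API. The class name, field names, and method names are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeRange:
    """Hypothetical planning helper; not part of any Anaplan API."""

    start_fy: int      # four-digit fiscal year, e.g. 2016
    end_fy: int
    suffix: str = ""   # optional aggregation suffix such as "Q" or "HY"

    def __post_init__(self):
        # Time Ranges are available from 1981 to 2079.
        if not 1981 <= self.start_fy <= self.end_fy <= 2079:
            raise ValueError("Time Range must lie within 1981-2079")

    def name(self) -> str:
        """Build the FYyy-FYyy name, or FYyy for a single year."""
        start = f"FY{self.start_fy % 100:02d}"
        end = f"FY{self.end_fy % 100:02d}"
        base = start if self.start_fy == self.end_fy else f"{start}-{end}"
        return f"{base} {self.suffix}".strip()

    def rolled_forward(self, years: int = 1) -> "TimeRange":
        """Return the range advanced by `years`, keeping its length."""
        return TimeRange(self.start_fy + years, self.end_fy + years, self.suffix)


history = TimeRange(2016, 2017)
plan = TimeRange(2019, 2020)
print(history.name(), "->", history.rolled_forward().name())
print(plan.name(), "->", plan.rolled_forward().name())
```

Keeping the range length fixed during the rollover mirrors the example above, where the two-year history and plan ranges each advance by one year.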
Time Ranges Pitfalls
Potential Data Loss
Time Ranges can bring massive space and calculation savings to your model(s), but be careful. In our example above, changing the Start Period of FY16-FY17 to FY17 would result in the data for FY16 being deleted for all line items using FY16-FY17 as a Time Range.
Before you implement a Time Range that is shorter or lies outside the current model calendar, and especially when implementing Time Ranges for the first time, ensure that the current data stored in the model is not needed. If in doubt, do some or all of the suggestions below:
Export out the data to a file
Copy the existing data on the line item(s) to other line items that are using the model calendar
Back up the whole model
The majority of the formulae will update automatically when updating Time Ranges. However, if you have any hard-coded SELECT statements referencing years or months within the Time Range, you will have to amend or remove the formula before amending the Time Range. Hard-coded SELECT statements go against best practice for exactly this reason; they cause additional maintenance. We recommend replacing the SELECT with a LOOKUP formula from a Time Settings module.
There are other examples where the formulae may need to be removed/amended before the Time Range can be adjusted. See the Anapedia documentation for more details.
When to use the Model Calendar
This is a good question and one that we at Anaplan pondered during the development of the feature: do Time Ranges make the model calendar redundant? Well, I think the answer is “no,” but as with so many constructs in Anaplan, the answer probably is “it depends!” For me, a big advantage of using the model calendar is that it is dynamic for the current year and the +/- years on either side. Change the current year and the model updates automatically along with any filters and calculations you have set up to reference current year periods, historic periods, future periods, etc.
(You are using a central time settings module, aren’t you??)
Time ranges don’t have that dynamism, so any changes to the year will need to be made for each Time Range. So, our advice before implementing Time Ranges for the first time is to review each Module and:
Assess the scope of the calculations
Think about the reduction Time Ranges will give in terms of space and calculation savings, but compare that with the annual maintenance. For example:
If you have a two-year model with one history year (FY17) and the current year (FY18), you could set up a one-year Time Range for FY17 and another one-year Time Range for FY18 and use these for the respective data sets. However, this would mean both Time Ranges would need to be updated each year.
We advocate building models logically, so it is likely that you will have groups of modules where Time Ranges will fall naturally. The majority of the modules should reflect the model calendar. Once Time Ranges are implemented, it may be that you can reduce the scope of the model calendar. If you have a potential Time Range that reflects either the current or future model calendar, leave the timescale as the default for those modules and line items; why make extra work?
As outlined above, we don’t advocate hard-coded time selects for the majority of time items because of the negative impact on maintenance (the exceptions being All Periods, YTD, YTG, and CurrentPeriod). When implementing Time Ranges for the first time, take the opportunity to review the line item formulae with time selects. These formulae can be replaced with lookups using a Time Settings module.
Application Lifecycle Management (ALM) Considerations
As with the majority of the Time settings, Time Ranges are treated as structural data. If you are using ALM, all of the changes must be made in the Development model and synchronised to Production. This makes it even more important to review the pitfalls noted above to ensure data is not inadvertently deleted.
Best of luck! Refer to the Anapedia documentation for more detail. Please ask if you have any further questions and let us and your fellow Anaplanners know of the impact Time Ranges have had on your model(s).
There is an easy way to see to which dashboards a module has been published. This can be particularly helpful when you are making changes to a module and need to know which dashboards the changes could impact. It can also be useful to reduce sparsity by identifying modules that might not be needed within a model. In other words, if a module is not used for any dashboards you can check to see if it’s needed for anything else and if it’s not, eliminate it.
Have you ever wondered where, within a model, a list property is in use? The Referenced By property will tell you!
Within Model Settings select the desired list and click on the Properties tab.
From here just look for the column labeled Referenced By. It displays where the list is currently in use or being referenced.
This is especially useful if you want to edit or delete a property but you don’t know if it’s being used. Please note this same feature is available for list subsets.
Have you ever wondered where, within a model, a line item or line item subset is in use? The Referenced By property will tell you!
Open the model which contains the line item.
Toggle Blueprint mode on and look for the column labeled Referenced By. It displays where the line item is currently in use or being referenced.
This article provides the steps needed to create a basic time filter module. This module can be used as a point of reference for time filters across all modules and dashboards within a given model.
The benefits of a centralized Time Filter module include:
One centralized governance of time filters.
Optimization of workspace, since the filters do not need to be re-created for each view. Instead, use the Time Filter module.
Step 1: Create a new module with two dimensions—time and line items. The example below has simple examples for Weeks Only, Months Only, Quarters Only, and Years Only.
Step 2: Line items should be Boolean formatted and the time scale should be set in accordance with the scale identified in the line item name.
The example below also includes filters with and without summary methods, providing additional views depending on the level of aggregation desired.
Once your preliminary filters are set, your module will look something like the screenshot below.
Step 3: Use the pre-set Time Filters across various modules and dashboards. Simply click on the filters icon in the tool bar, navigate to the time tab, select your Time Filter module from the module selection screen, and select the line item of your choosing. Use multiple line items at a time to filter your module or dashboard view.
NOTE: The following information is also attached as a PDF for downloading and using off-line.
The process of designing a model will help you:
Understand the customer’s problem more completely
Bring to light any incorrect assumptions you may have made, allowing for correction before building begins
Provide the big picture view for building. (If you were working on an assembly line building fenders, wouldn’t it be helpful to see what the entire car looked like?)
Understand the requirements and the customer’s technical ecosystem when designing a model
When you begin a project, gather information and requirements using a number of tools. These include:
Statement of Work (SOW): Definition of the project scope and project objectives/high level requirements
Project Manifesto: Goal of the project – big picture view of what needs to be accomplished
IT ecosystem: Which systems will provide data to the model and which systems will receive data from the model? What is the Anaplan piece of the ecosystem?
Current business process: If the current process isn’t working, it needs to be fixed before design can start.
Business logic: What key pieces of business logic will be included in the model?
Is a distributed model needed?
High user concurrency
Security requirements that call for a separate model
Regional differences that are better handled by a separate model
Is the organization using ALM, requiring split or similar models to effectively manage development, testing, deployment, and maintenance of applications? (This functionality requires a premium subscription or above.)
User stories: These have been written by the client—more specifically, by the subject matter experts (SMEs) who will be using the model.
Why do this step?
To solve a problem, you must completely understand the current situation. Performing this step provides this information and the first steps toward the solution.
Results of this step:
Understand the goal of the project
Know the organizational structure and reporting relationships (hierarchies)
Know where data is coming from and have an idea of how much data clean-up might be needed
Know whether any of the data is organized into categories (for example, product families) and what data relationships exist that need to be carried through to the model (for example, salespeople only sell certain products)
Know what lists currently exist and where they are housed
Know which systems the model will either import from or export to
Know what security measures are expected
Know what time and version settings are needed
Document the user experience
Front to back design has been identified as the preferred method for model design. This approach puts the focus on the end user experience. We want that experience to align with the process so users can easily adapt to the model. During this step focus on:
User roles. Who are the users?
Identifying the business process that will be done in Anaplan.
Reviewing and documenting the process for each role.
The main steps. If available, utilize user stories to map the process. You can document this in any way that works for you. Here is a step-by-step process you can try:
1. What are the start and end points of the process?
2. What is the result or output of the process? What does each role need to see/do in the process?
3. What are the process inputs and where do they come from?
4. What are the activities the user needs to engage in? Use verb/object pairs: approve request, enter sales amount, etc. Do not organize during this step. Use post-its to capture them.
5. Take the activities from step 4 and put them in the correct sequence.
6. Are there different roles for any of these activities? If no, continue with step 8. If yes, assign a role to each activity.
7. Transcribe the process using PowerPoint® or Lucidchart. If there are multiple roles, use swim lanes to identify the roles.
8. Check with SMEs to ensure accuracy.
Once the user process has been mapped out, do a high level design of the dashboards
Include: Information needed
What data does the user need to see?
What the user is expected to do or decisions that the user makes
Share the dashboards with the SMEs. Does the process flow align?
Why do this step?
This is probably the most important step in the model design process. It may seem as though it is too early to think about the user experience, but ultimately the information or data that the user needs to make a good business decision is what drives the entire structure of the model.
On some projects, you may be working with a project manager or a business consultant to flesh out the business process for the user. You may have user stories, or it may be that you are working on design earlier in the process and the user stories haven’t been written. In any case, identify the user roles, the business process that will be completed in Anaplan, and create a high level design of the dashboards. Verify those dashboards with the users to ensure that you have the correct starting point for the next step.
Results of this step:
List of user roles
Process steps for each user role
High level dashboard design for each user role
Use the designed dashboards to determine what output modules are necessary
Here are some questions to help you think through the definition of your output modules:
What information (and in what format) does the user need to make a decision?
If the dashboard is for reporting purposes, what information is required?
If the module is to be used to add data, what data will be added and how will it be used?
Are there modules that will serve to move data to another system? What data and in what format is necessary?
Why do this step?
These modules are necessary for supporting the dashboards or exporting to another system. This is what should guide your design—all of the inputs and drivers added to the design are added with the purpose of providing these output modules with the information needed for the dashboards or export.
Results of this step:
List of outputs and desired format needed for each dashboard
Determine what modules are needed to transform inputs to the data needed for outputs
Typically, the data at the input stage requires some transformation. This is where business rules, logic, and/or formulas come into play:
Some modules will be used to translate data from the data hub. Data is imported into the data hub without properties, and modules are used to import the properties. Reconciliation of items takes place before importing the data into the spoke model.
Others are driver modules that include business logic and rules.
Why do this step?
Your model must translate data from the input to what is needed for the output.
Results of this step:
Business rules/calculations needed
Create a model schema
You can whiteboard your schema, but at some point in your design process, your schema must be captured in an electronic format. It is one of the required pieces of documentation for the project and is also used during the Model Design Check-in, where a peer checks over your model and provides feedback.
Identify the inputs, outputs, and drivers for each functional area
Identify the lists used in each functional area
Show the data flow between the functional areas
Identify time and versions where appropriate
Why do this step?
It is required as part of The Anaplan Way process. You will build your model design skills by participating in a Model Design Check-in, which allows you to talk through the tougher parts of design with a peer.
More importantly, designing your model using a schema means that you must think through all of the information you have about the current situation, how it all ties together, and how you will get to that experience that meets the exact needs of the end user without fuss or bother.
Result of this step:
Model schema that provides the big picture view of the solution. It should include imports from other systems or flat files, the modules or functional areas that are needed to take the data from current state to what is needed to support the dashboards that were identified in Step 2. Time and versions should be noted where required. Include the lists that will be used in the functional areas/modules.
Your schema will be used to communicate your design to the customer, model builders, and others. While you do not need to include calculations and business logic in the schema, it is important that you understand the state of the data going into a module, the changes or calculations that are performed in the module and the state of the data leaving the module, so that you can effectively explain the schema to others.
For more information, check out 351 Schemas. This 10 to 15 minute course provides basic information about creating a model schema.
Verify that the schema aligns with basic design principles
When your schema is complete, give it a final check to ensure:
It is simple.
“Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius — and a lot of courage to move in the opposite direction.” ― Ernst F. Schumacher
“Design should be easy in the sense that every step should be obviously and clearly identifiable. Simplify elements to make change simple so you can manage the technical risk.” — Kent Beck
The model aligns with the manifesto.
The business process is defined and works well within the model.
A large and complex model (for example, 10B cells) can take 10 minutes to load the first time it is used after 30 minutes of inactivity.
The only way to reduce the load time, besides reducing the model size, is to identify which formulas take the most time. This requires Anaplan L3 support, but you can reduce the time yourself by applying the formula best practices listed above.
Another possible lever is list setup: text properties on a list can increase load times, and subsets on lists can disproportionately increase load times by up to 10 times. Review these two areas to see whether you can reduce the model load time, and use module line items instead.
A model will save when the amount of changes made by end users exceeds a certain threshold. This action can take several minutes and is a blocking operation. Administrators have no leverage over model save other than optimizing formulas and reducing model size.
A model will roll back in some cases of an invalid formula, or when a model builder attempts to create a process, an import, or a view whose name already exists. In some large implementations, on a complex model of 8B+ cells, the rollback takes approximately the time needed to open the model, plus up to 10 minutes' worth of accumulated changes, followed by a model save.
The recommendation is to use ALM and have a DEV model whose size does not exceed 500M cells, with production lists limited to a few dozen items, and TEST and PROD models with the full size and large lists. Since no formula editing will happen in TEST or PROD, the model will never roll back after a user action. It can roll back on the DEV model, but this will take only a few seconds if the model is small.
Details of known issues
PERFORMANCE ISSUES WITH LONG NESTED FORMULA
A long formula on time is needed as a result of nested intermediate calculations.
If model size does not prevent you from adding extra line items, it is better practice to create multiple intermediate line items and reduce the size of the formula, as opposed to nesting all intermediate calculations into one gigantic formula.
This applies to summary formulae (SUM, LOOKUP, SELECT).
Combining SUM and LOOKUP in the same line item formula can cause performance issues in some cases. If you have noticed a drop in performance after adding a combined SUM and LOOKUP to a single line item, then split it into two line items.
RANKCUMULATE CAUSES SLOWNESS
A current issue with the RANKCUMULATE formula can make the time to open the model, including rollback times, up to 5 times slower than it should be.
There is currently no suitable workaround. Our recommendation is to stay within the constraints defined in Anapedia.
SUM/LOOKUP WITH LARGE CELL COUNT
Separate formulas into different line items to reduce calculation time (fewer cells need to recalculate parts of a formula that would only affect a subset of the data).
A known issue with SUM/LOOKUP combinations within a formula can lead to slow model open and calculation times, particularly if the line item has a large cell count.
In the following example, none of the line items apply to time or versions.
Y = X[SUM: R, LOOKUP: R]
Y Applies to [A,B]
X Applies to [A,B]
R Applies to [B] list formatted [C]
Add a new line item 'intermediate' that must have 'Applies To' set to the 'Format' of 'R'
intermediate = X[SUM: R]
Y = intermediate[LOOKUP: R]
This issue is currently being worked on by Development, and a fix will be available in a future release.
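The split above can be illustrated outside Anaplan. The following is a minimal Python sketch, not Anaplan syntax, and the sample items (a1, b1, c1 and so on) are hypothetical. It assumes X applies to [A, B] and R maps each item of B to an item of C, so the intermediate result is held per item of C (and, in this sketch, per item of A).

```python
from collections import defaultdict

# Hypothetical sample data: X applies to [A, B]; R maps each item of B to an item of C.
X = {("a1", "b1"): 10, ("a1", "b2"): 5, ("a1", "b3"): 7,
     ("a2", "b1"): 1, ("a2", "b2"): 2, ("a2", "b3"): 3}
R = {"b1": "c1", "b2": "c1", "b3": "c2"}

# intermediate = X[SUM: R]
# Each total is computed once per (A, C) pair rather than once per (A, B) cell.
intermediate = defaultdict(int)
for (a, b), value in X.items():
    intermediate[(a, R[b])] += value

# Y = intermediate[LOOKUP: R]
# Every (A, B) cell only reads the total already computed for its C item.
Y = {(a, b): intermediate[(a, R[b])] for (a, b) in X}

print(Y[("a1", "b1")], Y[("a1", "b3")])
```

In this sketch b1 and b2 share the same C item, so both read the same stored total instead of each re-summing the underlying cells.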
CALCULATIONS OVER NON-COMMON DIMENSIONS
Anaplan calculates more quickly when calculations are over common dimensions. This is best seen in an example. If you have lists W and X:
Y = A + B
Y Applies To W, X
A Applies To W
B Applies To W
This performs slower than:
Y = Intermediate
Intermediate = A + B
Intermediate Applies To W
All other dimensions are the same as above.
Similarly, A and B above can themselves be formulas, e.g. SUM/LOOKUP calculations.
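One way to picture the difference is to count how many times A + B is evaluated in each layout. The Python sketch below is only a counting model with hypothetical list items; it assumes one evaluation per target cell and does not describe how the Anaplan engine is actually implemented.

```python
# Hypothetical lists and values; A and B apply to W only.
W = ["w1", "w2", "w3"]
X = ["x1", "x2", "x3", "x4"]
A = {"w1": 1, "w2": 2, "w3": 3}
B = {"w1": 10, "w2": 20, "w3": 30}

# Layout 1: Y = A + B, with Y applying to W, X.
# The addition is evaluated once for every (W, X) cell.
additions_direct = 0
Y_direct = {}
for w in W:
    for x in X:
        Y_direct[(w, x)] = A[w] + B[w]
        additions_direct += 1

# Layout 2: Intermediate = A + B applies to W only; Y = Intermediate.
# The addition is evaluated once per W item, then the result is reused across X.
additions_split = 0
intermediate = {}
for w in W:
    intermediate[w] = A[w] + B[w]
    additions_split += 1
Y_split = {(w, x): intermediate[w] for w in W for x in X}

print(additions_direct, additions_split)
```

Both layouts produce the same Y; under this counting assumption only the number of additions differs.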
CELL HISTORY TRUNCATED
History generation currently has a time limit of 60 seconds. It is split into 3 stages, with one third of the time allocated to each.
The first stage is to build a list of columns required for the grid - this involves reading all the history - if this takes more than 20 seconds then the user receives the message "history truncated after x seconds - please modify the date range", where x is how many seconds it took. No history is generated.
If the first stage completes within 20s it goes on to generate the full list of history.
In the grid, only the first 1000 rows are displayed; the user must Export history to get the full history. This can take significant time depending on volume.
The same steps are taken for model and cell history. The cell history is generated by loading the entire model history and searching through it for the relevant cell information. When the model history gets too large, it is currently truncated to prevent performance issues. Unfortunately, this can make it impossible to retrieve the cell history that is needed.
Make it Real time when needed
Do not make it real time unless it needs to be.
By this, we mean do not have line items where users input data referenced by other line items unless they have to be. One way around this is to give users data input sections that are not referenced anywhere, or as little as possible, and then, say, at the end of the day when no users are in the model, run an import that updates the cells where the calculations are done. This may not always be possible if end users need to see the calculations resulting from their inputs, but if you can limit the real-time calculations to just those they need to see and use imports during quiet times, this will still help.
We often see that not all reporting modules need to be recalculated in real time. In many cases, it is fine for these modules to be calculated the day after.
Don't have line items that depend on other line items unnecessarily. This can prevent Anaplan from utilizing the maximum number of calculations it can do at once. It happens when a line item's formula cannot be calculated because it is waiting on the results of other line items. A basic example can be seen with line items A, B, and C having the formulas:
A - no formula
B = A
C = B
Here B would be calculated, and then C would be calculated after it. Whereas if the setup was:
A - no formula
B = A
C = A
Here B and C can be calculated at the same time. This also helps because, if line item B is not needed, it can then be removed, further reducing the number of calculations and the size of the model. This needs to be considered on a case-by-case basis and is a tradeoff between duplicating calculations and utilizing as many threads as possible: if line item B were referenced by a few other line items, it may indeed be quicker to keep it.
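The A, B, C example can be expressed as a small dependency-depth calculation. The Python sketch below is a simplified model, assuming an acyclic set of references: each line item gets a level, and items on the same level do not wait on each other. The function name and data layout are hypothetical and are not part of Anaplan.

```python
def calculation_levels(dependencies):
    """Return {line_item: level} for an acyclic dependency map.

    Level 0 items have no formula references; every other item sits one
    level above the deepest item it references. Items on the same level
    never reference each other, so in this model they could be calculated
    at the same time.
    """
    levels = {}

    def level_of(item):
        if item not in levels:
            sources = dependencies.get(item, [])
            levels[item] = 0 if not sources else 1 + max(level_of(s) for s in sources)
        return levels[item]

    for item in dependencies:
        level_of(item)
    return levels


# First setup: B = A, C = B (C must wait for B).
chain_levels = calculation_levels({"A": [], "B": ["A"], "C": ["B"]})

# Second setup: B = A, C = A (B and C share a level).
fan_out_levels = calculation_levels({"A": [], "B": ["A"], "C": ["A"]})

print(chain_levels, fan_out_levels)
```

The chain needs three sequential steps, while the second setup needs two, which is the parallelism gain the example describes.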
Summary cells often take processing time even if they are not actually recalculated because they must check all the lower level cells.
Reduce summaries to ‘None’ wherever possible. This reduces not only the aggregations but also the size of the model.
A dashboard with grids that include large lists that have been filtered and/or sorted can take time to open. The opening action can also become a blocking operation; when this happens, you'll see the blue toaster box showing "Processing....." when the dashboard is opening. This article includes some guidelines to help you avoid this situation.
Rule 1: Filter large lists by creating a Boolean line item
Avoid the use of filters on text or non-Boolean formatted items for large lists on the dashboard. Instead, create a line item with the format type Boolean and add calculations to the line item so that the results return the same data set as the filter would.
This is especially helpful if you implement user-based filters, where the Boolean will be by user and by the list to be filtered.
The memory footprint of a Boolean line item is 8x smaller than that of other types of line items.
Warning on a known issue: On an existing dashboard where a saved view is being modified by replacing the filters with a Boolean line item for filtering, you must republish it to the dashboard. Simply removing the filters from the published dashboard will not improve performance.
Rule 2: Use the default Sort
Use sort carefully, especially on large lists. Opening a dashboard that has a grid where a large list is sorted on a text-formatted line item will likely take 10 seconds or more and may be a blocking operation.
To avoid using the sort, make sure your list is sorted by default on the criteria you need. If it is not sorted, you can still make the grid usable by reducing the number of items using a user-based filter.
Rule 3: Reduce the amount of dashboard components
There are times when the dashboard includes too many components, which slows performance. A reasonably large dashboard is no wider than 1.5 pages (avoiding too much horizontal scrolling) and no more than 3 pages deep. Once you exceed these limits, consider moving the components into multiple dashboards. Doing so will help both performance and usability.
Rule 4: Avoid using large lists as page selectors
If you have a large list and use it as a page selector on a dashboard, that dashboard will open slowly. It may take 10 seconds or more. The loading of the page selector takes more than 90% of the total time.
Known issue / This is how Anaplan works: If a dashboard grid contains list formatted line items, the contents of page selector drop-downs are automatically downloaded until the size of the list meets a certain threshold; once this size is exceeded, the download happens on demand, or in other words, when a user clicks the drop down. The issue is that when Anaplan requests the contents of list formatted cell drop-downs, it also requests contents of ALL other drop-downs INCLUDING page selectors.
Recommendation: Limit the page selectors on medium to large lists using the following tips:
a) Make the page selector available in one grid and use the synchronized paging option for all other grids and charts. No need to allow users to edit the page in every dashboard grid or chart.
b) If you have a large list, it makes for a poor user experience, as there is no search available. Using a large list as a page selector creates both a performance and a usability issue.
Solution 1: Design a dashboard dedicated to searching a line item:
From the original dashboard (where you wanted to include the large list page selector), the user clicks a custom search button that opens a dashboard where the large list is displayed as the rows of a grid.
The user can then use a search to find the item needed. If possible, implement user-based filters to help the user further reduce the list and quickly find the item.
The user highlights the item found, closes the tab, and returns to the original dashboard where all grids are set on the highlighted item.
Alternate solution: If the dashboard elements don't require the use of the list, you should publish them from a module that doesn't contain this list. For example, floating page selectors for time or versions, or grids that are displayed as rows/columns only, should be published from modules that do not include the list.
Why? The view definitions for these elements will contain all the source module's dimensions, even if they are not shown, and so will carry the overhead of populating the large page selector if it was present in the source.
Imports are blocking operations: the model is locked for the duration of the import, and concurrent imports run by end users will need to run one after the other and will block the model for everyone else.
Rule 1: Carefully decide if you let end-user import (and export) during business hours
Imports executed by end users should be carefully considered and, if possible, executed once or twice a day. Customers easily accept a model freeze at scheduled hours for a predefined time, even if it takes 10+ minutes, but are frustrated when these imports are run randomly during business hours by anyone.
Your first optimization is to adjust the process so that these imports are run by an admin at a scheduled time, and to let the user base know about the schedule.
Rule 2: Mapping Objective = zero errors or warning
Make sure your import returns with no errors or warnings; every error takes processing time. The time to import into a medium to large list (>50k) is significantly reduced if no errors have to be processed. Here are some tips to reduce errors:
Always import from a saved view - NEVER from the default view. Use the naming convention for easy maintenance.
Hide the line items that are not needed for import, do not bring extra columns that are not needed.
In the import definition, always map all displayed line items (source→target) or use the "ignore" setting - don't leave any line item unmapped
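The "map or ignore every column" tip can be checked mechanically before an import is configured. The Python sketch below is a hypothetical pre-check on a list of source column names; the function, the column names, and the data structures are assumptions for illustration only and do not correspond to any Anaplan API.

```python
def unmapped_columns(source_columns, mapping, ignored):
    """Return source columns that are neither mapped to a target nor explicitly ignored."""
    return [c for c in source_columns if c not in mapping and c not in ignored]


# Hypothetical saved-view columns and import definition.
source_columns = ["Code", "Name", "Region", "Comment"]
mapping = {"Code": "Employee Code", "Name": "Employee Name"}  # source -> target
ignored = {"Comment"}                                         # set to "ignore"

missing = unmapped_columns(source_columns, mapping, ignored)
print(missing)
```

Any column reported here would either need a mapping, an explicit "ignore" setting, or to be hidden in the saved view so that it is not brought into the import at all.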
Rule 3: Watch the formulas recalculated during the import
If your end users encounter poor performance when clicking a button that triggers an import or a process, it is likely due to the recalculations triggered by the import, especially if the action creates or moves items within a hierarchy.
You will likely need the help of Anaplan support (L3) to identify which formulas are triggered after the import is done, and to get a performance check on these formulas to identify which one takes most of the time. Formulas that fetch many cells, such as SUM, ANY or FINDITEM(), are usually responsible for the performance impact.
To solve such situations, you will need to challenge the need to recalculate the identified formula each time a user calls the action.
Often, for actions such as creations, moves, and assignments done in WFP or Territory Planning, many calculations used for reporting are triggered in real time after the hierarchy is modified by the import, and are not necessarily needed by users.
The recommendation is to challenge your customer and see whether these formulas could be calculated only once a day, instead of each time a user runs the action. If so, you'll need to rearchitect your modules so that these heavy formulas run through a different process, run daily by an admin rather than by each end user.
Rule 4: Import List properties
Importing list properties takes more time than importing the same data as module line items. Review the lists in your model that are impacted by imports, and consider replacing list properties with module line items when possible.
Also, please refer to the Data Hub best practices, where we recommend to upload all list properties into a Data HUB module and not in the list property itself.
Rule 5: Get your Data HUB
HUB and SPOKE: Set up a Data HUB model, which will feed the other production models used by stakeholders.
Look at the white paper on how to build a Data HUB:
It will prevent production models from being blocked by a large import from an external data source. But since Data HUB to production model imports will still be blocking operations, carefully filter what you import, and use the best practice rules listed above.
All import and mapping/transformation modules required to prepare the data to be loaded into planning modules can now be located in a dedicated Data HUB model and not in the planning model. The planning model will then be smaller and will work more efficiently.
Reminder of the other Benefits not linked to performance:
Better structure, easier maintenance: A Data HUB helps keep all the data organized in a central location.
Better governance: Whenever possible, put this Data HUB in a different workspace (WS). That will ease the separation of duties between production models and metadata management, at least for actual data and production lists. The IT department will love the idea of owning the Data HUB and having no one else be an admin in the WS.
Lower implementation costs: A Data HUB is a way to reduce the implementation time of new projects. Assuming IT can load the data needed by the new project into the Data HUB, business users do not have to integrate with complex source systems, but with the Anaplan Data HUB instead.
Rule 6: Incremental import/Export
This can be the magic bullet in some cases. If you export on a frequent basis (daily or more) from an Anaplan model into a reporting system, write back to the source system, or simply transfer data from one Anaplan model to another, there are ways to import/export only the data that has changed since the last export.
Use the concatenation + Change boolean technique explained in the Data HUB white paper.
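The white paper itself is not reproduced here, so the following Python sketch is only one plausible reading of a "concatenation + change Boolean" approach: each record's values are concatenated into a single string, that string is compared with the one stored at the last export, and a Boolean marks the records that differ. The function name, the "|" delimiter, and the sample records are all assumptions.

```python
def changed_records(current, previous_keys):
    """Flag records whose concatenated values differ from the last export.

    current:       {record_id: tuple of values}
    previous_keys: {record_id: concatenated string stored at the last export}
    Returns (changed, new_keys): the records to export now, and the
    concatenated strings to store for the next comparison.
    """
    changed = {}
    new_keys = {}
    for record_id, values in current.items():
        key = "|".join(str(v) for v in values)      # concatenation step
        new_keys[record_id] = key
        has_changed = previous_keys.get(record_id) != key   # change Boolean
        if has_changed:
            changed[record_id] = values
    return changed, new_keys


# Hypothetical data: E1 is unchanged, E2 was edited, E3 is new.
previous = {"E1": "Paris|100", "E2": "Rome|200"}
current = {"E1": ("Paris", 100), "E2": ("Rome", 250), "E3": ("Oslo", 50)}

to_export, stored_keys = changed_records(current, previous)
print(sorted(to_export))
```

Only the flagged records would then be included in the export view, which keeps the blocking import on the target side as small as possible.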
First, the bigger your model is the more performance issues you are likely to experience. So a best practice is to use all the possible tools & features we have to make the model as small and dense as possible. This includes:
Line Item Checks: summary calculations, dimensionality used
Line Item Duplication
Granularity of Hierarchies
Use of subsets and line item subsets
More information on eliminating sparsity can be found in Learning Center courses 309 and 310.
General recommendations also include, whenever possible, challenging your customer’s business requirements when they call for large lists (>1M), large data history, and a high number of dimensions used at the same time for a line item (>5).
Once these general and basic sparsity recommendations have been applied, you can further improve performance in different areas. The articles below expand on each subject:
Imports and exports and their effect on model performance
Rule 1: Carefully decide if you let end-user import (and export) during business hours
Rule 2: Mapping Objective = zero errors or warning
Rule 3: Watch the formulas recalculated during the import
Rule 4: Import List properties
Rule 5: Get your Data HUB
Rule 6: Incremental import/Export
Dashboard settings that can help improve model performance
Rule 1: Large list = Filter these on a Boolean, not on text
Rule 2: Use the default Sort
Rule 3: Reduce the amount of dashboard components
Rule 4: Watch large page drop downs
Formulas and their effect on model performance
Model load, Model Save, Model Rollback and their effect on model performance
User roles and their effect on model performance
When user roles are given edit access to lists, memory is pre-allocated so that those users can increase the model size. Give each user role access only to the lists that it may actually update through actions.
You should create user roles for each business function. You should then apply Selective Access to all lists, which helps to control the access that each end user needs. Avoid creating different roles, with varying access rights, for the same type of end user. This will help avoid the need for additional model maintenance.
Sort roles in a sensible fashion using the Reorder button (e.g. most privileges, some privileges, least privileges).
Consider using a module to control user access. This will allow model builders to provide clear instructions on the roles and access rights in the model, along with the ability to change user access rights from a convenient dashboard. Additionally, you can create an import and run it as part of a process to import user access from this module. Note that only model builders will have access to import data into the user list.
More information on User Roles and Selective Access can be found in Learning Center under Advanced Topics.
In most use cases, a single model provides the solution you are seeking, but there are times it makes sense to separate, or distribute, models rather than have them in a single instance. The following articles provide insight that can help you during the design process to determine if a distributed model is needed.
What is Application Lifecycle Management (ALM)?
What types of distributed models are there?
When should I consider a distributed model?
How do changes to the primary model impact distributed models?
What should I do after building a distributed model?
It is important to understand what Application Lifecycle Management, or ALM, enables clients to do within Anaplan.
In short, ALM enables clients to effectively manage the development, testing, deployment, and ongoing maintenance of applications in Anaplan. With ALM, you can introduce changes without disrupting business operations by securely and efficiently managing and updating your applications, with governance across different environments. You can also quickly deploy changes to run more “what-if” scenarios in your planning cycles as you test and release development changes into production.
Learn more here: Understanding model synchronization in Anaplan ALM
Training on ALM is also available in the Education section 313 Application Lifecycle Management (ALM)
There are two different types of distributed models to consider as early as possible when a client chooses to implement Anaplan:
A split model is where one model, known as the primary model, is partitioned into multiple satellite models that contain the exact same structure or metadata (such as versions and dimensions) as the primary model. The split models will be 90% identical to the primary model and will have about a 10% difference. The split model method is most common when a client's workspace involves multiple regions.
For example, the primary model may contain three different product lines. Region 1 sells product lines A and B, while Region 2 sells only product C. In this case, a split model may provide consistency in structure across the models, but variation with the product lines since not all product lists are applicable to each region.
ALM application: Split models
For split models, ALM allows clients to maintain the primary model as well as all satellite models in their workspace using one development model. Clients may make changes to their development model, and then deploy updates to their live models without disrupting the application cycle.
Similar models are models that vary slightly in structure or metadata. The degree of difference is usually less than 5%. If it gets to be greater than this, or there’s a greater difference in user experiences, it may be impractical to use similar models. For example, you could use the similar models method if you have multiple regions that must view the same data, ideally from a master data hub.
ALM application: Similar models
For similar models, ALM requires clients to maintain one development model for each similar model in use. Comparable to split models, each development model may be edited, tested, and then deployed to the production model without disrupting the application cycle.
In many situations, enterprises need to split very large and complex models for various reasons including:
Performance issues, including data volume and user concurrency
Metadata time cycle differences
Regional / business process differences
Anaplan is a platform designed to enable businesses to build models in almost endless configurations, so there is no pre-set size recommendation for where a model can be distributed. It is not uncommon for a 15-billion-cell model performing complex calculations to remain a single model, used by only a single person or just a few people. However, in contrast to that, it is also not uncommon to have a distributed model as small as 1 billion cells, with complex calculations and multiple people in multiple locations using the model.
As a general guide, this table takes into consideration the factors that influence a single model or distributed model solution.
Large Data Volumes (> 10GB)*
High User Concurrency*
Sample Model 1
Sample Model 2
Sample Model 3: Depends on actual volume
Sample Model 4
Sample Model 5: Depends on actual volume
Sample Model 6: Depends on user concurrency
Sample Model 7: Depends on actual volume
* As always, apply appropriate testing and tuning to optimize the model. Different combinations can have a dramatic effect on desired performance and experience.
Anaplan has robust security across its platform. In some cases, it’s possible to achieve region-specific experiences using selective access. If this is the case, then distributed models are not necessary. But in mixed environments where model builders and end users operate in the same model, and where various business processes exist, at times it makes sense to separate or distribute models rather than have them in a single instance. For example, you may have different countries that all need access to a workforce planning application. You also have model builders from each country modeling and maintaining their section. By distributing the models and restricting access, this problem is abated.
Note: Where there is a need to segregate administration (model builder) roles, the split models will need to be in different workspaces, as the admin role is by workspace, not by model.
Metadata time cycle differences
A single instance of a model serving the world across multiple time zones does not respect the different business cycles involved, and therefore updates to data and/or metadata of a model will affect the entire community, some of whom may be in the middle of their planning cycle. These changes may be small, but in many instances are large-scale and frequent changes, which require pauses in the application cycle for end users.
However, a configuration that does respect business cycles and time zones and distributes the model can be beneficial to the business as business regions that are in down-time (e.g., in the middle of their night, where usage is very low) can, independently, carry out updates to data and metadata without affecting other regions.
ALM application: Metadata time cycle differences
Alternatively, ALM prevents pauses in the application cycle altogether by providing a development environment for each model. You may edit development models at any time without disrupting live production models for end users. Then, once you have completed your edits on the development model, you may deploy them to live production models without any disruptions or down-time for end users. As a result, using ALM removes any risk for pauses in the application cycle for any user at any time.
Regional / business process differences
Similar to the workforce planning example above, regional differences may exist. It may not be practical to attempt to include all regional variances that exist across countries for workforce planning in a single instance. Much of the functionality would not be relevant to every region, and so confusion and frustration would occur, as well as complication of user interface. In this instance a distributed model would be the best solution.
Another consideration is that of differing business processes. This covers both processes that are intrinsically the same but different enough to warrant separate treatment, and business processes that are completely different.
An example of this may be a process where a business updates a forecast. Perhaps they get to the same point in a revenue forecast, but how different parts or divisions of a business get to that point is different. One may do an initial bottom-up forecast, submit up to management for draft approval, and then do a final submit. Another may do a top-down approach where they set a target and that target needs to be validated. These are connected, yet separate, processes that may warrant separate instances of an application.
ALM application: Regional / business process differences
If regional and business processes are similar between satellite models, and the metadata between them can be synced from a single development (primary) model, then ALM can be used to develop, test, and produce the single development model that feeds the satellite models.
If the regional and/or business processes cannot conform to use the same metadata from a single development model, then multiple development models must be used. In this case, ALM would be used to update, test, and produce each development model, which would then feed into each respective satellite model.