PLANS is the new standard for Anaplan modelling; “the way we model”. This will cover more than just the formulas and will include and evolve existing best practices around user experience and data hubs. The initial focus is to develop a set of rules on the structure and detailed design of Anaplan models. This set of rules will provide both a clear route to good model design for the individual Anaplanner, and common guidance on which Anaplanners and reviewers can rely when passing models amongst themselves.
In defining the standard, everything we do will consider or be based around:
Performance – Use the correct structures and formulae to optimize the Hyperblock
Logical – Build the models and formulae more logically – See D.I.S.C.O below
Auditable – Break up formulae for better understanding, performance and maintainability
Necessary – Don’t duplicate expressions, reference data once, no unnecessary calculations
Sustainable – Build with the future in mind, think about process cycles and updates
The standards will be based around three axes:
Performance - How do the structures and formulae impact the performance of the system?
Usability/Auditability - Is the user able to understand how to interact with the functionality?
Sustainability - Can the solution be easily maintained by model builders and support?
We will define the techniques that balance these three areas to ensure the optimal design of Anaplan models and architecture.
As part of model and module design we recommend categorizing modules as follows:
Data – Data hubs, transactional modules, source data; reference everywhere
Inputs – Design for user entry, minimize the mix of calculations and output
System – Time management, filters, mappings etc.; reference everywhere
Calculations – Optimize for performance (turn summaries off, combine structures)
Outputs - Reporting modules, minimize data flows out
The Anaplan Optimizer aids business planning and decision making by solving complex problems involving millions of combinations quickly to provide a feasible solution.
Optimization provides a solution for selected variables within your Anaplan model that matches your objective based on your defined constraints. The Anaplan model must be structured and formatted to enable Optimizer to produce the correct solution.
You are welcome to read through the materials and watch the videos on this page, but Optimizer is a premium service offered by Anaplan, so you will not be able to do the training exercises until the feature is turned on in your system. Contact your Account Executive if you don't see Optimizer as an action on the settings tab.
The training involves an exercise along with documentation and videos to help you complete it.
The goal of the exercise is to set up optimization for two use cases: network optimization and production optimization. To assist you in this process, we have created an optimization exercise guide document that walks you through each of the steps. To help further, we have created three videos you can reference:
An exercise walk-through
A demo of each use case
A demo of setting up dynamic time
Follow the order of the items listed below to understand how Anaplan's optimization process works:
Watch the use case video, which demos the Optimizer functionality in Anaplan
Watch the exercise walkthrough video
Review the documentation about how Optimizer works within Anaplan
Attempt the Optimizer exercise
  Download the exercise walkthrough document
  Download the Optimizer model into your workspace
Learn how to configure Dynamic Time within Optimizer
  Download the Dynamic Time document
  Watch the Dynamic Time video
Attempt the Network Optimization exercise
Attempt the Production Optimization exercise
Dynamic Cell Access (DCA) controls the access levels for line items within modules. It is simple to implement and provides modelers with a flexible way of controlling user inputs. Here are a few tips and tricks to help you implement DCA effectively.
Access control Modules
Any line item can be controlled by any other applicable Boolean line item. To avoid confusion over which line item(s) to use, it is recommended that you add a separate functional area and create specific modules to hold the driver line items. These modules should be named appropriately (e.g. Access – Customers > Products, or Access – Time). The advantage of this approach is that the access driver can be used for multiple line items or modules, and the calculation logic is in one place. In most cases, you will probably want both read and write access. Therefore, within each module it is recommended that you add two line items (Write? and Read?). If the logic is being set for Write?, then set the formula for the Read? line item to NOT WRITE? (or vice versa). It may be necessary to add multiple line items to use for different target line items, but start with this as a default.
You may not need to create a module that mirrors the dimensionality of the line item you wish to control. For example, if you have a line item dimensioned by customer, product, and time, and you wish to make actual months read only, you can use an access module just dimensioned by time. Think about what dimension the control needs to apply to and create an access module accordingly.
What settings do I need?
There are three different states of access that can be applied: READ, WRITE, and INVISIBLE or hidden. There are two blueprint controls (read control and write control) and there are two states for a driver (TRUE or FALSE). The combination of these determines which state is applied to the line item. The following table illustrates the options:
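Only the read access driver is set:
Read Access Driver    Target Line Item
TRUE                  READ
FALSE                 INVISIBLE
Only the write access driver is set:
Write Access Driver   Target Line Item
TRUE                  WRITE
FALSE                 INVISIBLE
Both read access and write access drivers are set:
Read Access Driver    Write Access Driver   Target Line Item
TRUE                  TRUE                  WRITE
FALSE                 TRUE                  WRITE
TRUE                  FALSE                 READ (Revert to Read*)
FALSE                 FALSE                 INVISIBLE (Revert to Read*)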
*When both access drivers are set, the write access driver takes precedence with write access granted if the status of the write access driver is true. If the status of the write access driver is false, the cell access is then taken from the read access driver status.
The settings can also be expressed in the following table:
                      WRITE ACCESS DRIVER
READ ACCESS DRIVER    TRUE        FALSE
TRUE                  WRITE       READ
FALSE                 WRITE       INVISIBLE
Note: If you want to have read and write access, it is necessary to set both access drivers within the module blueprint.
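The precedence rules described above can be sketched outside Anaplan as a small Python function. This is only an illustration, not Anaplan syntax: the function name `cell_access`, the use of `None` for "driver not set", and the `UNCONTROLLED` label are assumptions made for this sketch.

```python
from typing import Optional


def cell_access(read_driver: Optional[bool], write_driver: Optional[bool]) -> str:
    """Resolve the access state of one cell.

    None means the driver is not set in the blueprint (an assumption of this sketch).
    """
    if read_driver is None and write_driver is None:
        return "UNCONTROLLED"  # hypothetical label: DCA is not applied at all
    if write_driver:
        return "WRITE"  # a TRUE write driver takes precedence
    if read_driver:
        return "READ"  # write is FALSE or unset, so fall back to the read driver
    return "INVISIBLE"  # no driver grants access


# Print the four combinations when both drivers are set.
for read in (True, False):
    for write in (True, False):
        print(f"read={read!s:5} write={write!s:5} -> {cell_access(read, write)}")
```

With only the write driver set, a FALSE value leaves the cell invisible rather than read only, which is why both drivers are needed when you want read and write access.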
Think about how you want the totals to appear. When you create a Boolean line item, the default summary option is NONE. This means that if you used this access driver line item, any totals within the target would be invisible. In most cases you will probably want the totals to be read only, so setting the access driver line item summary to ANY will provide this setting. If you are using the Invisible setting to “hide” certain items and you do not want the end user to compute hidden values, then it is best to use the ALL setting for the access driver line item. This means that the totals show only if all values in the list are visible; otherwise the totals are hidden from view.
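As a rough model of how the summary option on a Boolean driver decides the driver value at a total, the following Python sketch may help. It is not Anaplan code; the function name `driver_at_total` and the sample values are hypothetical, and only the NONE, ANY, and ALL options are modelled.

```python
def driver_at_total(child_values, summary):
    """Return the Boolean driver value at a total for a given summary option."""
    if summary == "NONE":
        return False  # no aggregation, so the total is never TRUE
    if summary == "ANY":
        return any(child_values)  # TRUE if at least one child is TRUE
    if summary == "ALL":
        return all(child_values)  # TRUE only if every child is TRUE
    raise ValueError(f"unsupported summary option: {summary}")


# Hypothetical read drivers for three items under one parent; one item is hidden.
children = [True, False, True]
for option in ("NONE", "ANY", "ALL"):
    print(option, driver_at_total(children, option))
```

With one hidden item, ANY still exposes the total, while ALL hides it so that the hidden value cannot be derived from the visible ones.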
Thinking through the results of a modeling decision is a key part of ensuring good model performance—in other words, making sure the calculation engine isn’t overtaxed. This article highlights some ideas for how to lessen the load on the calculation engine.
Formulas should be simple; a formula that is nested or uses multiple combinations uses valuable processing time. Writing a long, involved formula makes the engine work hard. Seconds count when the user is staring at the screen. Simple is better. Breaking up formulas and using other options helps keep processing speeds fast.
You must keep a balance when using these techniques in your models, so the guidance is as follows:
Break up the most commonly changed formula
Break up the most complex formula
Break up any formula you can’t explain the purpose of in one sentence
Formulas with many calculated components
The structure of a formula can have a significant bearing on the amount of calculation that happens when inputs in the model are changed. Consider the following example of a calculation for the Total Profit in an application. There are five elements that make up the calculation: Product Sales, Service Sales, Cost of Goods Sold (COGS), Operating Expenditure (Op EX), and Rent and Utilities. Each of the elements is calculated in a separate module. A reporting module pulls the results together into the Total Profit line item, which is calculated using the formula shown below. What happens when one of the components of COGS changes? Since all the source components are included in the formula, when anything within any of the components changes, this formula is recalculated. If there are a significant number of component expressions, this can put a larger overhead on the calculation engine than is necessary.
There is a simple way to structure the module to lessen the demand on the calculation engine. You can separate the input lines in the reporting module by creating a line item for each of the components and adding the Total Profit formula as a separate line item. This way, changes to the source data only cause the relevant line item to recalculate.
For example, a change in the Product Sales calculation only affects the Product Sales and the Total Profit line items in the Reporting module; Services Sales, Op EX, COGS and Rent & Utilities are unchanged. Similarly, a change in COGS only affects COGS and Total Profit in the Reporting module.
Keep the general guidelines in mind. It is not practical to have every downstream formula broken out into individual line items.
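The effect of separating the components can be sketched as a small dependency graph in Python. This is an illustration only: the `SRC.` and `REP.` prefixes and the `affected` helper are hypothetical names, and the graph simply encodes the reporting module layout described above.

```python
COMPONENTS = ["Product Sales", "Service Sales", "COGS", "Op EX", "Rent & Utilities"]

# Hypothetical reporting module: one line item per component plus the total.
deps = {f"REP.{c}": {f"SRC.{c}"} for c in COMPONENTS}
deps["REP.Total Profit"] = {f"REP.{c}" for c in COMPONENTS}


def affected(deps, changed):
    """Return every line item that must recalculate when `changed` changes."""
    hit = set()
    frontier = [changed]
    while frontier:
        source = frontier.pop()
        for item, inputs in deps.items():
            if source in inputs and item not in hit:
                hit.add(item)
                frontier.append(item)
    return hit


print(sorted(affected(deps, "SRC.COGS")))
```

A change in the COGS source touches only the COGS line item and Total Profit in the reporting module; the other four component line items are left alone.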
Plan to provide early exits from formulas
Conditional formulas (IF/THEN) present a challenge for the model builder: finding the optimal construction for the formula without making it overly complicated and difficult to read or understand. The basic principle is to avoid making the calculation engine do more work than necessary. Try to set up the formula to finish the calculations as soon as possible.
Always put first the condition that is most likely to occur. That way the calculation engine can quit the processing of the expression at the earliest opportunity.
Here is an example that evaluates Seasonal Marketing Promotions:
The summer promotion runs for three months and the winter promotion for two months.
There are more months when there is no promotion, so this formula is not optimal and will take longer to calculate.
This is better as the formula will exit after the first condition more frequently.
There is an even better way to do this. Following the principles from above, add another line item for no promotion.
And then the formula can become:
This is even better because the calculation for No Promo has already been calculated and Summer Promo occurs more frequently than Winter Promo.
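To see why the order of conditions matters, the following Python sketch counts how many conditions are evaluated across a year for different orderings. It is a toy model, not the Anaplan formulas from the example: the month assignments (summer in June to August, winter in December and January) and the function name `conditions_evaluated` are assumptions.

```python
# Assumed calendar: a three-month summer promotion and a two-month winter promotion.
PROMO_BY_MONTH = {
    "Jan": "winter", "Feb": "none", "Mar": "none", "Apr": "none",
    "May": "none", "Jun": "summer", "Jul": "summer", "Aug": "summer",
    "Sep": "none", "Oct": "none", "Nov": "none", "Dec": "winter",
}


def conditions_evaluated(order, promo_by_month=PROMO_BY_MONTH):
    """Count condition checks for an IF / ELSE IF chain.

    `order` lists the branches in formula order; the last entry is the ELSE
    branch, which needs no check of its own.
    """
    total = 0
    for promo in promo_by_month.values():
        for condition in order[:-1]:
            total += 1
            if condition == promo:
                break  # early exit: the remaining conditions are skipped
    return total


print(conditions_evaluated(["winter", "summer", "none"]))
print(conditions_evaluated(["summer", "winter", "none"]))
print(conditions_evaluated(["none", "summer", "winter"]))
```

Under these assumptions, testing the least likely case first costs 22 checks over the year, while testing the No Promo case first costs 17.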
It is not always clear which condition will occur more frequently than others, but here are a few more examples of how to optimize formulas:
FINDITEM formula
The FINDITEM element of a formula works its way through the whole list looking for the text item; if it does not find the referenced text, it returns blank. If the referenced text is blank, it also returns blank. Inserting a conditional expression at the beginning of the formula keeps the calculation engine from being overtaxed.
IF ISNOTBLANK(TEXT) THEN FINDITEM(LIST, TEXT) ELSE BLANK
IF ISBLANK(TEXT) THEN BLANK ELSE FINDITEM(LIST, TEXT)
Use the first expression if most of the referenced text contains data and the second expression if there are more blanks than data.
LAG, OFFSET, POST, etc.
In some situations there is no need to lag or offset data, for example when the lag or offset parameter is 0, because the value of the calculation is then the same as the value for the period in question. Adding a conditional at the beginning of the formula helps eliminate unnecessary calculations:
a. IF lag_parameter = 0 THEN Lineitem ELSE LAG(Lineitem, lag_parameter, 0)
b. IF lag_parameter <> 0 THEN LAG(Lineitem, lag_parameter, 0) ELSE Lineitem
The use of formula a or b will depend on the most likely occurrence of 0s in the lag parameter.
Booleans
Avoid adding unnecessary clutter for line items formatted as Booleans. There is no need to include the TRUE or FALSE expression, as the condition itself evaluates to TRUE or FALSE.
IF Sales > 0 THEN TRUE ELSE FALSE
can be written simply as
Sales > 0
Reducing the number of calculations will lead to quicker calculations and improve performance. But this doesn’t mean combining all your calculations into fewer line items, as breaking calculations into smaller parts has major benefits for performance.
Learn more about this in the Formula Structure article.
How is it possible to reduce the number of calculations? Here are three easy methods:
Turn off unnecessary Summary method calculations.
Avoid formula repetition by creating modules to hold formulas that are used multiple times.
Ensure that you are not including more dimensions than necessary in your calculations.
Turn off Summary method calculations
Model builders often include summaries in a model without fully thinking through if they are necessary. In many cases the summaries can be eliminated. Before we get to how to eliminate them, let’s recap on how the Anaplan engine calculates.
In the following example we have a Sales Volume line-item that varies by the following hierarchies:
This means that from the detail values at SKU, City, and Channel level, Anaplan calculates and holds all 23 of the aggregate combinations shown below—24 blocks in total.
With the Summary options set to Sum, when a detailed item is amended (represented in the grey block), all the other aggregations in the hierarchies are also re-calculated. Selecting the None summary option means that no calculations happen when the detail item changes. The varying levels of hierarchies are quite often only there to ease navigation and the roll-up calculations are not actually needed, so there may be a number of redundant calculations being performed. The native summing of Anaplan is a faster option, but if all the levels are not needed it might be better to turn off the summary calculations and use a SUM formula instead.
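The block count can be reproduced by multiplying the number of levels in each hierarchy. The hierarchy picture is not reproduced here, so the level names in this Python sketch are assumptions: four levels for the SKU hierarchy, three for City, and two for Channel, which is one combination consistent with the 24 blocks quoted.

```python
from itertools import product

# Assumed hierarchy levels, lowest level first (names are illustrative only).
LEVELS = {
    "SKU hierarchy": ["SKU", "Product", "Product Family", "All Products"],
    "City hierarchy": ["City", "Region", "All Regions"],
    "Channel hierarchy": ["Channel", "All Channels"],
}

# Every combination of one level per hierarchy is a block Anaplan holds.
blocks = list(product(*LEVELS.values()))
detail_block = tuple(levels[0] for levels in LEVELS.values())
aggregate_blocks = [block for block in blocks if block != detail_block]

print(len(blocks), "blocks in total")
print(len(aggregate_blocks), "aggregate combinations")
```

If a report needs only one of those aggregate combinations, the remainder are recalculated on every detail change without ever being read.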
For example, from the structure above, let’s assume that we have a detailed calculation for SKU, City, and Channel (SALES06.Final Volume). Let’s also assume we need a summary report by Region and Product, and we have a module (REP01) and a line item (Volume) dimensioned as such.
REP01.Volume = SALES06 Volume Calculation.Final Volume
is replaced with
REP01.Volume = SALES06.Final Volume[SUM:H01 SKU Details.Product, SUM:H02 City Details.Region]
The second formula replaces the native summing in Anaplan with only the required calculations in the hierarchy.
How do you know if you need the summary calculations? Look for the following:
Is the calculation or module user-facing?
If it is presented on a dashboard, then it is likely that the summaries will be needed. However, look at the dashboard views used. A summary module is often included on a dashboard with a detail module below; effectively the hierarchy sub-totals are shown in the summary module, so the detail module doesn’t need the sum or all the summary calculations.
Detail to Detail
Is the line item referenced by another detailed calculation line item? This is very common, and if the line item is referenced by another detailed calculation the summary option is usually not required. Check the Referenced by column and see if there is anything referencing the line item.
Calculation and staging modules
If you have used the DISCO module design, you should have calculation/staging modules. These are often not user-facing and have many detailed calculations included in them. They also often contain large cell counts, which will be reduced if the summary options are turned off.
Can you have different summaries for time and lists?
The default option for Time Summaries is to be the same as the lists. You may only need the totals for hierarchies, or just for the timescales. Again, look at the downstream formulas.
The best practice advice is to turn off the summaries when you create a line item, particularly if the line item is within a Calculation module (from the DISCO design principles).
Avoid Formula Repetition
An optimal model will only perform a specific calculation once. Repeating the same formula expression multiple times will mean that the calculation is performed multiple times. Model builders often repeat formulas related to time and hierarchies. To avoid this, refer to the module design principles (DISCO) and hold all the relevant calculations in a logical place. Then, if you need the calculation, you will know where to find it, rather than add another line item in several modules to perform the same calculation.
If a formula construct always starts with the same condition evaluation, evaluate it once and then refer to the result in the construct. This is especially true where the condition refers to a single dimension but is part of a line item that goes across multiple dimension intersections. A good example of this can be seen below:
START() <= CURRENTPERIODSTART() appears five times and similarly START() > CURRENTPERIODSTART() appears twice.
To correct this, include these time-related formulas in their own module and then refer to them as needed in your modules.
Remember, calculate once; reference many times!
Taking a closer look at our example, not only is the condition evaluation repeated, but the dimensionality of the line items is also more than required. The calculation only changes by day, as per the diagram below:
But the Applies To here also contains Organization, Hour Scale, and Call Center Type.
Because the formula expression is contained within the line item formula, for each day the following calculations are also being performed:
And, as above, it is repeated in many other line items.
Sometimes model builders use the same expression multiple times within the same line item. To reduce this overcalculation, reference the expression from a more appropriate module; for example, Days of Week (dimensioned solely by day) which was shown above. The blueprint is shown below, and you can see that the two different formula expressions are now contained in two line items and will only be calculated by day; the other dimensions that are not relevant are not calculated.
Substitute the expression by referencing the line items shown above.
In this example, making these changes to the remaining lines in this module reduces the calculation cell count from 1.5 million to 1500.
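The size of the saving follows directly from multiplying dimension sizes. In this Python sketch the sizes are invented purely to reproduce the ratio quoted above; they are not taken from the example model.

```python
from math import prod

# Hypothetical dimension sizes, chosen only so the totals match the figures quoted.
full_applies_to = {"Day": 1500, "Organization": 10, "Hour Scale": 25, "Call Center Type": 4}
day_only = {"Day": 1500}

cells_before = prod(full_applies_to.values())  # expression evaluated at every intersection
cells_after = prod(day_only.values())  # expression evaluated once per day

print(cells_before, cells_after, cells_before // cells_after)
```

Every dimension that the expression does not vary by multiplies the work without changing the result.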
Check the Applies To for your formulas, and if there are extra dimensions, remove the formula and place it in a different module with the appropriate dimensionality.
Dimension Order affects Calculation Performance
Ensuring consistency in the order of dimensions will help improve performance of your models. This consistency is relevant for modules and individual line items. Why does the order matter? Anaplan creates and uses indexes to perform calculations. Each cell in a module where dimensions intersect is given an index number.
Here are two simple modules dimensioned by Customer and Product. In the first module, Product comes first and Customer second and in the second module, Customer is first and Product second.
In this model, there is a third module that calculates revenue as Prices * Volumes.
Anaplan assigns indexes to the intersections in the module. Here are the index values for the two modules. Note that some of the intersections are indexed the same for both modules: Customer 1 and Product 1, Customer 2 and Product 2 and Customer 3 and Product 3, and that the remainder of the cells have a different index number. Customer 1 and Product 2 is indexed with the value of 4 in the top module and the value of 2 in the bottom module.
The calculation is Revenue = Price * Volume.
To run the calculation, Anaplan performs the following operations by matching the index values from the two modules.
Since the index values are not aligned the processor scans the index values to find a match before performing the calculation.
When the dimensions in the module are reordered, these are the index values:
The index values for each of the modules are now aligned. As line items of the same dimensional structure have an identical layout, the data is laid out linearly in memory, and the calculation process accesses memory in a completely linear and predictable way. Anaplan’s microprocessors and memory sub-systems are optimized to recognise this pattern of access and to pre-emptively fetch the required data.
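The index values in the three-customer, three-product example can be reproduced with a short Python sketch. It assumes a 1-based linear index in which the first listed dimension varies slowest, which matches the values quoted for Customer 1 and Product 2; Anaplan's actual internal layout is not documented here, and `linear_index` is a hypothetical helper.

```python
from itertools import product

CUSTOMERS = 3
PRODUCTS = 3


def linear_index(positions, sizes):
    """1-based linear index; the first listed dimension varies slowest."""
    index = 0
    for position, size in zip(positions, sizes):
        index = index * size + (position - 1)
    return index + 1


cells = list(product(range(1, CUSTOMERS + 1), range(1, PRODUCTS + 1)))

# Module with Product first and Customer second.
product_first = {(c, p): linear_index((p, c), (PRODUCTS, CUSTOMERS)) for c, p in cells}
# Module with Customer first and Product second.
customer_first = {(c, p): linear_index((c, p), (CUSTOMERS, PRODUCTS)) for c, p in cells}

misaligned = [cell for cell in cells if product_first[cell] != customer_first[cell]]
print(product_first[(1, 2)], customer_first[(1, 2)], len(misaligned))
```

Only the three diagonal cells share an index; the other six need a lookup before the two values can be multiplied. With the same dimension order in both modules, every cell lines up.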
How does the dimension order become different between modules? When you build a module, Anaplan uses the order in which you drag the lists onto the Create Module dialog. The order is also dependent on where the lists are added: the lists that you add to the pages area are first, then the lists that you add to the rows area, and finally the lists added to the columns area.
It is simple to re-order the lists and ensure consistency. Follow these steps:
1. On the Modules pane (Model Settings > Modules), look for lists that are out of order in the Applies To column. Click the Applies To row that you want to re-order, then click the ellipsis.
2. In the Select Lists dialog, click OK.
3. In the Confirm dialog, click OK.
The lists will be in the order that they appear in General Lists.
When you have completed checking the list order in the modules, click the Line Items tab and check the line items. Follow steps 1 through 3 to re-order the lists.
Subsets and Line Item Subsets
One word of caution about Subsets and Line Item Subsets. In the example below, we have added a Subset and a Line Item Subset to the module:
The Applies To is as follows:
Clicking on the ellipsis, the dimensions are re-ordered to:
The general lists are listed in order first, followed by subsets and then line item subsets. You can still re-order the dimensions by double-clicking in the Applies To column and manually copying or typing the dimensions in the correct order.
The calculation performance relates to the common lists between the source(s) and the target. The order of separate lists in one or other doesn’t have any bearing on the calculation speed.
If you have a multi-year model where the data range varies for different parts of the model (for example, history covering two years, current year forecast, and three planning years), then Time Ranges should be able to deliver significant gains in terms of model size and performance.
But, before you rush headlong into implementing Time Ranges across all of your models, let me share a few considerations to ensure you maximise the value of the feature and avoid any unwanted pitfalls.
Naming Convention Time Ranges
As with all Anaplan models, there is no set naming convention; however, we do advocate consistency and simplicity. As with lists and modules, short names are good. I like to describe the naming convention as “as short as practical,” meaning you need to understand what it means, but don’t write an essay!
We recommend using the following convention:
FYyy-FYyy. For example, FY16-FY18, or FY18 for a single year
Time Ranges available are from 1981 to 2079, so the “19” or the “20” prefixes are not strictly necessary. Keeping the name as short as this has a couple of advantages:
Clear indication of the boundaries for the Time Range
It is short enough to see the name of the Time Range in the module and line items blueprint
The aggregations available for Time Ranges can differ for each Time Range and also differ from the main model calendar. If you take advantage of this and have aggregations that differ from the model calendar, you should add a suffix to the description. For example:
FY16-FY19 Q (to signify Quarter totals)
FY16-FY19 QHY (Quarter and Half Year totals)
FY16-FY19 HY (Half Year totals only)
Time Ranges are Static
Time Ranges can span from 1981 to 2079. As a result, they can exist entirely outside, within, or overlap the model calendar. This means that there is likely to be some additional manual maintenance to perform when the year changes. Let’s review a simple example:
Assume the model calendar is FY18 with 2 previous years and 2 future years; the model calendar spans FY16-FY20.
We have set up Time Ranges for historic data (FY16-FY17) and plan data (FY19-FY20)
We also have modules that use the model calendar to pull all of the history, forecast, and plan data together, as seen below:
At year end when we “roll over the model,” we amend the model calendar simply by amending the current year. What we have now is as follows:
You see that the history and plan Time Ranges are now out of sync with the model calendar.
How you change the history Time Range will depend on how much historic data you need or want to keep, but assuming you don’t need more than two years’ history, the Time Range should be renamed FY17-FY18 and the start period advanced to FY17 (from FY16).
Similarly, the plan Time Range should be renamed FY20-FY21 and advanced to FY20 (from FY19). FY18 is then available for the history to be populated and FY21 is available for plan data entry.
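If you keep a checklist or script for the year-end rollover, the renaming step can be sketched in Python. This only manipulates names that follow the FYyy-FYyy convention, including an optional aggregation suffix; it does not call any Anaplan API, and the helper name `roll_forward` is hypothetical.

```python
import re

# Matches names such as "FY18", "FY16-FY17" or "FY16-FY19 QHY".
PATTERN = re.compile(r"^FY(\d{2})(?:-FY(\d{2}))?( .+)?$")


def roll_forward(name, years=1):
    """Return the Time Range name advanced by `years`, keeping any suffix."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow the FYyy-FYyy convention: {name}")
    start, end, suffix = match.group(1), match.group(2), match.group(3) or ""
    new_start = f"FY{(int(start) + years) % 100:02d}"
    if end is None:
        return new_start + suffix  # single-year Time Range
    new_end = f"FY{(int(end) + years) % 100:02d}"
    return f"{new_start}-{new_end}{suffix}"


for old_name in ("FY16-FY17", "FY19-FY20", "FY16-FY19 QHY"):
    print(old_name, "->", roll_forward(old_name))
```

Renaming is only half of the change: the start period of each Time Range still has to be advanced in the model itself.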
Time Ranges Pitfalls
Potential Data Loss
Time Ranges can bring massive space and calculation savings to your model(s), but be careful. In our example above, changing the Start Period of FY16-FY17 to FY17 would result in the data for FY16 being deleted for all line items using FY16-FY17 as a Time Range.
Before you implement a Time Range that is shorter or lies outside the current model calendar, and especially when implementing Time Ranges for the first time, ensure that the current data stored in the model is not needed. If in doubt, do some or all of the suggestions below:
Export out the data to a file
Copy the existing data on the line item(s) to other line items that are using the model calendar
Back up the whole model
The majority of the formulae will update automatically when updating Time Ranges. However, if you have any hard-coded SELECT statements referencing years or months within the Time Range, you will have to amend or remove the formula before amending the Time Range. Hard-coded SELECT statements go against best practice for exactly this reason: they cause additional maintenance. We recommend replacing the SELECT with a LOOKUP formula from a Time Settings module.
There are other examples where the formulae may need to be removed/amended before the Time Range can be adjusted. See the Anapedia documentation for more details.
When to use the Model Calendar
This is a good question and one that we at Anaplan pondered during the development of the feature: do Time Ranges make the model calendar redundant? Well, I think the answer is “no,” but as with so many constructs in Anaplan, the answer probably is “it depends!” For me, a big advantage of using the model calendar is that it is dynamic for the current year and the +/- years on either side. Change the current year and the model updates automatically, along with any filters and calculations you have set up to reference current year periods, historic periods, future periods, etc.
(You are using a central time settings module, aren’t you??)
Time Ranges don’t have that dynamism, so any changes to the year will need to be made for each Time Range. So, our advice before implementing Time Ranges for the first time is to review each module and:
Assess the scope of the calculations
Think about the reduction Time Ranges will give in terms of space and calculation savings, but compare that with annual maintenance. For example:
If you have a two-year model, with one history year (FY17) and the current year (FY18), you could set up a Time Range spanning one year for FY17 and another one-year Time Range for FY18 and use these for the respective data sets. However, this would mean that both Time Ranges would need to be updated each year.
We advocate building models logically, so it is likely that you will have groups of modules where Time Ranges will fall naturally. The majority of the modules should reflect the model calendar. Once Time Ranges are implemented, it may be that you can reduce the scope of the model calendar. If you have a potential Time Range that reflects either the current or future model calendar, leave the timescale as the default for those modules and line items; why make extra work?
As outlined above, we don’t advocate hard-coded time selects for the majority of time items because of the negative impact on maintenance (the exceptions being All Periods, YTD, YTG, and CurrentPeriod). When implementing Time Ranges for the first time, take the opportunity to review the line item formulae that use time selects. These formulae can be replaced with lookups using a Time Settings module.
Application Lifecycle Management (ALM) Considerations
As with the majority of the Time settings, Time Ranges are treated as structural data. If you are using ALM, all of the changes must be made in the Development model and synchronised to Production. This makes it all the more important to review the pitfalls noted above to ensure data is not inadvertently deleted.
Best of luck! Refer to the Anapedia documentation for more detail. Please ask if you have any further questions and let us and your fellow Anaplanners know of the impact Time Ranges have had on your model(s).
This article provides the steps needed to create a basic time filter module. This module can be used as a point of reference for time filters across all modules and dashboards within a given model.
The benefits of a centralized Time Filter module include:
Centralized governance of time filters.
Optimization of workspace, since the filters do not need to be re-created for each view. Instead, use the Time Filter module.
Step 1: Create a new module with two dimensions—time and line items. The example below has simple examples for Weeks Only, Months Only, Quarters Only, and Years Only.
Step 2: Line items should be Boolean formatted and the time scale should be set in accordance with the scale identified in the line item name.
The example below also includes filters with and without summary methods, providing additional views depending on the level of aggregation desired.
Once your preliminary filters are set, your module will look something like the screenshot below.
Step 3: Use the pre-set Time Filters across various modules and dashboards. Simply click on the filters icon in the tool bar, navigate to the time tab, select your Time Filter module from the module selection screen, and select the line item of your choosing. Use multiple line items at a time to filter your module or dashboard view.
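The result of such a module can be pictured as one Boolean per time period and filter. The Python sketch below is only a mental model: the period names, the level labels, and the `time_filter` helper are hypothetical, and in Anaplan the values come from the line item time scale and summary settings rather than from code.

```python
# Hypothetical slice of a model calendar: each period is tagged with its level.
PERIODS = [
    ("Jan 18", "month"),
    ("Feb 18", "month"),
    ("Mar 18", "month"),
    ("Q1 FY18", "quarter"),
    ("FY18", "year"),
]


def time_filter(periods, level):
    """Return a Boolean per period that is TRUE only at the requested level."""
    return {name: kind == level for name, kind in periods}


months_only = time_filter(PERIODS, "month")
visible = [name for name, keep in months_only.items() if keep]
print(visible)
```

Because every view points at the same Boolean line items, a change to a filter is made once and picked up everywhere it is used.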
It is important to understand what Application Lifecycle Management, or ALM, enables clients to do within Anaplan.
In short, ALM enables clients to effectively manage the development, testing, deployment, and ongoing maintenance of applications in Anaplan. With ALM, you can introduce changes without disrupting business operations by securely and efficiently managing and updating your applications, with governance across different environments. You can also deploy changes quickly, so you can run more “what-if” scenarios in your planning cycles as you test and release development changes into production.
Learn more here: Understanding model synchronization in Anaplan ALM
Training on ALM is also available in the Education section: 313 Application Lifecycle Management (ALM).
There are two different types of distributed models to consider as early as possible when a client chooses to implement Anaplan:
A split model is where one model, known as the primary model, is partitioned into multiple satellite models that contain the same structure or metadata (such as versions and dimensions) as the primary model. The split models will be about 90% identical to the primary model, with about a 10% difference. The split model method is most common when a client's workspace involves multiple regions.
For example, the primary model may contain three different product lines. Region 1 sells product lines A and B, while Region 2 sells only product C. In this case, a split model may provide consistency in structure across the models, but variation with the product lines since not all product lists are applicable to each region.
ALM application: Split models
For split models, ALM allows clients to maintain the primary model as well as all satellite models in their workspace using one development model. Clients may make changes to their development model, and then deploy updates to their live models without disrupting the application cycle.
Similar models are models that vary slightly in structure or metadata. The degree of difference is usually less than 5%. If it gets to be greater than this, or there’s a greater difference in user experiences, it may be impractical to use similar models. For example, you could use the similar models method if you have multiple regions that must view the same data, ideally from a master data hub.
ALM application: Similar models
For similar models, ALM requires clients to maintain one development model for each similar model in use. Comparable to split models, each development model may be edited, tested, and then deployed to the production model without disrupting the application cycle.
There is an easy way to see to which dashboards a module has been published. This can be particularly helpful when you are making changes to a module and need to know which dashboards the changes could impact. It can also be useful to reduce sparsity by identifying modules that might not be needed within a model. In other words, if a module is not used for any dashboards you can check to see if it’s needed for anything else and if it’s not, eliminate it.
In many situations, enterprises need to split very large and complex models for various reasons including:
Performance issues, including data volume and user concurrency
Metadata time cycle differences
Regional / business process differences
Anaplan is a platform designed to enable businesses to build models in almost endless configurations, so there is no pre-set size recommendation for where a model can be distributed. It is not uncommon for a 15-billion-cell model performing complex calculations to remain a single model, used by only a single person or just a few people. However, in contrast to that, it is also not uncommon to have a distributed model as small as 1 billion cells, with complex calculations and multiple people in multiple locations using the model.
As a general guide, this table takes into consideration the factors that influence a single model or distributed model solution.
Large Data Volumes, (> 10GB)*
High User Concurrency*
Sample Model 1
Sample Model 2
Sample Model 3
Depends on actual volume
Sample Model 4
Sample Model 5
Depends on actual volume
Sample Model 6
Depends on user concurrency
Sample Model 7
Depends on actual volume
* As always, apply appropriate testing and tuning to optimize the model. Different combinations can have a dramatic effect on desired performance and experience.
Anaplan has robust security across its platform. In some cases, it’s possible to achieve region-specific experiences using selective access. If this is the case, then distributed models are not necessary. But in mixed environments where model builders and end users operate in the same model, and where various business processes exist, at times it makes sense to separate or distribute models rather than have them in a single instance. For example, you may have different countries that all need access to a workforce planning application. You also have model builders from each country modeling and maintaining their section. By distributing the models and restricting access, this problem is abated.
Note: Where there is a need to segregate administration (model builder) roles, the split models will need to be in different workspaces, as the admin role is by workspace, not by model.
Metadata time cycle differences
A single instance of a model serving the world across multiple time zones does not respect the different business cycles involved, and therefore updates to data and/or metadata of a model will affect the entire community, some of whom may be in the middle of their planning cycle. These changes may be small, but in many instances are large-scale and frequent changes, which require pauses in the application cycle for end users.
However, a configuration that does respect business cycles and time zones and distributes the model can be beneficial to the business as business regions that are in down-time (e.g., in the middle of their night, where usage is very low) can, independently, carry out updates to data and metadata without affecting other regions.
ALM application: Metadata time cycle differences
Alternatively, ALM prevents pauses in the application cycle altogether by providing a development environment for each model. You may edit development models at any time without disrupting live production models for end users. Then, once you have completed your edits on the development model, you may deploy them to live production models without any disruptions or down-time for end users. As a result, using ALM removes any risk for pauses in the application cycle for any user at any time.
Regional / business process differences
Similar to the workforce planning example above, regional differences may exist. It may not be practical to attempt to include all regional variances that exist across countries for workforce planning in a single instance. Much of the functionality would not be relevant to every region, and so confusion and frustration would occur, as well as complication of user interface. In this instance a distributed model would be the best solution.
Another consideration is that of differing business processes. This covers both processes that are intrinsically the same but different enough to warrant separate treatment, and business processes that are completely different.
An example of this may be a process where a business updates a forecast. Perhaps they get to the same point in a revenue forecast, but how different parts or divisions of a business get to that point is different. One may do an initial bottom-up forecast, submit up to management for draft approval, and then do a final submit. Another may do a top-down approach where they set a target and that target needs to be validated. These are connected, yet separate, processes that may warrant separate instances of an application.
ALM application: Regional / business process differences
If regional and business processes are similar between satellite models, and the metadata between them can be synced from a single development (primary) model, then ALM can be used to develop, test, and produce the single development model that feeds the satellite models.
If the regional and/or business processes cannot conform to use the same metadata from a single development model, then multiple development models must be used. In this case, ALM would be used to update, test, and produce each development model, which would then feed into each respective satellite model.
When changes occur to the primary model that need to be copied to the other models, careful coordination is necessary. There are several time-saving techniques that can make model changes across distributed models simple and quick. This depends on the complexity of the change, but generally changes are merely to fix an issue or add very small things such as views or reports.
Some of the model change techniques are:
Module update via export/import
Primary module is updated
Export of module blueprint to CSV format
Import of new line items into receiving module blueprint
Import of new formulas/dimensionality into receiving module
Model blueprint update
Model blueprints can also be updated on a batch basis where required
Simple copy and paste. Anaplan supports full copy and paste from other applications where minor changes to model structure are needed
You can export new lists or dimensions to a CSV file from one model to another, or you can carry out a direct API model-to-model import to add new lists to multiple models.
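The export/import technique above depends on knowing which line items the primary module has that a receiving module lacks. As a minimal sketch, assuming each module blueprint has been exported to CSV with a name column called "Line Item" (a hypothetical header; check the header of your own export), the comparison could look like this in Python. The function names and the sample data are illustrative only.

```python
import csv
import io

def line_item_names(blueprint_csv_text, name_column="Line Item"):
    """Return line item names from a blueprint CSV export, in file order.

    The "Line Item" column name is an assumption, not a documented format.
    """
    reader = csv.DictReader(io.StringIO(blueprint_csv_text))
    return [row[name_column] for row in reader]

def new_line_items(primary_csv_text, satellite_csv_text):
    """List line items present in the primary export but not in the satellite."""
    existing = set(line_item_names(satellite_csv_text))
    return [name for name in line_item_names(primary_csv_text)
            if name not in existing]

# Illustrative sample exports (made-up content)
primary = "Line Item,Formula\nRevenue,Price * Volume\nMargin,Revenue - Cost\n"
satellite = "Line Item,Formula\nRevenue,Price * Volume\n"

missing = new_line_items(primary, satellite)
print(missing)
```

The resulting list is only a checklist of what to import into the receiving module blueprint; the import itself is still done in Anaplan.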
Changes to data or metadata happen in a different way. Item changes within existing lists or hierarchies occur via an import, which may take place in a specific model or models, or ideally within a master data hub.
It is a best practice to use an Anaplan model as a master data hub, which will store the common lists and hierarchies and will be the unique point of maintenance. Model builders will then implement automated data imports from the master data hub to every single model, including primary models and satellite models.
It is important to carefully consider the business processes and rules that surround changes to the primary model, and then the coordination of the satellite models, as well as clear governance.
ALM application: When changes occur
We highly recommend that clients utilize ALM if metadata changes, such as any dimension, may be required at any time during implementation or even after the deployment phase of Anaplan. ALM allows clients to add or remove metadata from models, as well as test their effects, in a safe environment without running the risk of losing data or altering functionality in a live production model.
NOTE: The following information is also attached as a PDF for downloading and using off-line.
The process of designing a model will help you:
Understand the customer’s problem more completely
Bring to light any incorrect assumptions you may have made, allowing for correction before building begins
Provide the big picture view for building. (If you were working on an assembly line building fenders, wouldn’t it be helpful to see what the entire car looked like?)
Understand the requirements and the customer’s technical ecosystem when designing a model
When you begin a project, gather information and requirements using a number of tools. These include:
Statement of Work (SOW): Definition of the project scope and project objectives/high level requirements
Project Manifesto: Goal of the project – big picture view of what needs to be accomplished
IT ecosystem: Which systems will provide data to the model and which systems will receive data from the model? What is the Anaplan piece of the ecosystem?
Current business process: If the current process isn’t working, it needs to be fixed before design can start.
Business logic: What key pieces of business logic will be included in the model?
Is a distributed model needed?
High user concurrency
Security requirements that call for a separate model
Regional differences that are better handled by a separate model
Is the organization using ALM, requiring split or similar models to effectively manage development, testing, deployment, and maintenance of applications? (This functionality requires a premium subscription or above.)
User stories: These have been written by the client—more specifically, by the subject matter experts (SMEs) who will be using the model.
Why do this step?
To solve a problem, you must completely understand the current situation. Performing this step provides this information and the first steps toward the solution.
Results of this step:
Understand the goal of the project
Know the organizational structure and reporting relationships (hierarchies)
Know where data is coming from and have an idea of how much data clean-up might be needed
Know whether any of the data is organized into categories (for example, product families) and what data relationships exist that need to be carried through to the model (for example, salespeople only sell certain products)
Know what lists currently exist and where they are housed
Know which systems the model will either import from or export to
Know what security measures are expected
Know what time and version settings are needed
Document the user experience
Front to back design has been identified as the preferred method for model design. This approach puts the focus on the end user experience. We want that experience to align with the process so users can easily adapt to the model. During this step focus on:
User roles. Who are the users?
Identifying the business process that will be done in Anaplan.
Reviewing and documenting the process for each role.
The main steps. If available, utilize user stories to map the process. You can document this in any way that works for you. Here is a step-by-step process you can try:
1. What are the start and end points of the process?
2. What is the result or output of the process? What does each role need to see/do in the process?
3. What are the process inputs and where do they come from?
4. What are the activities the user needs to engage in? Verb/object: approve request, enter sales amount, etc. Do not organize during this step. Use post-its to capture them.
5. Take the activities from step 4 and put them in the correct sequence.
6. Are there different roles for any of these activities? If no, continue with step 8. If yes, assign a role to each activity.
7. Transcribe the process using PowerPoint® or Lucidchart. If there are multiple roles, use swim lanes to identify the roles.
8. Check with SMEs to ensure accuracy.
Once the user process has been mapped out, do a high level design of the dashboards
Include: Information needed
What data does the user need to see?
What the user is expected to do or decisions that the user makes
Share the dashboards with the SMEs. Does the process flow align?
Why do this step?
This is probably the most important step in the model design process. It may seem as though it is too early to think about the user experience, but ultimately the information or data that the user needs to make a good business decision is what drives the entire structure of the model.
On some projects, you may be working with a project manager or a business consultant to flesh out the business process for the user. You may have user stories, or it may be that you are working on design earlier in the process and the user stories haven’t been written. In any case, identify the user roles, the business process that will be completed in Anaplan, and create a high level design of the dashboards. Verify those dashboards with the users to ensure that you have the correct starting point for the next step.
Results of this step:
List of user roles
Process steps for each user role
High level dashboard design for each user role
Use the designed dashboards to determine what output modules are necessary
Here are some questions to help you think through the definition of your output modules:
What information (and in what format) does the user need to make a decision?
If the dashboard is for reporting purposes, what information is required?
If the module is to be used to add data, what data will be added and how will it be used?
Are there modules that will serve to move data to another system? What data and in what format is necessary?
Why do this step?
These modules are necessary for supporting the dashboards or exporting to another system. This is what should guide your design—all of the inputs and drivers added to the design are added with the purpose of providing these output modules with the information needed for the dashboards or export.
Results of this step:
List of outputs and desired format needed for each dashboard
Determine what modules are needed to transform inputs to the data needed for outputs
Typically, the data at the input stage requires some transformation. This is where business rules, logic, and/or formulas come into play:
Some modules will be used to translate data from the data hub. Data is imported into the data hub without properties, and modules are used to import the properties. Reconciliation of items takes place before importing the data into the spoke model.
Other modules are driver modules that include business logic and rules.
Why do this step?
Your model must translate data from the input to what is needed for the output.
Results of this step:
Business rules/calculations needed
Create a model schema
You can whiteboard your schema, but at some point in your design process, your schema must be captured in an electronic format. It is one of the required pieces of documentation for the project and is also used during the Model Design Check-in, where a peer checks over your model and provides feedback.
Identify the inputs, outputs, and drivers for each functional area
Identify the lists used in each functional area
Show the data flow between the functional areas
Identify time and versions where appropriate
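Since the schema must eventually be captured in an electronic format, one option is to record it as structured data as well as a diagram. The following is a minimal sketch in Python under that assumption; the class names, field names, and sample functional areas are all hypothetical and are not part of any Anaplan tooling. It records inputs, drivers, outputs, and lists per functional area, plus the data flows between areas, and flags flows that point at an area that has not been defined yet.

```python
from dataclasses import dataclass, field

@dataclass
class FunctionalArea:
    """One functional area of the schema (hypothetical structure)."""
    name: str
    inputs: list = field(default_factory=list)
    drivers: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    lists: list = field(default_factory=list)

@dataclass
class Schema:
    areas: dict = field(default_factory=dict)
    flows: list = field(default_factory=list)  # (source area, target area) pairs

    def add_area(self, area):
        self.areas[area.name] = area

    def add_flow(self, source, target):
        self.flows.append((source, target))

    def undefined_flow_endpoints(self):
        """Names used in data flows that have no matching functional area."""
        known = set(self.areas)
        return sorted({name for flow in self.flows for name in flow
                       if name not in known})

# Illustrative content only
schema = Schema()
schema.add_area(FunctionalArea("Data Hub", outputs=["Product list"],
                               lists=["Product"]))
schema.add_area(FunctionalArea("Revenue Planning",
                               inputs=["Price", "Volume"],
                               drivers=["Growth rate"],
                               outputs=["Revenue"],
                               lists=["Product", "Region"]))
schema.add_flow("Data Hub", "Revenue Planning")
schema.add_flow("Revenue Planning", "Reporting")  # "Reporting" not defined yet

print(schema.undefined_flow_endpoints())
```

A check like this is only a convenience for spotting gaps before a Model Design Check-in; it does not replace the diagram used to communicate the design.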
Why do this step?
It is required as part of The Anaplan Way process. You will build your model design skills by participating in a Model Design Check-in, which allows you to talk through the tougher parts of design with a peer.
More importantly, designing your model using a schema means that you must think through all of the information you have about the current situation, how it all ties together, and how you will get to that experience that meets the exact needs of the end user without fuss or bother.
Result of this step:
Model schema that provides the big picture view of the solution. It should include imports from other systems or flat files, the modules or functional areas that are needed to take the data from current state to what is needed to support the dashboards that were identified in Step 2. Time and versions should be noted where required. Include the lists that will be used in the functional areas/modules.
Your schema will be used to communicate your design to the customer, model builders, and others. While you do not need to include calculations and business logic in the schema, it is important that you understand the state of the data going into a module, the changes or calculations that are performed in the module and the state of the data leaving the module, so that you can effectively explain the schema to others.
For more information, check out 351 Schemas. This 10 to 15 minute course provides basic information about creating a model schema.
Verify that the schema aligns with basic design principles
When your schema is complete, give it a final check to ensure:
It is simple.
“Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius — and a lot of courage to move in the opposite direction.” ― Ernst F. Schumacher
“Design should be easy in the sense that every step should be obviously and clearly identifiable. Simplify elements to make change simple so you can manage the technical risk.” — Kent Beck
The model aligns with the manifesto.
The business process is defined and works well within the model.
Once a model is built, user concurrency and data load levels are tested, and the system is then optimized for the specific use case and conditions. There are three main options for tuning for optimum performance:
1. Model design
Is the model designed correctly?
Have you reduced sparsity and unnecessary complexity?
Is the model too big?
Have you neatly designed the model to have input, engine, and output modules?
Have you cleaned up as you go?
Problems often exist when you have added to the model, tested something that did not work out, and then not removed what you tested that didn’t work. This piece is not fulfilling any requirements. Sometimes we refer to this as model debt. Remember, Anaplan is a living, breathing model and so any line items that exist in the model, whether used or not, are used by the engine. A surplus piece (model debt) is an inefficient use of model space.
2. Model calculations
Check that calculations are as efficient as possible. Are you using standard functions to be more efficient?
3. Platform code
Do we need to engage L3 and/or engineering to look at code optimization?
Performance issues including data volume and user concurrency
Performance and the experience the end user has are of critical importance when deploying applications to a wide audience. Therefore, several factors need to be considered when deploying, in order to optimize performance and determine whether a single instance or distributed instance strategy is best:
In order for end users to enjoy the best possible experience and have an average response less than two seconds to most popular requests, model size and concurrency must be managed appropriately. In many cases a base model is produced that contains all the dimensionality and calculation logic and then the model is subjected to a series of tests that determine what the end user experience and model performance will be.
The first test is a load test where data is loaded into the model to simulate what the production model volumes would actually be. During this test, basic functions are performed such as data input, allocations, filtering, pivoting, sorting, list formatted item drop down manipulation, etc. This is done both in an automated fashion and via human intervention. If you determine that some or many functions are slow and server memory and CPU are used to the maximum, this is likely a case for distribution. If, however, the model is slow, but user concurrency is minimal, then this could form a case for a single model instance as the system is merely processing numbers and not being accessed by a user community. Otherwise, this model could also be split to provide a better user experience.
The size of a model, measured in number of cells or in memory size, is a good indicator for splitting a model. We are setting the expectation that a model size should not go beyond 15B cells or 120 GB of memory. Therefore, if an application requires 30B cells, it should be split into two models. Here’s an example of how a split model decision can be made:
First, estimate the size of the application: List the main dimensions that will be used in each application and define the expected number of cells for each of the valid combinations of dimensions (these will be modules).
# of cells for the group
Customer: 80 (incl. hierarchy rollups)
Product: 1500 (incl. rollups)
Time: 36 months, 12 Q, 3 Year, 3 YTD
Version: 2 (Actual, Budget)
Line items: 50 metrics
Time (should be same for each group)
Versions (should be same for each group)
Total Application 1
Then summarize how many models will be needed for each application.
Estimated size in cells
Estimated size in GB
# required models
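The sizing arithmetic above can be sketched in a few lines of Python. This is an illustrative sketch, not an Anaplan tool: the function names are hypothetical, the dimension counts come from the example group above, and the 15B-cell ceiling is the expectation stated earlier. The bytes-per-cell figure is an assumption, derived only from the pairing of 15B cells with 120 GB (about 8 bytes per cell); real models may differ.

```python
import math

MAX_CELLS_PER_MODEL = 15_000_000_000  # 15B-cell expectation stated in the text
BYTES_PER_CELL = 120e9 / 15e9         # assumption: ratio implied by 15B cells ~ 120 GB

def group_cells(dimension_sizes, line_items):
    """Cells for one group = product of the dimension sizes x number of line items."""
    return math.prod(dimension_sizes.values()) * line_items

def models_required(total_cells, max_cells=MAX_CELLS_PER_MODEL):
    """Number of models needed to keep each one under the cell ceiling."""
    return max(1, math.ceil(total_cells / max_cells))

# Dimension counts taken from the example group above
example_group = {
    "Customer": 80,           # incl. hierarchy rollups
    "Product": 1500,          # incl. rollups
    "Time": 36 + 12 + 3 + 3,  # months + quarters + years + YTD
    "Version": 2,             # Actual, Budget
}

cells = group_cells(example_group, line_items=50)
size_gb = cells * BYTES_PER_CELL / 1e9
print(f"{cells:,} cells, ~{size_gb:.1f} GB, {models_required(cells)} model(s)")
```

Summing `group_cells` over every group in an application gives the total to pass to `models_required`; an application of 30B cells, for instance, comes out at two models, matching the rule stated above.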
The second test is user concurrency. If you have an application that requires a large user base to interact with it, a user concurrency test should be performed. As a general rule, user concurrency is approximately 10% of the total user community. Therefore, if you have a total user base of 1000, around 100 people will be on the live system performing tasks at any given time. It is usually unlikely that many more than that number would be accessing the system at the same time.
In some cases though, applications follow a set high concurrency pattern and this needs to be taken into account. For example, a weekly sales forecast may have 1000 users on the system, but very likely each Sunday (if forecasts are due Monday) the user concurrency will be quite high, maybe as high as 50–60%. Your processes and experience will determine exact concurrency in high traffic applications or periods. The best approach to get to the right number of users in a model is to test the concurrency with automated tests, and then with manual tests that include a large number of real users.
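As a rough planning aid, the concurrency figures above reduce to a one-line estimate. The Python sketch below uses a hypothetical helper name; the 10% default and the 50–60% peak range are the figures quoted in the text, and the result is only a starting point for the automated and manual tests, not a substitute for them.

```python
def expected_concurrent_users(total_users, concurrency_rate=0.10):
    """Estimate simultaneous users; 10% is the general rule quoted in the text."""
    return round(total_users * concurrency_rate)

# Typical load for a community of 1000 users
typical = expected_concurrent_users(1000)

# Weekly forecast deadline: the text suggests 50-60% concurrency
peak_low = expected_concurrent_users(1000, 0.50)
peak_high = expected_concurrent_users(1000, 0.60)

print(typical, peak_low, peak_high)
```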
First, start with User Acceptance Testing (UAT). In short, UAT involves human users simultaneously performing scripted tests inside the platform. During these tests, system behavior will be monitored and reported by each of the human testers, which may be provided via a user survey that is launched post UAT.
Then, automated testing can be performed in the platform. Automated testing simulates user actions across the platform. To do this, coordinate with the Anaplan QA Team to schedule automated testing of load, performance, and concurrency.
It is also important to monitor the server while the automated testing is in place to monitor memory and CPU usage. The Anaplan QA team can obtain server monitoring metrics as part of the model performance testing process. In either case, application tuning needs to happen to optimize for all conditions needed.
Multi-model application optimization
The application tuning lifecycle includes a 2-step, iterative tuning process that reoccurs during the model building process. Step 1 is carrying out the complete build. Step 2 is tuning at the application level (i.e. optimizing the design and the calculations or business rules) by Anaplan’s L3 Support team and the solution architect. You may also make additional platform level or code optimizations with the assistance of Anaplan’s engineering department on rare occasions.
This is step four of the model design process.
Next, your focus shifts to the inputs available. Remember that sometimes a dashboard is used to add information.
Using the information gathered in steps 1 through 3:
Identify the systems that will supply the data
Identify the lists and hierarchies, especially the hierarchies needed to parse out information for the needed dashboards/exports
What data hub types are needed?
Why do this step?
During this step, you should be thinking about the data needed to get to your defined output modules. All of the data in the system or in lists may not be needed. In addition, some hierarchies needed for the output modules may not exist and may need to be created.
Results of this step:
Lists needed in the model
Hierarchies needed in the model
Data and where it is coming from
Importing and exporting data in and out of Anaplan are fairly simple processes; however, they can significantly affect the platform’s performance, as well as business consistency.
These articles take a closer look at this topic.
How do exports impact platform performance?
How do exports impact business consistency?
How do I troubleshoot issues related to exports?
How do I create an export model?
Each time a user runs an import or an export, it affects platform performance, because the action blocks all other users of the model from performing any tasks while it runs. This triggers what is called a toaster message: a blue box at the top of the Anaplan screen that indicates to every connected user that the platform is processing an action.
Any person who frequently exports out of Anaplan will likely become very unpopular among the users of the model, especially if exports last more than a few seconds. Users who are not workspace administrators can:
Export data out of a module within a dashboard
Run an import prepared by an administrator
Run a process that an administrator has prepared.
The process can combine a number of imports and exports
First, the bigger your model is the more performance issues you are likely to experience. So a best practice is to use all the possible tools & features we have to make the model as small and dense as possible. This includes:
Line Item Checks: summary calculations, dimensionality used
Line Item Duplication
Granularity of Hierarchies
Use of subsets and line item subsets
More information on eliminating sparsity can be found in Learning Center courses 309 and 310.
General recommendations also include challenging your customer’s business requirements, whenever possible, when the customer requires large lists (>1M items), a long data history, or a high number of dimensions used at the same time for a line item (>5).
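The two numeric thresholds above (lists of more than 1M items, and more than 5 dimensions on one line item) lend themselves to a simple review checklist. The Python sketch below is a hypothetical helper, not an Anaplan feature: it takes the dimensions of a single line item as a name-to-size mapping that you would compile by hand, and returns the requirements worth challenging with the customer.

```python
LARGE_LIST_ITEMS = 1_000_000  # ">1M" threshold from the text
MAX_DIMENSIONS = 5            # ">5" dimensions per line item, from the text

def requirements_to_challenge(line_item_dims):
    """line_item_dims maps dimension name -> item count for one line item."""
    findings = []
    for name, size in line_item_dims.items():
        if size > LARGE_LIST_ITEMS:
            findings.append(f"large list: {name} ({size:,} items)")
    if len(line_item_dims) > MAX_DIMENSIONS:
        findings.append(f"{len(line_item_dims)} dimensions on one line item")
    return findings

# Illustrative line item (made-up sizes)
dims = {
    "SKU": 2_500_000,
    "Customer": 40_000,
    "Region": 12,
    "Channel": 6,
    "Time": 54,
    "Version": 2,
}

for finding in requirements_to_challenge(dims):
    print(finding)
```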
Once these general and basic sparsity recommendations have been applied, you can further improve performance in different areas. The articles below expand on each subject:
Imports and exports and their effects on model performance
Rule 1: Carefully decide if you let end-user import (and export) during business hours
Rule 2: Mapping Objective = zero errors or warnings
Rule 3: Watch the formulas recalculated during the import
Rule 4: Import List properties
Rule 5: Get your Data HUB
Rule 6: Incremental import/export
Dashboard settings that can help improve model performance
Rule 1: Large list = Filter these on a Boolean, not on text
Rule 2: Use the default Sort
Rule 3: Reduce the number of dashboard components
Rule 4: Watch large page drop downs
Formulas and their effect on model performance
Model load, Model Save, Model Rollback and their effect on model performance
User roles and their effect on model performance