Model Optimization Checklist

Use this checklist to ensure that your model aligns with best practices and Planual guidelines for optimal model performance.
Optimizing Your Model

Optimizing Your Model

Are you an expert model builder who enjoys tweaking your models so you can shave micro-seconds off load times? Maybe you’ve been assigned the task of taking a poorly performing model and getting it back up to speed. Or you are looking for a way to take your model building skills to the next level.

We’ve got some exciting news for you! We’ve looked at the principal ways that you can optimize model performance and created a checklist. It builds on what we’ve provided in the Planual and takes it a step further. The checklist helps you identify issues that may impact performance, explains why, and includes the actions you can take to fix them.

Use the checklist:

  • • When building new models to ensure that you are following best practices. Add going through the checklist as part of every sprint review.
  • • When performing a yearly review of models, as some COEs may require. Use the checklist to help complete the review.
  • • When you have a model that is experiencing a problem with performance. Some of the items in the checklist can be corrected easily, others may take some time.
  • • When coaching or mentoring new model builders. Encourage them to use the checklist to instill good model building habits right from the start.

How to use the checklist:

The checklist includes questions. Click the > to display more information. Each question topic includes more details about when and why this may be a problem, how to correct it, and links to articles and short lessons that provide more information.

Step 1 - Lists

Does the list use Time as part of the key?

When is this a problem?

This is most often seen in a transactional list when creating a unique value for an item to import to a list.


Why is this a problem?

Thinking about this from a list creation point of view, do you really want to create a new list item for every date, or do you want to create the list item once? If you include Time as part of the key, your list will be unnecessarily large.

 

How to correct:

Dates are data and should not be part of a code. Remove the dates and use time references outside of the list.

 

More information:

Planual 1.05-11

Check out the Transactional section of Data Hub Best Practices

Does the list use a combination of properties as a code?

When is this a problem?

Always.

 

Why is this a problem?

This practice taxes the system and leaves you without a code, making it much harder to map data when importing to the list.

 

How to correct:

Exception: If there is no code in the source, you'll have to use a combination of properties. This is rare and should always be a last resort. Work with the IT department to have the source system create a unique code.

 

More information:

Planual 1.05-04a

Does the list have a code?

When is this a problem?

All lists should have a code.

 

Why is this a problem?

A code makes loading and using a list more efficient, so it is best practice to include a code, especially for numbered lists.

 

How to correct:

Check source system availability for creating a source code, as this is often possible. If not, you can use Anaplan to create a code.

 

More information:

Creating Codes for Attributes 

String Concatenation

Does the list use Properties?

This is a problem when the properties don't meet the following use case needs:

  • Display Name for numbered lists
  • Creating actions using numbered lists
  • Export labels
  • Facilitating navigation to different dashboards (classic dashboards)
  • Creating driver and dependent lists

Why is this a problem?

List properties are the same as line items but have many limitations. List properties add to the overall size of the list upon load. It is easier to write formulas in modules using line items than it is to write formulas using list properties. There isn't an easy way to control the summary of properties. It is best to keep it simple. When possible, include these as line items rather than properties. This keeps the calculations in one place in the module.

 

How to correct:

A systems module that includes properties allows you to store the data in a central location that other model builders can use. The same can be said for a list, but keeping them in a module provides easier formula writing and the ability to control how they are summarized. To create the systems module, first, create a new list that includes the properties. Then, create a module that includes the list that the properties should be applied to. Format the line items using the list format and selecting the property list.

 

More information:

Planual 1.05-03

 

Are there subsets on larger lists?

When is this a problem?

This is a problem when large lists include subsets.

 

Why is this a problem?

Large lists that include subsets can cause performance problems. They add aggregations and add to the overall size of the model.

 

How to correct:

Instead of a subset, create a separate list.

 

More information:

Planual 1.06-02

 

Does the subset list follow proper naming conventions?

When is this a problem?

Naming conventions should be used for all subset lists.

 

Why is this a problem?

Subset lists that do not use naming conventions can cause confusion. Without proper naming conventions, it can be difficult to determine which list the subset belongs to. This becomes more important when using the subset during the creation of a module.

 

How to correct:

Use these guidelines when developing a naming convention for subsets:

  • A prefix to indicate that a list is a subset of a larger list.
  • The name of the list where the subset is defined.
  • A colon followed by a brief description of the subset.

 

More information:

Planual 1.06-01

Anapedia: Naming Conventions

 

Does the list include a top level when it is not necessary?

When is this a problem?

When you have an unnecessary top level on a list that does not need to be summarized. Examples include a transaction list, currency list, or any other list you are using for an indicator where you do not need to see a top level or summary.

 

Why is this a problem?

When you have an unnecessary top level on a list, the engine runs calculations and recalculations that can cause a decrease in performance. This happens when summary settings are left on. Any change in a value will cause a recalculation for the summary on the top level.

 

How to correct:

Remove the parent of all from the list.

 

More information:

Planual 1.05-05

Planual 1.05-06

Planual 1.05-07

Understanding Top Level

Top Level Item and Parent Hierarchy

 

Step 2 - Modules

What dimensions are used in the module? Are they all necessary?

When is this a problem?

This is a problem when you have unnecessary dimensions—meaning that the logic or data in the module does not apply to the dimension.

 

Why is this a problem?

Unnecessary dimensions in a module could cause poor performance from a formula having to perform unnecessary calculations and increasing the size of the model. It can also hinder the end-user experience by causing them to have to use unnecessary context selectors, which can lead to confusion or cause extra difficulty trying to get to their desired view.

 

How to correct:

Remove the unnecessary dimensions. This could cause your formulas to become invalid and thus cause an error. If this is the case, set your format to No Data, remove the dimension, update your formula, and then reset the format. You may have to modify downstream line items if they use the current line item as a mapping or variable.

 

More information:

Model Size and Dimensionality

The Truth About Sparsity: Part 1

What time dimensions are being used in the module? Can a Time Range be used?

When is this a problem?

This is a problem in models where the time requirements of the module differ from the Model Calendar. This could mean that the year requirements are outside of the model calendar or do not need all of the years in the model calendar. It is also a problem to include unnecessary time summary settings.

 

Why is this a problem?

The model quickly becomes very large.

 

How to correct:

Identify the requirements and build a proper time range to account for this. Remember: Time ranges will not dynamically update when you adjust your current fiscal year. Be sure and have an end-of-year process to update your time ranges.

 

More information:

Introduction to Time Ranges

 

Are the dimensions in order?

When is this a problem?

This is a problem in all models. During a model build, lists are organized in the way that they are added.

 

Why is this a problem?

Through research, we discovered that dimensions that are not in order can make the engine work harder.

 

How to correct:

On the Modules pane (Model Settings>Modules) look for lists that are not in order in the Applies To column. Click the Applies To row that you want to re-order, then click the ellipsis. In the Select Lists dialog, click OK. In the Confirm dialog, click OK.

 

More information:

Dimension Order

 

If there are any saved views, do they use the correct filtering?

When is this a problem?

This is a problem when saved views use multiple filters per tab.

 

Why is this a problem?

Multiple filters on a tab can cause performance issues. The data is read for the first filter, then those results are read for the next filter, etc.

 

How to correct:

Filters should be set up using Boolean values and have only one line item filtering each dimension. If you need to, consolidate filtering line items into one line item. Be sure to use the Notes section to identify that this line item is used in a filter.

 

More information:

Filtering Best Practices

 

If there are saved views, are all of the line items in the view necessary?

When is this a problem?

This is a problem when the view includes line items that aren't needed.

 

Why is this a problem?

The extra line items can impact performance because the view has to render the extra line items. For example, if three line items are needed, but 25 line items are defined, performance will be impacted.

 

How to correct:

Remove line items that aren't needed. Use Show on desired columns, instead of Hide on non-desired columns, so new line items are not added to the saved view.

 

Is the default view clean?

When is this a problem?

This is a problem when the default view has filters applied, hidden list or line items, or includes conditional formatting or sorting.

 

Why is this a problem?

The default view of the module should always be a blank canvas for any model builder to modify. The expectation is that nothing is hidden or modified.

 

How to correct:

Sorting, Filtering, and conditional formatting can be seen in the module toolbar and can be easily identified and changed. To view hidden list or line items, use Show All Rows or Show All Columns to fix this. Keep in mind, you may have hidden list or line items in your Page Selectors. Use the pivot to view to fix this if necessary.

 

More information:

Planual 2.04-03

 

Step 3 - Line Items

Which line items do not need a summary?

When is this a problem?

When a new line item is created, a summary method is added by default. Many line items do not need summary calculations.

 

Why is this a problem?

Running unnecessary calculations can increase the time it takes to open a model.

 

How to correct:

When creating a new line item, make it a habit to turn off summarizing. If a summary is needed, then turn the summary back on.

 

More information:

Planual 2.03-01

Are text formatted line items overused?

When is this a problem?

Anaplan is optimized for numbers and Booleans.

 

Why is this a problem?

Text uses more memory than most other model components and does not perform as well as other data types. Text limits capabilities and impacts model size.

 

How to correct:

When possible, convert text data to another data type. Check out the Anapedia article and the lesson referenced here for more information.

 

More information:

Planual 2.03-02

Anapedia: Convert between Data Types

Training: Converting Data Formats

Are there more than 50 line items in a module?

When is this a problem?

When any module includes more than 50 line items.

 

Why is this a problem?

It goes against the best practice, which is to not include more than 50 line items in a module.

 

How to correct:

Break up the module into separate modules, especially if any of the line items use subsidiary views.

 

Are there excessive subsidiary views?

When is this a problem?

Subsidiary views should be avoided for data that is used in calculations.

The following are the situations where a subsidiary view can be used (note that they all are related to how the data displays):

  • Show an attribute for end-user analysis, filtering, or sorting.
  • Show an alternative hierarchy for a dimension.
  • Display numeric values, such as ratios, that are not needed for other calculations.
  • Filter reports or provide attributes for exports.

Why is this a problem?

Subsidiary views are hard to locate and audit in a model. If a line item with a subsidiary view is published to a classic dashboard, the data cannot be edited.

 

How to correct:

Think about adding a new module that includes the dimensions of the subsidiary view. See the Subsidiary Views lesson for examples.

 

More information:

Subsidiary Views 

Are there line items used as headers where the format is not set to No Data?

When is this a problem?

Line items that do not follow this convention.

 

Why is this a problem?

Using line items as headers without the format set to No Data causes unnecessary calculations.

 

How to correct:

Change the format to No Data to avoid unnecessary calculations. See the article on clearing data for more information on how the No Data format can also be used to clear data in Anaplan.

 

More information:

Planual 2.03-03

How to Clear Data in Anaplan

Step 4 - Formulas

Are there line items that can be placed in a systems module rather than within a calculation module?

When is this a problem?

Look for functions that result in data that doesn't change (i.e., the parent of a line item). Placing these functions in a systems module means that the calculation is run once and can be referenced by multiple line items in multiple modules. Look for: PARENT(PARENT(PARENT(line item))), text field concatenations, and time functions such as START(), CURRENTPERIODSTART(), CURRENTPERIODEND(), etc.

 

Why is this a problem?

Calculations that are repeated can slow down performance. When these types of functions are included as part of longer formulas, think about how many times they may run if they are included in multiple line items and for multiple list members. It is better for performance for the calculation to run once and be referenced many times.

 

How to correct:

Locate the functions that can be moved to a systems module. Create the systems module, add the function as a line item, and then change the original calculation to reference the line item in the systems module.

 

More information:

Planual 2.02-18

Planual 2.02-03

Planual 2.01-04

Creating Systems Modules

Build a Time Systems Module

Reduce Calculations for Better Performance 

Are there any formulas that that combine text strings?

When is this a problem?

The use of text strings in Anaplan cannot be avoided, so you must ensure that the joins are done in the most efficient way to minimize the impact on performance. Be sure to read the Formula Optimization article. You can find the link in the More Information section below.

 

Why is this a problem?

Combining text strings takes a lot of memory and can impact performance.

 

How to correct:

Create systems modules that only include the dimension needed. For example, if you want to create a code that consists of an item code and a company code joined together, you would create three systems modules. One that holds the item data, another that holds the company data, and a third that holds the combination of the two.

 

More information:

Formula Optimization

String Concatenation

 

Are there any formulas that are daisy chains?

When is this a problem?
Daisy chain formulas are always a problem. Daisy chains happen when you have a line item in module A that is referenced in module B, which is then referenced in module C, and so on.

daisy chain.png
This creates a situation where there is a long dependency chain that has to be calculated. Instead, each module should directly reference the data. The line item in module A is referenced directly in module B and is also referenced directly in module C. 

Recommended App.PNG

Why is this a problem?

Daisy chain formulas require the engine to work harder. Whenever one data point that is used in the calculation changes, the entire sequence must be re-calculated.

 

How to correct:

Break up these formulas. Place parts that need to be referenced multiple times in a systems module, creating a single source of truth.

 

More information:

Planual 2.02-19

This article explains daisy chains

 

Are there long formulas, such as complex IF THEN ELSE statements?

When is this a problem?

If you can't explain the purpose of a formula in one sentence, the formula is too long.

 

Why is this a problem?

Long formulas, such as complex IF THEN ELSE statements, are difficult to understand, audit, and maintain. Complex formulas use valuable processing time and require the engine to work harder than necessary.

 

How to correct:

Simplify complex IF THEN ELSE statements and put the most common condition first. Break up formulas into multiple line items. If your formula includes more than 10 IF THEN calculations, think about using a LOOKUP. In some situations, you can also consider using a line item subset.

 

More information:

Planual 2.02-17

Planual 2.02-18

Formula Structure for Performance

Thinking Through Your Formulas

Line Item Subsets Demystified

Manipulating Data with Line Item Subsets

Decreasing the Length of Your Formulas

Is there an excessive use of the FINDITEM function?

When is this a problem?

The FINDITEM function requires a lot from the engine, as it is attempting to find an item in a list and it does this for every cell that is related to that line item. This is especially true when the list includes null values. Excessive use depends on your model and the lists it is being run against.

 

Why is this a problem?

Using FINDITEM excessively makes the engine work harder than necessary.

 

How to correct:

Check the data for null values before running FINDITEM. So where prod_text is the name of the list, the formula to check for null values is IF ISNONBLANK(prod_text) THEN FINDITEM(prod_text) ELSE blank. This function is often used when loading data. Think about loading the data into spoke models from the data hub as list formatted line items rather than text.

 

More information:

Planual 2.02

 

Are there SUM and LOOKUP combinations?

When is this a problem?

Having SUM and LOOKUP in the same line item is always a problem and can cause performance issues.

 

Why is this a problem?

Using SUM and LOOKUP in the same formula generally causes large formula calculations, especially if time is a dimension or when the source and target structures are very different.

 

How to correct:

Split SUM and LOOKUP into different modules and line items.

 

More information:

Planual 2.02-08

The LOOKUP Function

How to Use the SUM Function

Compound Functions: SUM, LOOKUP, SELECT

Is the TEXTLIST function used?

When is this a problem?

Always a problem.

 

Why is this a problem?

Using TEXTLIST requires a lot of memory.

 

How to correct:

Using two-dimensional modules and Boolean flags with ANY is a good alternative. Click here to see an ANY instead of TEXTLIST document. Use FIRSTNONBLANK and LASTNONBLANK with ANY and ALL.

 

More information:

Planual 2.02-11

Anapedia: x[ANY:y]

Do any formulas use the SELECT function, especially with time?

When is this a problem?

Using SELECT is considered hardcoding and not recommended.

It is OK to use SELECT for versions and general time periods.

 

Why is this a problem?

Using SELECT is hard coding and not recommended. When you use SELECT with a specific list item or time, you limit the ability to update those list items. This is especially an issue with time because a change will be required every year. You’ll need to locate every line item that references the SELECT function and make updates. These changes can only be performed by a Workspace Administrator (WSA), compared to the solution we recommend, which allows an end user to make changes.

 

How to correct:

Create a module with no dimensions to hold assumptions for Time, and other "SELECT" values or use LOOKUP instead of SELECT. For example, in this module, create a line item called Current Year, format it as a Time Period and select year. You can then use a lookup formula to reference this line item. When the current year changes, you update the value in that line item and it will automatically update all the formulas that reference it.

 

More information:

Planual 2.02-14

Using SELECT

Planual 2.02-12

Avoiding Hard Coding

Do any of the line items use conditional formatting with summaries turned on?

When is this a problem?

This is a problem because conditional formatting is done at the detail level, not the parent level.

 

Why is this a problem?

The summary is unnecessary.

 

How to correct:

Turn off the Summary settings.

 

More information:

Planual 2.03-01

 

Step 5 - Actions

Are all the actions necessary?

When is this a problem?

When there are unused actions that haven't been deleted.

 

Why is this a problem?

It is very hard to audit which actions are being used.

 

How to correct:

Remove any unused actions.

 

More information:

Planual 5.01-04

Best Practices for Model Cleanup

Do any of the actions result in errors when they are run?

When is this a problem?

When you receive a yellow exclamation point, that is an error. You may also see warnings. Both errors and warnings need to be corrected. The action or process should show all green checkmarks.

 

Why is this a problem?

Imports will run, but the errors and warnings will result in missing data.

 

How to correct:

Always read your errors and warnings and correct them. If you ignore errors or warnings, you run the risk of missing and/or incorrect data. Remember that in many cases, a model is a single source of truth, and mistakes with data will cause users to distrust the model.

 

More information:

Planual 5.04-04

Planual 5.04-07

Planual 5.04-09

Import Errors

 

Check the data sources. Are some of them lists or modules?

When is this a problem?

All sources of data should either be a file or module view, not a list or module.

 

Why is this a problem?

When using a list or module as a data source, there are no filtering mechanisms available.

 

How to correct:

It is a recommended best practice to use a saved view when importing data from one model to another, especially for data loads that are repeated or updated frequently as part of an action or process. See the lessons Filtering Best Practices and Saved Views for Importing for more information. You may want to rename the saved view source to reflect what the saved view represents.

 

More information:

Planual 5.04-08

Filtering Best Practices

Saved Views for Importing

Data Hubs: Purpose and Peak Performance

 

Check the data sources. Does each data source have an action associated with it?

When is this a problem?

Data sources without an associated action are unnecessary.

 

Why is this a problem?

It can be confusing to have data sources without actions. It is best to keep things tidy.

 

How to correct:

Delete data sources that do not have actions associated with them.

 

More information:

Planual 5.03-01

Anapedia: Naming Conventions 

Step 6 - Time

Is the main time scale of the model larger than it needs to be?

When is this a problem?

Each year that is included in the model calendar increases the size of the model. When used for planning purposes, it's likely that only certain data sets are needed, leaving many empty cells.

 

Why is this a problem?

It is unnecessary to increase the model calendar years when Time Ranges are available.

 

How to correct:

Use Time Ranges for data that are earlier or later than the current model calendar.

 

More information:

Anapedia: Time Ranges

Introduction to Time Ranges