Your Anaplan workspace allowance is tied to your subscription rate, so it pays to build efficient models. A model might meet your business analysis and planning needs well but still cost you because of its excessive size. Anaplan's cloud solution offers enviable levels of data cube capacity and calculation performance across multiple dimensions, but this does not mean you should ignore model size when designing and building models. Anaplan offers several robust modeling features for keeping models lean and efficient. Use this section to learn how to manage your model size and avoid overheads, such as unnecessary data duplication or multiplication of empty data cells (sparsity).
There are some quick and easy checks you can make to keep model size under control as you build your models and add modules. There are also some very useful built-in features and modeling techniques, all designed to help you avoid unnecessary increases in model size:
Line Item Checks
Review the “summary” setting on line items. If a line item's summary is not being used, set it to “None”. See Summary Methods.
Review the dimensionality of line items. Often, line items don't need to have the same dimensions as the module. Also, look at the timescale and versions dimensions to see whether some line items need not have these dimensions applied, even though the module does.
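The effect of trimming a line item's dimensions can be estimated with simple arithmetic. The sketch below assumes, as a simplification, that a line item's cell count is roughly the product of the sizes of the dimensions applied to it. The dimension names and sizes are hypothetical and not taken from any real model.

```python
from math import prod


def cell_count(dimension_sizes):
    """Approximate cells for one line item: the product of its dimension sizes."""
    return prod(dimension_sizes.values())


# Hypothetical dimension sizes, for illustration only.
module_dims = {"Products": 200, "Regions": 50, "Months": 12, "Versions": 3}

# A price that varies only by product does not need the other dimensions.
price_dims = {"Products": 200}

full = cell_count(module_dims)     # line item inherits every module dimension
reduced = cell_count(price_dims)   # line item keeps only the dimension it needs

print(f"Inheriting module dimensions: {full:,} cells")
print(f"Product dimension only:       {reduced:,} cells")
print(f"Cells saved:                  {full - reduced:,}")
```

Under these assumed sizes, a single line item that drops the region, time, and version dimensions accounts for a small fraction of the cells it would otherwise occupy.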
Line Item Duplication
Try to limit duplication between modules. If a line item is duplicated between modules, then make sure there's a valid reason (such as different permissions).
Granularity of Hierarchies
Review any hierarchies you have built into your model to ensure the level of granularity is adequate and appropriate to represent the data. If you have hierarchies that allow a finer granularity than the data requires, this will prove costly in terms of model size. Keep hierarchy granularity as coarse as the data requirements allow.
Using Line Item Subsets
Line item subsets let you group together a set of line items that belong to one or many modules. When you create a line item subset, you can use it as a normal list—either as a dimension in another module or for list formatting a line item.
Use a line item subset to:
Re-use existing line items in other modules and avoid duplication of those line items. See Line Item Subsets for details of how to create and use line item subsets.
Serve as a dimension on a module. You can then use the COLLECT function to pull in data values from the source modules to which the original line items belong and avoid unnecessary data duplication. See the COLLECT page for details of how to do this with examples.
Using Numbered Lists
When you create a numbered list, each item in the list is assigned a unique, system-generated ID number and the display name for the item is optional. Numbered lists allow you to avoid data sparsity in your models. Sparsity occurs when data cells that will always remain empty appear in your modules. These empty cells increase the size of your model. For example, a common use case that can generate sparsity is a module that tracks Product Sales by Sales Rep by Customer. Most Sales Reps will sell more than one Product to more than one Customer, but it is very unlikely that any single Sales Rep will sell all Products to all Customers. Consequently, many cells will always be empty: those that sit at the unused intersections of Product, Sales Rep, and Customer.
You can use a numbered list to prevent this sparsity overhead by representing only those intersections that you know are valid combinations and will hold data. A numbered list lets you ensure that your module's line items cover the valid data combinations and only the valid combinations.
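The saving can be illustrated with a small Python sketch. It is an analogy rather than Anaplan functionality: the lists, the sales records, and the way IDs are assigned are all hypothetical, chosen only to compare a fully dimensioned module with a numbered list of valid combinations.

```python
from itertools import product

# Hypothetical lists, for illustration only.
products = ["P1", "P2", "P3", "P4"]
sales_reps = ["R1", "R2", "R3"]
customers = ["C1", "C2", "C3", "C4", "C5"]

# Hypothetical sales records: (product, sales rep, customer, amount).
sales = [
    ("P1", "R1", "C1", 120.0),
    ("P2", "R1", "C2", 80.0),
    ("P1", "R2", "C3", 45.0),
    ("P4", "R3", "C5", 60.0),
]

# Fully dimensioned module: one cell per Product x Sales Rep x Customer.
dense_cells = len(list(product(products, sales_reps, customers)))

# Numbered-list style: one ID per combination that actually carries a value.
numbered_list = {
    item_id: (prod_code, rep, customer)
    for item_id, (prod_code, rep, customer, _amount) in enumerate(sales, start=1)
}
sparse_cells = len(numbered_list)

print(f"Cells with three dimensions: {dense_cells}")
print(f"Cells with a numbered list:  {sparse_cells}")
print(f"Cells that stay empty:       {dense_cells - sparse_cells}")
```

In this toy data, most cells in the three-dimensional layout would never hold a value, which is the overhead a numbered list of valid combinations avoids.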
Numbered lists prevent sparsity by representing, in a hierarchical structure, only the valid combinations (those that will carry a value) of what would otherwise have been intersections of multiple dimensions. This is one strategy for limiting dimensions when adding modules and building out your model. Managing dimensionality to limit dimensions and preserve data density is a key part of creating efficient models:
When deciding which modules you need in a model, try to design your data modules to respect a "natural dimensionality", so that they express the intrinsic relationships between data sets, for example, between cost-center data and employee data.
If you find you are having to use more than 5 dimensions in a module, it is worth reviewing the purpose of the module and asking if you really need two separate modules instead.
Design modules to serve a general role within an overall model structure and to serve data management:
Source modules, by carrying relatively few dimensions, can be kept data-dense and avoid sparsity.
Results modules can calculate specific outcomes tailored to specific purposes and based on data gathered from source modules.
Data management across a model's lifecycle can all be done at the level of the baseline data-source modules and data updates to baseline data are immediately propagated through to results modules.
The SUM aggregation function and the LOOKUP function are designed to facilitate this modeling approach, because you can use them to pull and aggregate data from data-dense source modules into results modules. See the SUM and LOOKUP pages for details of how to use these functions with examples. These functions can also be combined. See SUM & LOOKUP for details and examples.
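As a rough analogy only, the Python sketch below mimics the idea of aggregating and looking up values through a mapping between two source data sets. It is not Anaplan formula syntax. The helper names `sum_by` and `lookup_by`, and all of the data, are hypothetical.

```python
from collections import defaultdict

# Hypothetical source data held by employee.
salary = {"Ann": 50000, "Bob": 42000, "Cy": 61000}

# Hypothetical mapping from each employee to a cost center.
cost_center_of = {"Ann": "CC1", "Bob": "CC2", "Cy": "CC1"}

# Hypothetical source data held by cost center.
overhead_rate = {"CC1": 0.20, "CC2": 0.15}


def sum_by(values, mapping):
    """SUM-like: aggregate source values onto the items named by the mapping."""
    totals = defaultdict(float)
    for item, value in values.items():
        totals[mapping[item]] += value
    return dict(totals)


def lookup_by(values, mapping):
    """LOOKUP-like: for each item, fetch the source value of its mapped item."""
    return {item: values[target] for item, target in mapping.items()}


# Results are derived from the source data instead of being stored again.
salary_by_cost_center = sum_by(salary, cost_center_of)
rate_by_employee = lookup_by(overhead_rate, cost_center_of)

print(salary_by_cost_center)
print(rate_by_employee)
```

The point of the pattern is that each value is stored once, in a compact source structure, and results are calculated from it when needed.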
Ranking Modules by Size
You can check the relative footprint of modules in a model at Settings > Modules. Under Cell Count, you can read off each module's size.
A good way to identify the modules that have the most significant impact on model size is to export the module list. Go to Settings > Modules and export to Excel. Once exported, you can sort from highest to lowest on cell count.
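If you prefer to rank the exported data with a script, a minimal sketch follows. It assumes the export has been saved as CSV with columns named `Module` and `Cell Count`. Those column names and the sample rows are assumptions for illustration, so adjust them to match your actual export.

```python
import csv
import io

# Hypothetical export contents; in practice, open the saved CSV file instead.
export_csv = """Module,Cell Count
Revenue Forecast,1200000
Employee Data,45000
Sales by Rep,9800000
Cost Center Summary,3600
"""

rows = list(csv.DictReader(io.StringIO(export_csv)))

# Sort from highest to lowest cell count.
ranked = sorted(rows, key=lambda row: int(row["Cell Count"]), reverse=True)
total = sum(int(row["Cell Count"]) for row in rows)

# Print each module with its cell count and its share of the total.
for row in ranked:
    cells = int(row["Cell Count"])
    print(f'{row["Module"]:<22}{cells:>12,}{cells / total:>8.1%}')
```

Showing each module's share of the total makes it easier to see which modules are worth reviewing first.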