Architecture and Modeling Mistakes: Deep Dive
In today’s post, we’re doing a deep dive on the Anaplan Live! session: Top 10 Anaplan Implementation Mistakes and How to Avoid Them. Focusing on the Architecture and Modeling section, we’ll review a few best practices discussed and illustrate how we can leverage them in our daily model-building activities.
Why You Need a Data Hub
Before any implementation, one of the first models to set up is the Data Hub. Some Anaplan implementations have gone without a Data Hub, but this inhibits the Anaplan ecosystem from driving Connected Planning. Specifically, not having a Data Hub weakens the single source of truth, end-user experience, performance, and segregation of duties, and it limits the ability to reuse data sets in future models.
- Single source of truth: Creating a separate model for data and dimensions establishes a common source of the truth. The model builders/admins can leverage the Data Hub for hierarchies, metadata, and data needed to build spoke models. The Data Hub should not be used for analytical or reporting purposes.
- End-user experience: The spoke model should contain only the calculations and reports the end user wants to see. Manipulation, validation, and consolidation of data should happen in the Data Hub, which ensures that end users of the spoke model don’t experience downtime during large file uploads.
- Performance: Packing too much into one model hurts performance and is rarely necessary. One important point in the Planual is ‘Just because you CAN doesn’t mean you SHOULD.’ Just because we can store data and run business calculations in the same model doesn’t mean we should, as doing so adversely impacts the model’s performance.
- Segregation of duties: Data and metadata may need to be mapped or lightly cleaned in Anaplan. Ideally, any modifications to source data should occur upstream (before it gets into Anaplan). However, in the initial phases, some of this may need to occur in Anaplan. The Data Hub should have a workspace of its own for better security and easy handling of admin tasks.
- Drive future builds: Future builds can be accelerated by a) the availability of source data and b) consistent naming of list members across business functions. For example, a source system’s data loaded into the Data Hub for model one may also be needed for model four, and having that data in Anaplan already will shorten model four’s implementation timeline.
Common Mistakes in Data Hubs
After deciding to set up a Data Hub, there are some key mistakes to avoid. Sometimes, after setting up a Data Hub, the team notices that it takes up a significant amount of space and that processing times increase. This usually happens when there are composite hierarchies in the Data Hub. Below are some best practices to follow to avoid cluttering the hub:
- Avoid building composite hierarchies in the hub: One of the best practices is to build the hierarchies directly in the spoke models using saved views from the Data Hub (Figure 1.1, 1.2). This avoids unnecessary duplication of metadata. The lists should be kept flat in the Data Hub. One of the major reasons for building hierarchies is to analyze and validate the data. If a detailed analysis is required, hierarchies can be created and removed after validation is complete.
Figure 1.1 Example of L4 saved views which will be used for the spoke model
Figure 1.2 Example of L3 saved views which will be used for the spoke model
- Transactional code loading: Best practice is to load and store data using transactional lists. Each member of such a list is a unique ID that combines the dimensions it stores. In the example below, hospital, vaccine, and dosage are stored in the transaction. These are flat lists, and the codes typically contain a delimiter such as “_” or “|”. Three different modules are involved. The first is used to parse out the dimensions from the transactional code member (Figure 2). The second is used to import the data against transactional codes (Figure 3). Lastly, the third module is fully dimensionalized (Figure 4). An index list is often used when a transactional code cannot be produced. However, it is not recommended to use an index list to store data; loading data using an index list slows model processing and takes up additional space.
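The three-module flow described above can be sketched outside Anaplan to make the data movement concrete. The Python below is only an illustration, not Anaplan syntax: the code layout (hospital, vaccine, dosage), the "_" delimiter, and every member name and quantity are hypothetical assumptions.

```python
from collections import defaultdict

DELIMITER = "_"  # assumed delimiter; "|" would work the same way


def parse_code(code: str) -> tuple[str, str, str]:
    """Module 1 analogue: split a transactional code into its dimensions."""
    hospital, vaccine, dosage = code.split(DELIMITER)
    return hospital, vaccine, dosage


def dimensionalize(loaded: dict[str, float]) -> dict[tuple[str, str, str], float]:
    """Module 3 analogue: re-key the flat load by (hospital, vaccine, dosage)."""
    cube: dict[tuple[str, str, str], float] = defaultdict(float)
    for code, value in loaded.items():
        cube[parse_code(code)] += value
    return dict(cube)


# Module 2 analogue: data imported against the flat transactional list.
# All codes and quantities here are made up for illustration.
loaded = {
    "H001_VaxA_Dose1": 120.0,
    "H001_VaxA_Dose2": 95.0,
    "H002_VaxB_Dose1": 40.0,
}

cube = dimensionalize(loaded)
```

Because each code already carries its dimensions, the load itself stays flat and the dimensional view is derived from it, which is the point of keeping transactional lists free of hierarchy in the hub.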
Not Establishing Model Building Standards
Anaplan does not have the functionality to show which line items are used for conditional formatting or filters. As a result, during model clean-up or the removal of duplicate/redundant line items, these line items can be deleted by mistake. Deleting a line item used for conditional formatting or filtering breaks the final views on the dashboards/pages and causes rework. This can be avoided by following a few standard naming practices:
- Creating separate filter modules: One of the best approaches is to store all filters in a separate, central system module. Each filter should be named according to its use case combined with the module/SV where it is applied (Figure 5).
- Conditional formatting sections: Every module that uses conditional formatting should have a staging section for it. Line items in this section should begin with ‘CF:’ or ‘IND:’, followed by the name of the line item they are used to format. These line items must be the ones used for conditional formatting in all modules/saved views. Further, the notes section of these line items can be used to describe their purpose (Figure 6).
Not Leveraging Syntax Correctly
Model builders can make mistakes when applying syntax. It is important to confirm not only that the syntax works for one intersection but, through thorough testing, that it works for all data points. In addition, just because the syntax is accepted by the Anaplan model does not mean it is being used appropriately. Below are some common syntax mistakes.
- Production Data and SELECT[ ] vs LOOKUPS: In L1 and L2 training, model builders learn to be careful with the SELECT statement because it causes hard coding. The only instances where SELECT is permitted are with native versions, Time.All Periods, and the top level of composite lists. If SELECT is used in other instances and the list whose member it references becomes a production list, Anaplan will throw an error. To avoid this, it is best to create a module that stores all the required list members as list-formatted line items. The module can be called CENTRAL LOOKUPS. These line items can then be used in LOOKUPs instead of a SELECT.
- Improper use of IF ISACTUALVERSION() vs IF SYS Time < CURRENTPERIOD()
IF ISACTUALVERSION() uses a Boolean function to return actual or forecast values based on the version’s switchover period. Note that this only applies when the module has native versions applied to it.
IF SYS Time < CURRENTPERIOD() is an IF ELSE condition that returns values based on the current period of the model. Any period earlier than the model’s current period returns the value specified in the condition. The important difference is that IF ISACTUALVERSION() is dynamic to each version’s switchover period (Figure 7), whereas the current period is not dynamic to versions: there is only one current period for the model (Figure 8).
- Early exit: It is always better to put the most probable condition first so that the backend calculation does not have to work through the entire IF ELSE chain. This improves model performance, gives faster results, and reduces waiting time when there are many conditions. When there are many conditions, it is also good practice to split them across different line items for a faster backend calculation.
- Improper use of IF THEN ELSE: Sometimes multiple IF ELSE statements are written within a single formula instead of leveraging a mapping module and a LOOKUP. This usually happens when a different condition has to be written for every list item. It is always best to create a mapping module dimensioned by the list for which the IF ELSE condition would be written. In the mapping module, create a line item that maps this condition (either manually or by formula). Use this line item in the final calculation module to look up the data through the mapping. This removes the unnecessary chain of conditions and reduces the time taken to run the calculation.
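The difference between a chain of IF ELSE conditions and a mapping lookup can be illustrated with a small sketch. This is Python rather than Anaplan formula syntax, and the dictionary is only an analogy for a mapping module read through a LOOKUP; the region names and rates are hypothetical.

```python
def rate_nested_if(region: str) -> float:
    """Nested IF ELSE style: one hard-coded branch per list member."""
    if region == "North":
        return 0.05
    elif region == "South":
        return 0.07
    elif region == "East":
        return 0.04
    else:
        return 0.0


# Mapping-module style: one entry per list member, maintained as data.
RATE_BY_REGION = {"North": 0.05, "South": 0.07, "East": 0.04}


def rate_lookup(region: str) -> float:
    """Single lookup against the mapping; unmapped members fall back to 0."""
    return RATE_BY_REGION.get(region, 0.0)
```

Both functions return the same values, but adding a new region to the first means editing the formula, while the second only needs a new entry in the mapping.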
We hope you found these pointers to be valuable. Remember to check out all the past Anaplan Live! sessions to learn more, no matter where you are in your model-building journey. Drop us a note below to let us know if this was helpful and what you’d like to see in a future deep dive.
Great work here @ayzoha! Really got into the details and expanded on the topics @JaredDolich, @SonalTripathi, and I briefly touched on during the Anaplan Live session.
Thank you, @Brett.Francis! Wouldn't have been possible without your help and mentoring.
Good article with pointers to keep in mind.
One question though: what is an index list as compared to a transactional list, and how does it take more memory space and calculation time?
@david.savarin Thank you for pointing this out. The understanding here is that an index list is any list with a top level set on it, which is used to stage data. This adds an additional calculation loop to compute everything at the summary level as well, which reduces processing speed.
Also, every saved view needs to be updated to "Select levels to show", to remove the top level, to avoid errors in the actions.
A good update to the post would be to explain what the index list here means. I will make that update.
Great insights! Thanks for the info @ayzoha