Module sparsity leading to size and performance impact. We are looking for alternative approach.
Dear community,
I am having issues with modules I created to analyze the planning results. Due to the high sparsity of the data model, I get a lot of cell combinations, most of them empty but leading to high memory and performance impact. I will explain the model setup that leads to the issue, hoping someone knows how to enable the analysis without having the impact performance.
The model is used to assign Sales Representatives to Customers and Product Groups in the form of:
Customer A, Product Group 1 --> Sales Rep X
Customer A, Product Group 2 --> Sales Rep Y
Customer B, Product Group 1 --> Sales Rep X
Customer B, Product Group 2 --> Sales Rep Z
and so on.
The module 1 for the planning therefore applies to the lists Customer and Product Group and has the Sales Rep as a line item.
So far, so good. Now after the assignments have been done, we want to give the planners the possibility to show the workload per Sales Rep in the form of:
Sales Rep X, Customer A, Product Group 1 --> Sales Value, Count
Sales Rep X, Customer B, Product Group 1 --> Sales value, Count
Sales Rep Y, Customer A, Product Group 2 --> Sales value, Count
Sales Rep Z, Customer B, Product Group 2 --> Sales value, Count
and so on.
To enable this analysis, we had to create a module 2 which applies to the lists Sales Rep, Customer and Product Group and has the invoiced value as well as the Count (whether this combination is assigned into Module 1). Then we filtered the output of this module to all entries which have a count of greater than 0.
However, this leads to module 2 generating a lot of combinations which are empty, because of course not all Reps are serving all customers and not all custoers are offering all products.
In the example above, we have the following line counts:
Module 1: 2 Customers * 2 Products = 4 Cells
Module 2: 2 Customers * 2 Products * 2 Sales Reps = 8 Cells
Now in our case we have a lot of customers and Reps and also some more product groups, leading to the following numbers:
Module 1: 74 Mio. Cells
Module 2: 3,700 Mio. Cells
The model is now at about 5 billion cells in total, consuming about 20 GB mostly due to the high sparsity module 2 described above. This is dangerously close to the internal limits of Anaplan and also impacts performance.
I currently do not see a way to enable the analysis without having a module which multiplies all my lists producing high sparsity. Therfore I reach out to the community for alternative ideas.
Thanks for any suggestions!
Answers
-
The best way to acheive this would be to use Numbered Lists, for example you could merge Customers and Products into a single list. The list should only contain the required combinations of Products and customers and Sales Rep could be a parent in the hierarchy. There is plenty of reading material around use of numbered lists but for initial creation:
Create a new list called #Customer-Product. In Settings > General Lists check the numbered list check box for that list. When you add list items it will generate unique names for these list items starting from #1 you can overide these names with a Display Name which could be your customer-product combination. Add a list property called Display Name and make it TEXT format. After the numbered list check box you will be able to select the property Display Name to rename the list items in your list. You can create the list by loading a CSV file with the Customer-Product combinations you require into the list and have a Sales Rep as Parent and the name that you want to see in the list as Display Name mapped to the Display Name. This will then create the items in the list that you require. You could also add properties for Customer and Product if you want to populate these.
Alternatively, you can also create a formula to pick up the Display Name from a module set up for this purpose if required.
For more indepth information about numbered lists: https://help.anaplan.com/anapedia/Content/Modeling/Build%20Models/Lists_and_Roll-ups/Numbered_Lists/Working_with_Numbered_Lists.html?_ga=2.89475099.1648931499.1546961984-837077636.1546961984#Create
Hope this helps!
M
0 -
It would be best to look at using a numbered list for your Customer > Product combination. The Sales Rep could even be a parent of this list in the hierarchy. Numbered lists are used for precisely this situation because you use the list to create only the valid combinations required for Customer > Product. It seems that you could probably create this list or create a process to create this list from Module 1 as seems to have your valid combinations.
It is probably best to do some reading around using Numbered lists but the basic steps for setting one up are:
- Go to Settings > General Lists and create a new list called #Customer - Product then in General lists check the box for numbered list against the new list.
- When you create items in a numbered list the names will appear as #1, #2 etc these are the unique names in the list for each list item.
- You can the choose to give the list item a Display Name which can be the Customer - Product combination. The unique name is still held for the list but the name that is seen is decided by you.
- To Create a Display Name create a new property of the list called Display Name and give make it Text format. After the check box for numbered lists in the General List you can select this to use as your Display Name. You can then add the name you want to appear in the Display Name column in the Grid view of the list.
- Alternatively you could create a module dimensioned by your new numbered list and create the Customer > Product naming convention in there and then link the Display Name in the list properties via a formula.
- If I understand your set up correctly you could probably create an action to populate your numbered list from your Module 1.
Community and Anapedia are both good sources of information on numbered lists, here are some of the useful links:
1 -
Thanks for the suggestion. I already thought about this and I am also using Numbered Lists a lot in other models. The reason why I did not use it here is because in the current setup, module 2 is automatically updated whenever an assignment in module 1 is changed. With your solution, I would have to set up a process that either runs automatically each defined timeframe or give the users the possibility to execute it themselves after they performed the changes. I was hoping to keep this from them. But if this is the only way to keep the module size at an acceptable level, I will have to evaluate this for the next target setting round.
0 -
Maybe you can do a combination of what you already have as well as what @Michelle Sutherland stated by having the Products role up to Customers in a numbered list (which is defined by the porducts the customer actually has) and then dimensionalize that by the assignments. In doing this, you will reduce the sparsity of not having all products intersecting all customers because not every customer will own every product. So, instead of having all 3 lists in the dimension, you will only have the two. This way, you will still have the assignment "reporting" be dynamic, but still lowering the overall size of the model/module.
Hope this helps,
Rob
1 -
Yes, I thought about this as a first step since the product group / customer assignment is quite stable. I think we would then loose the functionality of automatically filtering when selecting a customer or product group from the numbered list or being able to use customer or product group as a page selector. But this might be ok since we could work around this topic.
The bigger driver for the sparsity issue is the relation Customer / Sales Rep. We have 112,000 customers and 900 Reps. Each customer has 1 - 3 Sales Reps assigned only which would lead worst case to 300 k combinations but in the current setup already to more than 99,000 k.
But as written before, reducing the Product Group / Customer combinations will also make an impact and might on its own save us 1/3 of the current cell combinations.
0