Module sparsity leading to size and performance impact; looking for an alternative approach.
I am having issues with modules I created to analyze the planning results. Due to the high sparsity of the data model, I get a lot of cell combinations, most of them empty, which leads to a large memory footprint and poor performance. I will explain the model setup that causes the issue, hoping someone knows how to enable the analysis without this performance impact.
The model is used to assign Sales Representatives to Customers and Product Groups in the form of:
Customer A, Product Group 1 --> Sales Rep X
Customer A, Product Group 2 --> Sales Rep Y
Customer B, Product Group 1 --> Sales Rep X
Customer B, Product Group 2 --> Sales Rep Z
and so on.
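Conceptually, this assignment data is just a sparse mapping keyed by (Customer, Product Group); one entry exists per pair that is actually assigned. A minimal Python sketch of the idea (illustrative names, not Anaplan syntax):

```python
# Hypothetical illustration: the assignment is a sparse mapping with
# one entry per (customer, product group) pair that actually exists,
# not a dense cross product of all lists.
assignments = {
    ("Customer A", "Product Group 1"): "Sales Rep X",
    ("Customer A", "Product Group 2"): "Sales Rep Y",
    ("Customer B", "Product Group 1"): "Sales Rep X",
    ("Customer B", "Product Group 2"): "Sales Rep Z",
}
```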
Module 1 for the planning therefore applies to the lists Customer and Product Group and has Sales Rep as a line item.
So far, so good. Now, after the assignments have been made, we want to give the planners the ability to show the workload per Sales Rep in the form of:
Sales Rep X, Customer A, Product Group 1 --> Sales Value, Count
Sales Rep X, Customer B, Product Group 1 --> Sales Value, Count
Sales Rep Y, Customer A, Product Group 2 --> Sales Value, Count
Sales Rep Z, Customer B, Product Group 2 --> Sales Value, Count
and so on.
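This workload view is the inverse grouping of the same sparse mapping: only the assigned combinations ever need to appear. A hedged Python sketch of what the analysis computes (the sales values are made up for illustration):

```python
from collections import defaultdict

# Same illustrative sparse assignment mapping as in the example above.
assignments = {
    ("Customer A", "Product Group 1"): "Sales Rep X",
    ("Customer A", "Product Group 2"): "Sales Rep Y",
    ("Customer B", "Product Group 1"): "Sales Rep X",
    ("Customer B", "Product Group 2"): "Sales Rep Z",
}
# Hypothetical invoiced values per (customer, product group) pair.
sales = {
    ("Customer A", "Product Group 1"): 100.0,
    ("Customer A", "Product Group 2"): 250.0,
    ("Customer B", "Product Group 1"): 80.0,
    ("Customer B", "Product Group 2"): 40.0,
}

# Group by rep: only the four assigned combinations are touched,
# never the full Rep x Customer x Product Group cross product.
workload = defaultdict(lambda: {"value": 0.0, "count": 0})
for pair, rep in assignments.items():
    workload[rep]["value"] += sales[pair]
    workload[rep]["count"] += 1
```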
To enable this analysis, we had to create a Module 2 which applies to the lists Sales Rep, Customer, and Product Group, and has the invoiced value as a line item plus a Count line item (a flag indicating whether this combination is assigned in Module 1). We then filtered the output of this module to all entries with a Count greater than 0.
However, this leads to Module 2 generating a lot of empty combinations, because of course not all Reps serve all customers and not all customers order all product groups.
In the example above, we have the following line counts:
Module 1: 2 Customers * 2 Products = 4 Cells
Module 2: 2 Customers * 2 Products * 2 Sales Reps = 8 Cells
Now in our case we have a lot of customers and Reps, as well as several more product groups, leading to the following numbers:
Module 1: 74 million cells
Module 2: 3,700 million cells
The model is now at about 5 billion cells in total, consuming about 20 GB, mostly due to the highly sparse Module 2 described above. This is dangerously close to Anaplan's internal limits and also hurts performance.
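The blow-up follows directly from the cross product: dividing the stated cell counts (3,700 million / 74 million) implies a Sales Rep list of about 50 members, and since each (Customer, Product Group) pair is assigned to exactly one rep, at most 1/50 of Module 2's cells can ever be non-empty. A quick check of that arithmetic:

```python
# Cell counts taken from the post; rep count is derived, not stated.
module1_cells = 74_000_000               # Customers x Product Groups
reps = 3_700_000_000 // module1_cells    # implied size of the Sales Rep list
module2_cells = module1_cells * reps

# Each (customer, product group) pair has exactly one assigned rep,
# so the maximum fill rate of Module 2 is 1 / reps.
max_fill_rate = 1 / reps
```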
I currently do not see a way to enable this analysis without a module that crosses all my lists and thereby produces high sparsity. Therefore I am reaching out to the community for alternative ideas.
Thanks for any suggestions!