Planual Explained - Day 3
"Rule 1.06-02": Article 1, Chapter 6, Rule 2: "Don't use subsets on large lists." If a subset would contain more than 75% of the list's items, it is better to create a separate list. Creating subsets on large lists works against the "Performance" element of PLANS.
Here is how it was done in the pre-Planual era: without checking the size of our lists, we created subsets thinking they saved space and helped optimize the model. Little did we know that such a large subset can cause a performance hit while saving no space at all. For example, for List A with 10,000,000 transactions, a subset with 75% occupancy would be created on the assumption that it saved the space of the remaining 25% of list items.
What is wrong with this method? First we need to understand what subsets really are. Subsets are essentially lists within lists. Subset items consume as much space as list items do (500 bytes per item), even if the list or subset is NOT used as a dimension in any module. When a large list with a top level and one subset is used in modules, performance suffers because the system has to aggregate the data not only for the list but also for the subset, and then re-aggregate in every module where the list and subset are used as dimensions. Performance also takes a hit when you add or remove subset items in such lists.
There is also a myth that ALL subsets help with space optimization. That is not true. Here is my analysis.
A list with 10,000,000 items consumes 5,000,000,000 bytes of space, which is roughly 4.7GB. If we add a subset with 75% occupancy of the original list, the subset will have at least 7,500,000 items and will consume an additional 3,750,000,000 bytes, which is roughly 3.5GB. The list that originally consumed 4.7GB now consumes about 8.2GB (4.7GB from the original list and 3.5GB from the subset). Model builders have to make a judicious call on whether the subset can save 3.5GB over the course of model building, which in turn depends on how many times the subset is referenced and on how many intersections. Let's see what happens when this list and/or subset is used as a dimension in a module.
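The arithmetic above can be reproduced with a short Python sketch. It assumes the 500 bytes per item quoted earlier and treats 1GB as 1024^3 bytes, which is the conversion the figures above appear to use; the variable and function names are illustrative only. Summing the unrounded sizes gives a total slightly under the 8.2GB obtained by adding the two rounded figures.

```python
# Sketch of the list/subset sizing arithmetic described above.
# Assumptions: 500 bytes per list or subset item (figure quoted in the text),
# and 1 GB taken as 1024**3 bytes to match the text's rounding.
BYTES_PER_ITEM = 500
GB = 1024 ** 3


def size_in_bytes(item_count, bytes_per_item=BYTES_PER_ITEM):
    """Space consumed by a list or subset with the given number of items."""
    return item_count * bytes_per_item


list_items = 10_000_000
subset_items = list_items * 75 // 100  # 75% occupancy

list_bytes = size_in_bytes(list_items)
subset_bytes = size_in_bytes(subset_items)
total_bytes = list_bytes + subset_bytes

print(f"List:   {list_bytes:,} bytes (~{list_bytes / GB:.1f} GB)")
print(f"Subset: {subset_bytes:,} bytes (~{subset_bytes / GB:.1f} GB)")
print(f"Total:  {total_bytes:,} bytes")
```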
[Table: space used by Line Items 1 to 5, with columns "If List Used", "If Subset Used", and "Diff (In MB)"]
Note: Based on a simple module with a single dimension.
As you can see, using the subset in a module saved 70MB of space across 5 line items. The subset has to save 3.5GB of space to break even, which in turn depends on the number of times the subset is used as a dimension by line items and modules.
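A rough break-even estimate can be sketched in Python. It rests on assumptions that are not in the original analysis: every line item saves the same average amount as the five above (70MB / 5), 1MB is taken as 1024^2 bytes, and each module is a single-dimension module with 5 line items. Real models will differ, so treat the result as an order-of-magnitude guide only.

```python
# Break-even sketch: how many subset-dimensioned line items are needed before
# the subset's own space cost is recovered.
# Assumptions (not from the original analysis): uniform saving per line item,
# 1 MB = 1024**2 bytes, single-dimension modules with 5 line items each.
MB = 1024 ** 2

subset_cost_bytes = 7_500_000 * 500        # space consumed by the subset itself
saving_per_module_bytes = 70 * MB          # 70MB saved across 5 line items
line_items_per_module = 5

saving_per_line_item = saving_per_module_bytes // line_items_per_module

# Ceiling division using integers only, to avoid floating-point rounding.
break_even_line_items = -(-subset_cost_bytes // saving_per_line_item)
break_even_modules = -(-subset_cost_bytes // saving_per_module_bytes)

print(f"Line items needed to break even: {break_even_line_items}")
print(f"Equivalent 5-line-item modules:  {break_even_modules}")
```

Under these assumptions the subset only pays for itself after it has been used as a dimension by a few hundred line items, which is the judgment call model builders have to make.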
Here is how it should be done the Planual way: for large lists, create a separate list altogether instead of a subset.
This has several benefits, such as:
- The system does not have to aggregate data for the list and the subset at the same time in every module that uses them.
- Only one list is impacted by an import.