Performance optimization: Using a Subset of a List vs. Creating a New List
Had an interesting question: Does it take more processing power in Anaplan to load a small subset of a large list than to load an actual list with the same number of items, specifically from a dashboard performance perspective?
For example, if I have a list of 10,000 items and it has a subset that contains 400 of those 10,000 items, would I be better off from a performance perspective just creating a new list with the 400 items? Or should I continue to use the subset? Or does it not matter?
Answers
@otalpur ,
Performance should be the same, as the subset is basically a different list. The real difference in your scenario would be the maintenance of two different lists vs. just the one.
Rob
There are several things to consider here, which pretty much gets us back to the answer "It depends".
In calculations, if you have to get these back to a common dimension, then the subset can do this natively, since it shares a dimension with its parent list, whereas a separate list would have to use some kind of mapping to make the two talk to each other. Subset wins this battle.
I think the performance would be impacted the most by the import that populates the lists. One import that includes a Boolean for the subset would be quicker than populating 2 independent lists, since we know that imports are slower than just calculating data and are blocking actions. Therefore, any time we can eliminate the need for an additional import, that would be best. A subset is basically a separate list that is maintained in the same place as the list itself but functions just like a separate list. Subset wins this battle.
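As a rough illustration of the mapping point, here is a plain-Python analogy (not Anaplan syntax). All names and data in it are hypothetical and only mirror the idea: subset members are parent-list items selected by a flag, while a separate list needs an explicit mapping back to the parent.

```python
# Plain-Python analogy only; this is not Anaplan formula syntax.
# Hypothetical parent list of 10 items, each with a value.
parent_values = {f"Item {i}": i * 10 for i in range(1, 11)}

# Option A: a subset behaves like a Boolean flag on the parent list,
# so its members are the parent items themselves (here: every second item).
in_subset = {item: (i % 2 == 0) for i, item in enumerate(parent_values, start=1)}
subset_values = {item: value for item, value in parent_values.items() if in_subset[item]}

# Option B: a separate list has its own items, so a mapping back to the
# parent list has to be built and maintained before values can be shared.
separate_list = ["S2", "S4", "S6", "S8", "S10"]
mapping_to_parent = {name: f"Item {name[1:]}" for name in separate_list}
separate_values = {name: parent_values[mapping_to_parent[name]] for name in separate_list}
```

Both options end up with the same values, but only Option B carries the extra `mapping_to_parent` structure.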
In terms of size, each subset member still takes the 500 bytes (including the name and code), just like a standard list member would. I think the only difference is that if you have properties on your list, the subset does not need to duplicate them. So if you have 1 property on a list that is 10,000 members long, plus your 400-member subset, your total size would be as follows (I am assuming 1 property that is list-formatted, which would be 4 bytes per cell, as well as the Boolean for the subset, which would be 1 byte per cell):
| | List | Subset | Total | Bytes Per Total | Total Bytes Required (Total x Bytes) |
|---|---|---|---|---|---|
| List Members (Count) | 10000 | 400 | 10400 | 500 | 5200000 |
| Property (Cells) | 10000 | | 10000 | 4 | 40000 |
| Subset (Bool) | 10000 | | 10000 | 1 | 10000 |
| TOTAL | | | | | 5250000 |

The option of using 2 lists would calculate to be this:

| | List 1 | List 2 | Total | Bytes Per Total | Total Bytes Required |
|---|---|---|---|---|---|
| List (Members) | 10000 | 400 | 10400 | 500 | 5200000 |
| Property (Cells) | 10000 | 400 | 10400 | 4 | 41600 |
| TOTAL | | | | | 5241600 |

These come out to be nearly identical in the grand scheme of things, but if you have a ton of properties (which is obviously not a best practice), then having to duplicate all of those properties in the second list would take additional space and slow down the import. This battle has no clear winner, so "It depends", but leaning towards the subset.
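The arithmetic above can be reproduced with a small Python sketch. The byte sizes are the ones assumed in this answer (500 bytes per list or subset member, 4 bytes per list-formatted property cell, 1 byte per Boolean cell); the function names are made up for illustration, and the `properties` parameter assumes every property is list-formatted.

```python
# Byte sizes as assumed in the answer above.
BYTES_PER_MEMBER = 500         # list or subset member, including name and code
BYTES_PER_PROPERTY_CELL = 4    # list-formatted property cell
BYTES_PER_BOOLEAN_CELL = 1     # Boolean cell that flags subset membership


def subset_option_bytes(list_items, subset_items, properties=1):
    """One list plus a subset: properties exist only on the parent list."""
    members = (list_items + subset_items) * BYTES_PER_MEMBER
    property_cells = list_items * properties * BYTES_PER_PROPERTY_CELL
    subset_flag = list_items * BYTES_PER_BOOLEAN_CELL
    return members + property_cells + subset_flag


def two_list_option_bytes(list_items, second_list_items, properties=1):
    """Two independent lists: properties are duplicated on the second list."""
    members = (list_items + second_list_items) * BYTES_PER_MEMBER
    property_cells = (list_items + second_list_items) * properties * BYTES_PER_PROPERTY_CELL
    return members + property_cells


print(subset_option_bytes(10_000, 400))    # 5250000
print(two_list_option_bytes(10_000, 400))  # 5241600
```

Under these same assumptions, raising `properties` shifts the balance: with 6 properties the two-list option is still slightly smaller, and from 7 properties onward the subset option is the smaller one.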
So my recommendation would be to use a subset, as it seems that the import, maintenance, and ease of auditability would be easier for Anaplan and our brains! I think there are a lot of other things that you could bring into play here as well, such as: Is this a one-time import, or are you running hourly integrations, or something in between?
Overall the performance impacts would likely be pretty minimal.