Performance Comparison - Direct SUM vs Line Item Subset
Hi all,
Was looking to get some community feedback on two different SUM approaches and which one would be more efficient from a model engine point of view.
The scenario,
I'm looking to SUM line items into a central P&L by account. The source module lines do not have account as a list, so we'll be using something similar to [SUM: Mapping Module.Account] to map the line items to an account.
I will stress that this is more for examples where there's potentially a dozen or more lines to SUM.
The options,
Option 1. Directly SUM line item from source module to central P&L, e.g.
- Line 1[SUM: Mapping Module.Account] + Line 2[SUM: Mapping Module.Acount] + Line 3[SUM: Mapping Module.Account] + Line 4[SUM: Mapping Module.Account] .. etc
Option 2. Create an LIS with the relevant lines, setup a COLLECT() & mapping module using the LIS and then SUM into the central P&L using a single function, e.g.
- COLLECT Source[SUM: Collect Module Mapping.Account]
The outcome,
Option 1 is the most simplistic approach as the SUM is direct from source to target, but will result in potentially 10+ SUM parameters in the logic as there's so many line items to include.
Option 2 is a bit more complicated to setup initially, requiring an LIS and two additional modules (one to use COLLECT() AND one to map each line to an account), but results in a much cleaner calculation outcome once it's included in the central P&L (as well as arguably better auditability and flexibility).
Whilst we can debate the different pillars of PLANS, I'm looking more to the P for this particular query. Looking forward to hearing your thoughts on the different approaches (or alternatives if appropriate).
Cheers,
L
Best Answer
-
Thank you for this post and question…Short answer is absolutely Option 2 as HyperBlock actually works less but faster as there is only "one" sum formula getting the data from the collect() statement vs. Option 1 there are several sum formulas.
Funny thing, I just went over this exact scenario in the ACE training in San Diego a couple of weeks ago in a before and after scenario.
in "your" Option 1, you would have something similar to this
where the vast majority of "detailed" line items have the same formula:
The much better way is your Option 2 where the same line items are now:
Remember, the Collect() and Calc module, keep your sums OFF so there will be less calculations and aggregations. This reporting module should have SUMs turned on for the aggregations.
The Calc module is defined as the following with the LISS in the rows.
Hope this helps,
Rob
2
Answers
-
Hi @luke_e,
I think you also need a LIS with the 1st option to build the mapping module, right? But, just that you would replace the COLLECT() by some SUM functions.
I would say that the 2nd option is more powerful (almost always better when avoiding complex and long formulae), more sustainable (the performance would be even worst if you add line items in the source module) and easier to audit indeed. You can visit this article around LIS, the section "Transformation" is a similar case as the one you are facing.
Hope this helps!
0 -
@BenjaminNiel Thanks for the response, I'll have a squiz at the link. I can see a lot of use cases for an LIS but a lot of our models were built out prior to LIS functionality being introduced, so it's just a case of sussing out the best ones.
In my option 1, we didn't need an LIS as we had a mapping module with each account as a line item (with a format of list:account), and we'd just do a SUM using the relevant lines.
@rob_marshall Your example was pretty much right on with what I'm dealing with, albeit mine is much worse as far as the number of [SUM: X] parametres in each line, hah.
I guess the other benefit with option 2 is that there's no calculation change required if additional line items became in scope for the central P&L; we'd just need to tick a box, deploy and map the account.
Appreciate the detailed response and glad to know the COLLECT() avenue is the better approach.
Cheers,
L
1