Calculation Waves

Hi Team,

 

I'm hoping to find further information on Calculation Waves with regard to model performance and best practice.

 

I have some models where performance is not optimal, and I think this may be partly due to excess calculation waves. Is there any information on what effect waves have on performance?

 

As an example, when you are staging data across a number of modules, what impact does adding one more module to stage the data have?

 

Thanks

 

Mark

Best Answer

  • DavidSmith
    Answer ✓

    Echoing the above, when building models it is best to write formulae in the most logical way.

    Take a simple example:

     

    Revenue = Price * Sales Volume

    COGS = Cost Price * Sales Volume

    Profit = Revenue - COGS

    Margin % = Profit / Revenue * 100

    So there are 3 "waves" to the calculation: Revenue and COGS first, then Profit, then Margin %.
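    As a rough illustration of how the wave count falls out of the dependencies, here is a minimal Python sketch. It assumes a "wave" is simply the length of the longest chain of formulas leading to a line item; the function name `calculation_waves` is hypothetical, and this is not how the Anaplan engine is implemented.

```python
def calculation_waves(deps):
    """Map each calculated line item to its wave number.

    Assumption for illustration: inputs (items with no formula) sit at wave 0,
    and a formula's wave is 1 + the highest wave among its references.
    """
    waves = {}

    def wave_of(item):
        if item not in deps:      # plain input such as Price
            return 0
        if item not in waves:     # compute once, then reuse
            waves[item] = 1 + max(wave_of(ref) for ref in deps[item])
        return waves[item]

    for item in deps:
        wave_of(item)
    return waves


# The four formulas from the example, written as "item: what it references".
line_items = {
    "Revenue": ["Price", "Sales Volume"],
    "COGS": ["Cost Price", "Sales Volume"],
    "Profit": ["Revenue", "COGS"],
    "Margin %": ["Profit", "Revenue"],
}

waves = calculation_waves(line_items)
for item, wave in waves.items():
    print(f"wave {wave}: {item}")
```

    Revenue and COGS land in the same wave because neither references the other, which is why four formulas need only three waves.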

    2020-08-17_12-36-52.png

    This is logical when looking at the formulae, and it follows best practice to split formulae up into separate line items.

    However, there are occasions when one of the line items might cause a blockage and hold up downstream calculations. This often happens when many elements of a formula are combined, especially when formulae are used as parameters in other functions.

    We are able to trace these blockages in our lab, so if you think there is a specific problem, contact your Business Partner, who can help arrange this.

    So, taking the example above, if we needed to, we could re-engineer the formula as follows to reduce the number of waves:

    2020-08-17_12-36-52.png

     

    This might reduce the calculation time and unblock the dependencies. However, it is not always clear-cut, so, as mentioned at the outset, I would not advise trying to pre-empt this when building formulae. @MarkTurkenburg, don't overthink it. The engine is complex, and it will split tasks and utilise the processing power in the most efficient way. The Planual is written to work with the engine as much as possible, not against it, so using best practice should, for the most part, lead to good performance.

    I hope this helps

    David

Answers

  • @MarkTurkenburg 

    You ask some of the best questions. By calculation waves, I assume you're referring to the D.A.G., or Directed Acyclic Graph (aka the Hyperblock). @DavidSmith gives us a glimpse into this engine in his truth about sparsity article; start with Myth #2. This, to me, has been a bit of a controversy because I feel calculation optimization conflicts with the PLANS methodology at times. You have the ability to use D.A.G. multi-processing, but you have to write your formulas so they aren't dependent on each other.

    For example:

    • Formula 1: C = A + B
    • Formula 2: D = C + E
    • Formula 3: D = A + B + E

    In this case, Formula 2 is serialized: "C" must calculate before "D" can calculate. Formula 3, however, can calculate at the same time as Formula 1 because it does not depend on "C".

    So by calculation waves, I believe it comes down to the D.A.G. calculating the dependencies and working through the calculations. Try reading @DavidSmith's sparsity article. Hopefully, that will help. Beyond that, we'll have to get David or one of the other pros to jump in.
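    To make the difference concrete, here is a small Python sketch using the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to group items into batches that could run side by side. The helper name `parallel_batches` is made up for this example, and it only illustrates the dependency idea; it says nothing about how the Hyperblock actually schedules work.

```python
from graphlib import TopologicalSorter


def parallel_batches(deps):
    """Group items into batches whose members do not depend on each other."""
    sorter = TopologicalSorter(deps)   # deps maps item -> items it references
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose references are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Formula 1 + Formula 2: D references C, so D has to wait for C.
chained = {"C": ["A", "B"], "D": ["C", "E"]}

# Formula 1 + Formula 3: D references only inputs, so C and D are independent.
flattened = {"C": ["A", "B"], "D": ["A", "B", "E"]}

print(parallel_batches(chained))
print(parallel_batches(flattened))
```

    The first batch in each case is just the inputs A, B and E. After that, the chained version needs two further batches (C, then D), while the flattened version needs only one (C and D together).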

  • Thanks @JaredDolich 

     

    Yes, it's an interesting one, and I might be thinking too deeply about it, but I remember that during 2020 CPX there was a session on Best Practices where they spoke briefly about calculation waves and their effect on performance.

     

    I guess I am trying to increase my understanding of the Hyperblock and best practices to determine whether I should be materially focusing on waves when building a model, or whether function/formula composition, avoiding text, using Booleans and dimensional consistency (referencing modules with the same dimensionality) are more impactful to performance.

     

    Thanks

  • @MarkTurkenburg 

    Yeah, I think you'll get no disagreement that Booleans are the way to go. Using system modules, where you do your calculations once and refer to them ongoing is also a HUGE performance boost. Nested IF statements should be broken up and if possible, always have the most likely outcome of the IF statement first so Anaplan can exit the formula as soon as possible. 
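    The "most likely outcome first" point can be illustrated with a generic Python sketch that counts how many conditions get tested before an IF / ELSE IF chain exits. The status values, their 90/8/2 split and the function names are all invented for the example, and the sketch models early exit in general rather than Anaplan's formula engine.

```python
def classify(value, branches):
    """Walk an IF / ELSE IF chain; return (result, conditions tested)."""
    for tested, (matches, result) in enumerate(branches, start=1):
        if matches(value):
            return result, tested      # exit as soon as a branch matches
    return "Other", len(branches)


def total_tests(values, branches):
    """Total number of conditions evaluated across all cells."""
    return sum(classify(v, branches)[1] for v in values)


# Hypothetical data: 100 cells, most of them "Active".
cells = ["Active"] * 90 + ["Closed"] * 8 + ["Pending"] * 2

rare_first = [
    (lambda v: v == "Pending", "P"),
    (lambda v: v == "Closed", "C"),
    (lambda v: v == "Active", "A"),
]
likely_first = list(reversed(rare_first))  # same branches, most common first

print(total_tests(cells, rare_first))
print(total_tests(cells, likely_first))
```

    With this made-up data, the rare-first ordering tests 288 conditions and the likely-first ordering tests 112, even though both return the same results.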

    To that end, for the staging modules you refer to, double-check that you don't have anything calculating on multiple dimensions that could be calculated on fewer dimensions.

    Lastly, I've only read this, so I don't know whether it really helps, but it sounds like the indexing of the dimensions is based on their order in the "applies to" column. Make sure the modules use the dimensions in the same order in the "applies to" column. So if you have PRODUCT, LOCATION in one module and LOCATION, PRODUCT in another, try to get them to line up the same way. You will have to type them into the "applies to" column manually to get them to line up, though (I learned this the hard way).

    Anyway, just some ramblings of things you probably already know. Hopefully, we can get some of the Hyperblock Pros to weigh in.

  • Agreed with all of those @JaredDolich 

     

    I guess I'm trying to determine whether all of those standard best practice items that you mention have the same impact as reducing calculation waves.

     

    I'm rambling too but just putting the thoughts out there 😁

     

    Mark 

  • @MarkTurkenburg 

     

    Glad that you are asking uncomfortable questions 😀

    This is a very broad question which needs to be answered in multiple steps. @JaredDolich has already covered most of it, but let's look at it with respect to the performance of the model. Here is an article by @Griffink which I call Pure Gold. Go through it and you will learn loads of things.

    https://community.anaplan.com/t5/Blog/Lionpoint-Group-Enhanced-Anaplan-Model-Performance/ba-p/63465

    Feel free to post any further questions.

    Note: Go slowly while reading through the article.

  • Thanks @Misbah!

     

    Funny, but this session by @Griffink was exactly what I was referring to in my original post. I was in the audience for the session and remember being super impressed, but it was also the first place I encountered the 'Waves' concept.

     

    I'll work through it 👍

     

    Mark

    I think what you call a calculation wave is what Jared refers to as the algorithm that opens the model.

    In which case, yes, the "order" and the references have a lot of importance, as stated in the Planual rule: don't daisy chain.

    Recently I worked on optimizing a model with 12K+ line items, and the impact of small calculations can add up in the end, depending on where they are in the calculation chain.

     

  • 12k line items! @nathan_rudman  🤕

     

    I was actually referring to how the model performs in general, e.g. how fast it is to move around, open modules, etc., rather than the initial opening.

  • I would also add that it is our aspiration to surface some form of metrics on line items so you can get some idea of the relative calculation and performance within the model itself.

    David

  • Thanks @DavidSmith. I think the pertinent point there is 'don't overthink it' 😂 As per usual, I think I may have been getting too far into the weeds, but it's good to flush out the issue with the experts.

     

    I appreciate the assistance provided by yourself and everyone else on this thread.

     

    Mark