Performance Comparison - OFFSET/LAG/PREVIOUS/LOOKUP

Hi all,

 

Was looking to get some input on a calculation scenario as community pages were a bit vague (or I'm using the wrong keywords).

 

Now, I'm sure most of us understand how Anaplan thrives on being multi-threaded where possible, but when it comes to performing time-based calculations, there's a variety of ways to do the same thing.

 

For example, when attempting to retrieve a value from a previous period, you can use,

 

OFFSET(Source Line, -1)

LAG(Source Line, 1, 0)

PREVIOUS(Source Line)

Source Line[LOOKUP:Previous Month Lookup]

 

I'm conscious that in some cases, particularly when the line item is a circular reference (i.e. referencing a previous period of itself), the LOOKUP option isn't possible.

 

My understanding is that with exception of the LOOKUP method, all other calculations were serial-type calculations which would infer significantly slower performance when used repeatedly (versus LOOKUP), but I was curious whether there's any actual bench-marking into how efficient (or rather, inefficient) each version is.

 

Reason for my question is that our team is currently undertaking a review of our models/model performance and are attempting to optimise where possible.

 

Cheers,

L

Best Answer

Answers

  • Hello @luke_e 

     

    I have heard PREVIOUS is better in performance compared to other time functions. I think you can try to stick to time functions if it is on seeded TIME dimension and ignore LOOKUP.

     

    Thanks

    Arun

  • @luke_e 

    It's a good question and from a performance perspective, I will find out.

    Before I do that though, I will share some thoughts

     

    The PLANS standard has some core principles, Simple is better than complex being one of them.  Also although the P stands for Performance, the L and A also stand for Logical and Auditable

     

    So, for me, what is the most logical answer to the requirement "I need the previous period".  Answer PREVIOUS()

    It is much easier to understand what the formula is doing and it says so!!

     

    OFFSET and LAG are very flexible and can deliver the same result, as you say, but why over complicate things.

     

    Also, As @ArunManickam mentioned, I suspect LOOKUP will be the worst performing; again why make the engine work hard.  Using a lookup for each period is likely to mean a lot more work as potentially the value could be any time period, whereas PREVIOUS = the Prior period.

     

    Anyway, I will run some tests and post the results here

    David

  • Thanks for the responses,

     

    Definitely an advocate of making things simpler, but in our current situation, we're erring more on the side of performance as our current model needs a bit of tuning.

     

    I'm also surprised that LOOKUP would be the worst performer. 

     

    My understanding is that using the PREVIOUS function (or the alternatives) effectively predicates a dependency on the previous cell, meaning it wouldn't be able to process a calculation in parallel, whereas the LOOKUP would.

     

    Also, I imagine Anaplan can interpret/detect which cells are dirty resulting only in the impacted cells being updated when using a LOOKUP, versus the entire PREVIOUS string being re-looped (if that makes sense).

     

    @DavidSmith, if you're able to shed some light once you've had a chance to test, it'd be greatly appreciated.

     

    Cheers,

    L

  • That is some impressive turn around time.

     

    In the context of your comparison, it's actually quite alarming the difference in calculation time between PREVIOUS and LOOKUP by the time we're hitting 5 years, particularly as our model has a 15 year timescale.

     

    From our perspective, I think our next logical step would be to export our Line Items and review the number of instances where we've applied the LOOKUP variant, amend, compare two models (pre & post change) and go from there.

     

    Depending on how we go, I'll see if I can provide some before/after benchmarks to determine (at least anecdotally) what sort of improvement others could expect by making similar changes.

     

    Thank you again @DavidSmith; was exactly the analysis I was after.

     

    Cheers,

    L

  • @luke_e 

    No problem, Part of the evolution of PLANS and the Planual is to give definitive guidance where we can!!

     

    Once, we start hitting long timescales, the performance implications do start to kick in!!.  As @ArunManickam  alluded to, this is another example of where PREVIOUS is efficient compared to CUMULATE for timescales when you start going beyond 1-2 years; but also can depend on the inter-relationships, so often not quite "Black and White"

     

    As mentioned, previously, there are many ways we can model in Anaplan and my favour saying is "Just because you can, doesn't mean you should" - It is a core principle of PLANS.

     

    So my advice to all modellers is look at the problem, ask yourself "what am I trying to achieve" and then look for the the most logical solution

     

    LOOKUP, OFFSET and LAG all have functionality designed for different purposes and are very flexible.  They can also provide the same result in different ways.  So the important thing is to find the simplest and logical solution; it is often the most performant too!!

     

    Happy modelling

    David

     

  • Anyone knows how using previous would stack? In case we need to look at 3-5 periods back, would previous then still be the most efficient? So previous(previous(previous(data)))?

    Would this still perform better than example offset(data, 3)?

    I would argue that it would be more auditable to use offset, as it can quickly become problematic to sit and count previous, but how would the performance turn out? Does anyone have experience with this?

    Mobo