How I Built It: Waterfall calculation

edited May 2023 in Blog

We come across the need for waterfall calculations more often than we think! I’ve personally created this at least four times in different contexts, and I’ve helped colleagues who were trying to achieve something where a waterfall calculation would work perfectly well.

First, let’s clarify what I mean by “waterfall calculation” — as I have seen this with many names. It is not related to waterfall charts or reconciling figures. What we are talking about is spreading figures across time for different cohorts; I call it a waterfall due to the shape that the data forms — in a diagonal line (which you will see in the video).

For my example in this article, let’s work with a simple case: client attrition. The company will have a process to determine their expectation of new client acquisition every month, but how do we calculate how many active clients we will have every month? This will be a combination of existing clients that haven’t churned, as well as new clients acquired, and the retention expectation based on their aging.

We start with a standard attrition curve for clients. The business stakeholders could provide this information and input into Anaplan. It would look something like this:

This shows the general expected behavior from new clients. You start with 100% of the clients you onboarded within that month, and you expect that 80% will continue in business next month, while 20% will churn. Then in the following month, only 75% of those original clients remain, and another 5% churn, and so forth. For this example, we are looking at a range of 12 months of expected attrition.

Separately, you also have a forecast of how many new clients you expect to onboard each month. Let’s start with only three months of data:

The waterfall calculation would enable you to calculate how many active clients you will have each month, considering all new onboarded clients from the three cohorts (acquired in January, February, and March).

How did I build it?

Now, this only covers the new clients acquired. Your use case may allow you to take a simple approach of a standard churn estimation for your existing client base, and then the waterfall calculation results could be added on top of your base as they come out of this demo.

Often though, you may be required to integrate older cohorts into this calculation. You would want to apply the attrition curve to older cohorts that are mid-way through their client lifecycle. You might also want to leverage the trends of recent cohort behavior to estimate the attrition curve for new cohorts.

Here’s how I built that (taking it one step further):

Of course, this is a simple example, but we can take the same concept and expand it to cover many other use cases. Some other examples I’ve come across:

  • Insurance companies reducing risk factors based on the client’s last insurance activation. A customer who has been accident-free for 24 months would have lower risk factors considered than a customer who activated the insurance 12 months ago.
  • Travel products converting the date of booking into the date of travel. For all the clients that book their travel this month, some will travel within the same month, others will travel next month, and others will travel six months from now. The company will need to bridge the gap between the booking and the travel dates to estimate revenue accurately.
  • Installation of purchased products. A product sold requires a technician to install the product at the client’s property, creating a lag between the moment of purchase and the moment of installation when the revenue will be recognized.

Some use cases may require you to expand this model technically as well. I kept it simple on my demo with a count +0 to +12, and so the number on the item itself tells the formula how much the offset should be. You might be required to set up this “count” dimension differently, for example, by having months in there as well (so you would have July cohort spreading in July, August, September, … instead of +0, +1, +2…). This means you are creating a matrix between real-time and a “fake” time that replaces our basic count list. The same concept applies though, and you will just need to set up a module to create that code dynamically for every fake month.

First, create a line item that identifies the starting fake month of each real month (fake July is count +0 for July, fake August is count +0 for August, etc). Another line item will then add codes for all other fake months (+1, +2, +3…) in reference to the dynamic +0 of the first line item. Everything else should work the same, and you can use these dynamic settings to customize your visualization reports as well.

Leave a comment on this post to share your experiences with this sort of calculation and what you call it (if not waterfall calculation!).