Simple and Effective Way to Present Forecast Evaluation Metrics
Predictive forecasting tools like PlanIQ can create incredible opportunities for businesses. If we want the outputs of a forecast model to create value, it's critical that we evaluate the model and present information about its performance in an elegant way that is easy for planners to understand.
I spent some time putting together a simple dashboard that is both information-rich and easy to consume.
I've attached a video walkthrough along with some thoughts. Let me know what you think!
I understand PlanIQ performs cross-validation on the (historical) Data Collections used to train the model, which is how we get the forecast accuracy metrics once model fitting is done. Since the dashboard shows dynamic metrics based on the evaluation horizon, I am not sure when this analysis can be done.
- Does it mean that we need to hold out some data, exclude it from the Data Collections, and build the MAPE calculation manually in an Anaplan module outside PlanIQ?
- As the forecast results generated by PlanIQ are out-of-sample forecasts, does it mean that we can only do this evaluation-horizon analysis after the actuals are available? For a 6-week evaluation horizon, for example, would we need to wait another 6 weeks for the actuals?
Thanks for the question. My typical approach is to always seed the Anaplan model with a history of out-of-sample forecast results (which is what you see in that dashboard). I have actuals available through the end of Q2 2021, but I ran PlanIQ with only actuals through the end of Q1 2021. This makes it easy for me to compare the forecasts that PlanIQ would have generated at the end of Q1 2021 to what actually happened in Q2 2021.
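To make the holdout idea concrete, here is a minimal Python sketch of the same comparison done outside Anaplan. Everything in it is an assumption for illustration: the monthly values, the forecast numbers, and the names (`actuals_by_month`, `forecast_by_month`, `mape`) are made up and are not PlanIQ outputs or PlanIQ APIs. It only shows the arithmetic of cutting history at the end of Q1 2021 and scoring the Q2 2021 forecasts against the held-out actuals.

```python
from datetime import date


def mape(actuals, forecasts):
    """Mean absolute percentage error in percent; periods with zero actuals are skipped."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actuals to evaluate")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Hypothetical monthly actuals through the end of Q2 2021 (illustrative values only).
actuals_by_month = {
    date(2021, 1, 31): 90.0,
    date(2021, 2, 28): 95.0,
    date(2021, 3, 31): 105.0,
    date(2021, 4, 30): 100.0,
    date(2021, 5, 31): 200.0,
    date(2021, 6, 30): 400.0,
}

# Only data up to the cutoff would be loaded into the forecasting tool.
cutoff = date(2021, 3, 31)
training = {d: v for d, v in actuals_by_month.items() if d <= cutoff}
holdout = {d: v for d, v in actuals_by_month.items() if d > cutoff}

# Hypothetical forecasts produced from the training data alone.
forecast_by_month = {
    date(2021, 4, 30): 110.0,
    date(2021, 5, 31): 180.0,
    date(2021, 6, 30): 400.0,
}

# Score the forecasts against the actuals that were held back.
periods = sorted(holdout)
q2_mape = mape(
    [holdout[d] for d in periods],
    [forecast_by_month[d] for d in periods],
)
print(f"Q2 2021 MAPE: {q2_mape:.2f}%")
```

Because the Q2 actuals already exist, there is no need to wait for the evaluation horizon to elapse; the wait only applies to forecasts made from the most recent data.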
I typically run this backtesting exercise across many different forecast periods to capture a history of forecasts, so we can start to see how consistently accurate a model has been over time.
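Repeating the exercise over several cutoffs can be sketched as a rolling-origin backtest. This is an illustration under stated assumptions, not how PlanIQ works internally: `naive_forecast` is a hypothetical stand-in for whatever forecast run you would do at each cutoff, and the series values are invented.

```python
def naive_forecast(history, horizon):
    """Hypothetical placeholder model: repeat the last observed value."""
    return [history[-1]] * horizon


def backtest(series, origins, horizon, forecast_fn=naive_forecast):
    """Return {origin: MAPE in percent} for each cutoff with a full holdout window."""
    results = {}
    for origin in origins:
        history = series[:origin]                  # data visible at the cutoff
        actuals = series[origin:origin + horizon]  # what happened afterwards
        if not history or len(actuals) < horizon:
            continue  # not enough data to train on or to score against
        forecasts = forecast_fn(history, horizon)
        ape = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts) if a != 0]
        if ape:
            results[origin] = 100.0 * sum(ape) / len(ape)
    return results


# Invented series; origins are index positions where history is cut off.
series = [100.0, 100.0, 110.0, 121.0, 121.0, 110.0]
results = backtest(series, origins=[2, 4], horizon=2)

for origin, score in sorted(results.items()):
    print(f"cutoff after period {origin}: MAPE {score:.2f}%")
```

Comparing the per-cutoff scores side by side is what shows whether a model is consistently accurate or only did well in one particular period.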
In the future I'll share some additional ideas about calculating evaluation metrics across different forecast periods (lags) as well.