PlanIQ - Unexpected Forecast Drop After Removing Outlier Correction

I’m forecasting monthly demand with a focus on a rolling 6-month horizon (out of an 18-month view), using PlanIQ in Anaplan. My forecast uses 7 models: ARIMA, Prophet, ETS, MVLR, Ensemble, DeepAR, and CNN-QR.

Each month, I perform 6-month backtesting and calculate MAPE for each model, then select the best-fit forecast using 1 - MAPE.
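As a rough illustration of this selection step (a minimal sketch, not PlanIQ's internal logic; the function names and sample numbers are hypothetical), MAPE over a 6-month backtest window and a 1 - MAPE best-fit pick could look like this:

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error as a fraction; periods with zero actuals are skipped."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual is zero")
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def best_fit(actuals, backtests):
    """Score each model as 1 - MAPE and return the highest-scoring model name."""
    scores = {name: 1 - mape(actuals, fc) for name, fc in backtests.items()}
    return max(scores, key=scores.get), scores


# Hypothetical 6-month backtest window for one customer
actuals = [100, 110, 120, 130, 140, 150]
backtests = {
    "ETS": [98, 105, 112, 118, 125, 131],       # lags the upward trend
    "Prophet": [105, 112, 123, 128, 143, 149],  # tracks the trend closely
}
winner, scores = best_fit(actuals, backtests)
print(winner, scores)
```

Because MAPE divides each error by the actual for that month, every month counts equally regardless of volume, which matters when peaks are restored in the history.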

Until last month, I used an outlier correction method that automatically capped values using upper and lower bounds. However, this often cut off real business growth signals. So, I:

* Removed all outlier correction from the past 7 years of historical actuals
* Recreated the PlanIQ data collections and models using this new history
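To show how bound-based capping can cut off a genuine growth signal, here is a minimal sketch. The post does not say how the upper and lower bounds were derived, so the IQR fences, the `k=1.5` multiplier, and the sample history below are all assumptions for illustration.

```python
import statistics


def cap_outliers(values, k=1.5):
    """Clamp values to [Q1 - k*IQR, Q3 + k*IQR], with quartiles taken over the full history."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    capped = [min(max(v, lower), upper) for v in values]
    return capped, lower, upper


# Hypothetical history: 18 stable months followed by 4 months of real growth
stable = [100, 102, 98, 101, 99, 103, 100, 102, 97,
          104, 100, 101, 99, 102, 98, 103, 100, 101]
growth = [130, 150, 170, 190]
history = stable + growth

capped, lower, upper = cap_outliers(history)
# The four growth months are all clamped to the same upper bound,
# so the capped series shows a flat step instead of a rising trend.
print(upper, capped[-4:])
```

With bounds computed from a long, mostly stable history, recent growth looks like a run of outliers, which is the behaviour that motivated removing the correction.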

After doing this, I expected higher forecasts, since peaks are now kept.
But instead, I’m seeing:

* Lower forecasts
* In some cases, flat or even decreasing trends
* Even though the business is growing in certain regions and customers, and we’ve seen strong seasonal growth recently

Some ideas I considered:

* Models may be slow to recognize new patterns and will catch up after a few months
* Changing the historical data changed the statistical properties (mean, variance, skewness), which affected model behavior
* PlanIQ might have retained some internal state based on the previous (truncated) data, causing inconsistency
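The second idea can be checked directly on any single series by comparing summary statistics before and after capping. A minimal sketch, assuming a hypothetical fixed cap of 120 and made-up monthly values:

```python
import statistics


def describe(values):
    """Return the mean, population standard deviation, and population skewness."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    skew = sum((v - mean) ** 3 for v in values) / len(values) / sd ** 3
    return {"mean": mean, "stdev": sd, "skew": skew}


# Hypothetical year of actuals with two genuine peaks
raw = [100, 105, 98, 180, 102, 99, 101, 220, 97, 103, 100, 104]
capped = [min(v, 120) for v in raw]  # assumed upper bound of 120

raw_stats = describe(raw)
capped_stats = describe(capped)
print("raw:   ", raw_stats)
print("capped:", capped_stats)
```

Restoring the peaks raises both the mean and the spread of the series, so any model step that depends on level or variance sees different inputs than it did with the capped history.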

Additional context:

* No single model dominates; best fit varies across customers and regions
* I have a manual override process for model selection, but it’s time-consuming and hard to scale
* I don’t segment customers by behavior (e.g., growing vs. steady-state)
* We recreate data collections and forecast models each time we run the forecast

My questions:

* When I change historical data and recreate the collections, does PlanIQ retrain models from scratch on the new data?
* Is there a way to tune model flexibility (e.g., to prevent underfitting or excessive smoothing) in PlanIQ?
* Why would removing outlier correction (i.e., restoring peaks) result in lower forecasts, even when using MAPE-weighted model selection?
* Is the 1-MAPE formula causing this unexpected behaviour? Should I use 1/MAPE? I am interested in accuracy for big customers.
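On the last question, a small sketch may help separate two cases. If the score is only used to pick a single winner, 1 - MAPE and 1/MAPE give the same ranking, since both decrease as MAPE increases. They differ only when the scores are used as blending weights. The function names and MAPE values below are hypothetical, and clipping negative scores to zero is an assumption, not something PlanIQ is known to do.

```python
def weights_one_minus(mapes):
    """Blend weights proportional to 1 - MAPE; negative scores (MAPE > 1) are clipped to 0."""
    raw = {m: max(0.0, 1 - e) for m, e in mapes.items()}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("every model has MAPE >= 1, so no weight can be assigned")
    return {m: w / total for m, w in raw.items()}


def weights_inverse(mapes, eps=1e-9):
    """Blend weights proportional to 1 / MAPE; eps guards against division by zero."""
    raw = {m: 1 / max(e, eps) for m, e in mapes.items()}
    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}


# Hypothetical backtest MAPEs (as fractions) for one customer
mapes = {"ETS": 0.10, "Prophet": 0.20, "ARIMA": 0.40}

w1 = weights_one_minus(mapes)
w2 = weights_inverse(mapes)
print("1 - MAPE weights:", w1)
print("1 / MAPE weights:", w2)
```

Both schemes rank ETS first here, but 1/MAPE puts noticeably more weight on the most accurate model, while 1 - MAPE spreads weight more evenly and goes negative once MAPE exceeds 100%.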

Would love to hear if anyone has experienced something similar in PlanIQ or has thoughts on how to better handle this situation.

All comments

taranjassi

Hi Seyma,

"After doing this, I expected higher forecasts, since peaks are now kept.
But instead, I’m seeing:

Lower forecasts
In some cases, flat or even decreasing trends
Even though the business is growing in certain regions and customers, and we’ve seen strong seasonal growth recently"

I'm curious to know whether this happened for all forecasts, or only a select few of the ones you normally generate predictions with?

My thoughts are that due to including the outliers, increases and decreases are inflated, and therefore the trends and seasonality have fundamentally changed in the historical actuals. This could cause the weights associated with particular algorithms to change, or become more negative. As a result, PlanIQ could infer a more significant 'drop' due to the inclusion of outliers and generate a prediction which flat-lines or is lower than expected.

I know this can be the case in LSTM models and generally in forecasting, where outliers or 'demand spikes' can distort normalisation of values when preparing data for forecasts, resulting in 'noisier' than normal predictions.
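The normalisation effect can be sketched with a plain z-score. This is only an illustration of the general mechanism: whether and how PlanIQ's algorithms scale their inputs is not stated in this thread, and the series below are made up.

```python
import statistics


def zscore(values):
    """Standardise a series to zero mean and unit (population) standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


# Same steady upward trend, with and without one restored demand spike
capped = [100, 104, 108, 112, 116, 120, 124, 128]
spiky = [100, 104, 108, 300, 116, 120, 124, 128]


def normalised_growth(values):
    """First-to-last change measured in standard-deviation units."""
    z = zscore(values)
    return z[-1] - z[0]


growth_capped = normalised_growth(capped)
growth_spiky = normalised_growth(spiky)
print(growth_capped, growth_spiky)
```

The underlying growth from 100 to 128 is identical in both series, but the single spike inflates the standard deviation so much that the same growth looks far smaller after scaling, which is one way a model could end up producing a flatter forecast.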

Keen to hear thoughts from others on this and if anyone's had similar experiences!

seymatas1

Hi @taranjassi,

Thank you for your response.

"I'm curious to know whether this happened for all forecasts, or only a select few of the ones you normally generate predictions with?"

It happened for almost all big customers and regions. The clearer the upward trend, the bigger the drop relative to actuals.

"I know this can be the case in LSTM models"

DeepAR, however, is not the most commonly selected best-fit model. It’s one of the least selected for large customers and regions.

"Is the 1-MAPE formula causing this unexpected behaviour?"

I found an answer to this question.

This needs to be evaluated algorithm by algorithm, since we use 7 algorithms and plan to add 3 more.

I also have a scheduled call with the PlanIQ team. I hope they can provide more clarity on this.

Seyma 😊🌷
