Do you want to know more about how predictive analytics can make Connected Planning easier? Our latest Ask Me Anything session with Deloitte's Nick Vandesype is NOW open! Post your questions!
Anaplan is a perfect enabler to make predictive analytics digestible and consumable. The real value of predictive analytics comes when you can integrate it with a planning solution. Hear more from Nick (@nickvandesype), Lead Predictive & Algorithmic Forecasting in Switzerland, on how predictive analytics can make Connected Planning easier.
How to participate in the Ask Me Anything segment:
1. Watch Nick's Ask Me Anything video on predictive analytics (video goes LIVE on Monday, April 27 at 8 a.m. CST).
2. Post your questions in the comment section below the video.
3. Nick will be checking the page all week and answering your questions! The AMA will close at the end of the week on Friday, May 1 at 5 p.m. CST.
Re: AMA: Predictive Analytics in Connected Planning
Thanks for taking the time to do the AMA. Predictive Analytics is an exciting space and Anaplan is certainly a platform that complements and enables the work that Analytics teams are doing within organizations. I'm curious as to how you've seen businesses approach the integration of analytic insights into the planning process within Anaplan, and what challenges you've seen.
Anaplan brings process design much closer to the business user than any software I've seen in the space previously. It is allowing many folks who have traditionally spent most of their time on number crunching and data gymnastics, to actually participate in a creative process of system design. This can create real competitive advantages.
With that said, there's often a gap between the technology (analytics) and business process value. Without good design, the analytics get lost. And without good analytics, the decision-making is less informed. Are there any particular best practices you've seen that ensure Analytics are translated effectively into the planning process? With legacy planning systems, UX design was controlled by the software vendor. With Anaplan, businesses are given flexibility and ownership over design. How can organizations take advantage of this to get the most out of their analytics?
Thanks for your question. You are making some very fair points. What we typically see is that companies have already onboarded data science teams over the last few years, which means the analytics know-how is usually already in-house. At the desk next to the data scientist sits the planner/forecaster (from sales, finance, supply, HR, ...). The planner regularly asks: 'Hey data scientist, do you know the future better than I do?', to which the data scientist replies: 'No, but my machines can give you the most objective view of the future you can have, by looking at the past.' 'Cool, can you send me this objective view?' asks the planner. The planner then receives an Excel file with some numbers by category, product, or SKU.
Analytics here are not embedded in the process, and the two teams are not aligned. The planner receives a manually generated forecast, a black box with no clarity on how the data scientist came up with these numbers. He uses it as a starting point but changes the numbers in the input field because he believes, for one reason or another, that they are over- or underestimated.
You are right: Anaplan enables process design, and this also applies to the analytics process. In an ideal world, accuracy is measured (in Anaplan), a daily, weekly, or monthly interface connects to an analytical construct (in Python, R, C++, ...), and new insights are regularly loaded into Anaplan as a baseline, an (objective) starting point. This baseline should include internal and external data to maximize its accuracy. Afterwards, planners can make changes, but every change (up or down) should be recorded with a reason attached. At the end of the process, you look at the 'forecast value added' (the extra accuracy the planner contributed by changing the baseline). If this is consistently positive (which means the planner adds accuracy to the forecast), you should work out which data or insights this planner has that your machine does not include, and start including those factors.
That's how we have very successfully embedded analytics in planning use cases:
1. provide a transparent baseline (and integrate this with an automated feed from your analytics engine)
2. include internal and external data
3. track changes and understand what the forecast value added is
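The 'forecast value added' check in step 3 can be sketched in a few lines. A minimal sketch, assuming invented monthly numbers and MAPE as the accuracy metric (the metric and all names here are my own illustrative choices, not a prescribed Anaplan setup):

```python
# Forecast value added (FVA): did the planner's overrides beat the
# statistical baseline? All figures below are hypothetical.

def mape(actuals, forecasts):
    """Mean absolute percentage error across periods (actuals must be nonzero)."""
    errors = [abs(a - f) / a for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)

def forecast_value_added(actuals, baseline, planner):
    """Positive FVA means the planner's changes improved accuracy."""
    return mape(actuals, baseline) - mape(actuals, planner)

actuals  = [100, 120, 130, 110]   # realized volumes
baseline = [ 95, 125, 120, 115]   # statistical baseline forecast
planner  = [102, 118, 128, 112]   # forecast after planner overrides

fva = forecast_value_added(actuals, baseline, planner)
print(f"FVA: {fva:.3f}")  # positive -> planner added accuracy
```

In a real setup the three series would come from Anaplan line items, with the reason codes attached to each override.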
Thank you for taking the time to facilitate an AMA. Predictive analytics is something really important to me, as I'm a big Anaplan advocate for the retail industry.
One area that constantly comes up for retail is predicting stock outs.
In your AMA introduction you mentioned that people invest a lot of time in statistical modeling (perhaps in Anaplan) and sometimes miss the real advantage of Anaplan as a collaborative planning tool. I agree with this assessment, but that still leaves us with having to calculate a forecast.
Two part question:
Do you believe we should keep our statistical modeling simple enough that it can be calculated in Anaplan, or should we do our statistical modeling outside of Anaplan when the complexity is better suited to a programming language like R, Python, or KNIME?
If you advocate statistical modeling outside of Anaplan, what has been your experience with data integration? Any suggestions on how to architect Anaplan so the user can manage the automation themselves (i.e., regenerate the forecasts) for scenario planning?
On the first point: the statistical models in Anaplan are only single regression models; from a statistical point of view, you need a more complex optimization program to be able to include multiple regression models. The difference between the two is simple: a single regression model can only take one explanatory variable into account, as in y = ax + b (+ e) (the actual sales in this case), while a multiple regression model (the more traditional regression model used in econometrics) can include several explanatory factors, as in y = a·x1 + b·x2 + c·x3 + d (+ e). The world is too complex to model with single regression models, so I personally try to avoid them. They can work if you have very seasonal, linearly growing products - but how often is that the case? Besides that, we also know that 85% of business performance is explained by external data - so excluding external data from predictive solutions is not ideal. So yes, I would go for a Python or R solution.
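The single- vs multiple-regression contrast above can be sketched with NumPy's least-squares solver. All the data and variable names here are invented for illustration; a real model would be fitted on actual internal and external series:

```python
# Single regression (one explanatory variable) vs multiple regression
# (several explanatory factors), both via ordinary least squares.
import numpy as np

# Hypothetical monthly observations
volume     = np.array([100.0, 112.0, 125.0, 140.0, 151.0, 166.0])
promotions = np.array([  0.0,   1.0,   0.0,   2.0,   1.0,   2.0])
gdp_index  = np.array([ 1.00,  1.01,  1.03,  1.04,  1.06,  1.08])
t          = np.arange(len(volume), dtype=float)  # time index

# Single regression: y = a*t + b
X_single = np.column_stack([t, np.ones_like(t)])
coef_single, *_ = np.linalg.lstsq(X_single, volume, rcond=None)

# Multiple regression: y = a*t + b*promotions + c*gdp + d
X_multi = np.column_stack([t, promotions, gdp_index, np.ones_like(t)])
coef_multi, *_ = np.linalg.lstsq(X_multi, volume, rcond=None)

print("single-regression coefficients:", coef_single)
print("multiple-regression coefficients:", coef_multi)
```

The multiple-regression fit is where external factors like the GDP index enter the model.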
On the second point: there are different possibilities, described in more detail on the Community page: REST APIs, Anaplan Connect, or ETL solutions (like Informatica) can help integrate your Python script with Anaplan. Note that you always need a place, a server, to execute the algorithm. In 95% of the projects I do, there is no need for a 'self-service' update of the predictive model. This means that scheduling the import and update of algorithms overnight (which you can program easily) is sufficient. For the other 5%, we create a URL that is linked to the server; when you click the URL, the process (export data from Anaplan, run the algorithm, and send the forecast back to Anaplan) is triggered.
Thanks for hosting this AMA, such a great topic! Quick question for you, which types of predictive analytics do you think are best suited to be moved into Anaplan? Not just integrated into Connected Planning models, but actually migrated to being entirely calculated within Anaplan?
Cool to have a question from you! Thanks for this!
I believe you mean the different algorithms when you say types of predictive analytics? From my point of view, the most-used time series models make sense to start with. We can use predictive analytics for all kinds of use cases in Anaplan. In HR planning, churn prediction is very popular; in the commercial area, the optimal price you can set for a certain customer/contract is very interesting to understand. But the most common use case is what we call 'time series' predictive analytics. Here a 'time' element plays a role (e.g. it's important that the model recognizes that February comes after January and 2019 comes before 2020). These models can be used across different metrics (e.g. volume, prices, sales, margin, cost, EBIT, trucks, syringes, FTEs, ...). The best-performing one is impossible to define, as it differs for every data set, but having the 5-10 most-used ones natively integrated in Anaplan (like the optimizer) would make sense. I am thinking of: ARIMA, ARIMAX, multiple regression, vector autoregression, time-series gradient boosting, LSTMs, recurrent neural networks, ...
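To make the 'time element' concrete, here is a minimal sketch of one of the simplest time series models, Holt's linear trend method (double exponential smoothing). The smoothing parameters and sales figures are invented for illustration; a production model would typically use a library such as statsmodels:

```python
# Holt's linear trend: the model explicitly carries a level and a trend
# forward through ordered periods, so February really does come after January.

def holt_forecast(series, alpha=0.5, beta=0.3, horizon=3):
    """Return `horizon` forecasts continuing the level + trend of `series`."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]

monthly_sales = [100, 104, 109, 113, 118, 122]  # hypothetical
print(holt_forecast(monthly_sales))  # three future periods
```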
Great to hear you are eager to get started with an easy predictive model.
You can get started very easily, but if you want to grow, some basic knowledge of a coding language (Python, R, ...) will be helpful. So yes, some technical know-how would be needed, ideally in one of the two most-used data science languages.
Understanding the drivers is definitely possible, but that's where a good model is differentiated from a bad one. To give the easiest example in predictive analytics, imagine a multiple regression model.
You are trying to predict volumes (y) and you have four data points you want to use: 1) actual volume (a), 2) promotions (b), 3) pipeline/order book (c), and 4) the GDP of the country you operate in (d). If you create a multiple regression model you get something like: y = v·a + w·b + x·c + z·d + standard error. The drivers' sensitivity is then explained by the coefficients v, w, x, and z. Imagine that x is 0.5: if your 'c', the order book value, increases by 1, your volumes (y) will increase by 0.5 (1 times 0.5).
This easily understandable approach applies to all predictive analytics models - so build it the right way, and you will understand which drivers matter.
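The coefficient reading can be checked numerically. A minimal sketch that generates hypothetical driver data, fits the regression by least squares, and recovers the order-book sensitivity (every number below is invented for illustration):

```python
# Fit y = v*a + w*b + x*c + z*d + e on synthetic data and read each
# coefficient as "change in y per unit change in that driver".
import numpy as np

rng = np.random.default_rng(0)
n = 200
a = rng.normal(100, 10, n)               # actual volume (lagged)
b = rng.integers(0, 3, n).astype(float)  # promotions
c = rng.normal(50, 5, n)                 # pipeline / order book
d = rng.normal(1.0, 0.05, n)             # GDP index

# True relationship with order-book sensitivity x = 0.5, plus noise
y = 0.8 * a + 3.0 * b + 0.5 * c + 20.0 * d + rng.normal(0, 1, n)

X = np.column_stack([a, b, c, d])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
v, w, x, z = coefs
print(f"order-book sensitivity x ≈ {x:.2f}")  # recovered close to 0.5
```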
Feel free to clarify your question if this answer is not enough.
What are the alternative and preferred methods for "fuzzy matching" disjointed sets of data?
As I understand it, Predictive Analytics involves a blending of internal data -- such as financial sales / price / margin, customer satisfaction, CRM prospects and pipeline -- with third party data -- public financial statements, descriptive data sets, demographics, etc. This requires some means of determining when two items in separate systems are the same thing. That requires some degree of inference ... similar name, same postal code, same industry = 90% chance of match. How is that best accomplished?
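The rule-plus-similarity scoring described in the question ("similar name, same postal code, same industry = 90% chance of match") can be sketched with the standard library alone. The field weights and record layout below are my own illustrative assumptions, not a recommended method:

```python
# Fuzzy matching of an internal CRM record against a third-party record:
# blend a name-similarity score with exact matches on structured fields.
from difflib import SequenceMatcher

def name_similarity(a, b):
    """0..1 similarity between two normalized name strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def match_score(internal, external):
    """Weighted blend: 60% name similarity, 20% postal code, 20% industry."""
    score = 0.6 * name_similarity(internal["name"], external["name"])
    score += 0.2 * (internal["postal_code"] == external["postal_code"])
    score += 0.2 * (internal["industry"] == external["industry"])
    return score

crm_record  = {"name": "Acme Corp.",       "postal_code": "8001", "industry": "Retail"}
third_party = {"name": "ACME Corporation", "postal_code": "8001", "industry": "Retail"}

score = match_score(crm_record, third_party)
print(f"match probability ≈ {score:.2f}")
```

In practice the weights and the acceptance threshold would be tuned against a manually labeled sample of known matches.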