AMA: Predictive Analytics in Connected Planning
Do you want to know more about how predictive analytics can make Connected Planning easier? Our latest Ask Me Anything session with Deloitte's Nick Vandesype is NOW open! Post your questions!
Anaplan is a perfect enabler to make predictive analytics digestible and consumable. The real value of predictive analytics comes when you can integrate it with a planning solution. Hear more from Nick (@nickvandesype), Lead Predictive & Algorithmic Forecasting in Switzerland, on how predictive analytics can make Connected Planning easier.
How to participate in the Ask Me Anything segment:
1. Watch Nick's Ask Me Anything video on predictive analytics (video goes LIVE on Monday, April 27 at 8 a.m. CST).
3. Post your questions in the comment section below the video.
4. Nick will be checking the page all week and answering your questions! The AMA will close at the end of the week on Friday, May 1 at 5 p.m. CST.
Comments

Nick,
Thanks for taking the time to do the AMA. Predictive Analytics is an exciting space and Anaplan is certainly a platform that complements and enables the work that Analytics teams are doing within organizations. I'm curious as to how you've seen businesses approach the integration of analytic insights into the planning process within Anaplan, and what challenges you've seen.
Anaplan brings process design much closer to the business user than any software I've seen in the space previously. It allows many folks who have traditionally spent most of their time on number crunching and data gymnastics to actually participate in a creative process of system design. This can create real competitive advantages.
With that said, there's often a gap between the technology (analytics) and business process value. Without good design, the analytics get lost. And without good analytics, the decision-making is less informed. Are there any particular best practices you've seen that ensure analytics are translated effectively into the planning process? With legacy planning systems, UX design was controlled by the software vendor. With Anaplan, businesses are given flexibility and ownership over design. How can organizations take advantage of this to get the most out of their analytics?
Thank you for taking the time to facilitate an AMA. Predictive analytics is something really important to me, as I'm a big Anaplan advocate for the retail industry.
One area that constantly comes up for retail is predicting stock outs.
In your AMA introduction you mentioned that people invest a lot of time with the statistical modeling (perhaps in Anaplan) and sometimes miss the real advantage of Anaplan by using it as a collaborative planning tool. I agree with this assessment but that still leaves us with having to calculate a forecast.
Two part question:
 Do you believe that we should keep our statistical modeling simple enough that it can be calculated in Anaplan, or should we do our statistical modeling outside of Anaplan when the complexity is better suited to a programming language like R, Python or KNIME?
 If you advocate statistical modeling outside of Anaplan, what has been your experience with data integration? Any suggestions on how to architect Anaplan so the users can manage the automation themselves (i.e., regenerate the forecasts) and do scenario planning?
Thank you @nickvandesype! Oh, and a shout out to @YelenaKibasova for planning this event!
Hey @nickvandesype,
Thanks for hosting this AMA, such a great topic! Quick question for you, which types of predictive analytics do you think are best suited to be moved into Anaplan? Not just integrated into Connected Planning models, but actually migrated to being entirely calculated within Anaplan?
Thanks!
Thank you @nickvandesype for doing this AMA.
If I wanted to go ahead and implement some "simple" predictive analytics, how would I go about it?
 What technical know-how do I need?
 Do I need programming skills?
 Can we understand the drivers of the prediction?
It would be good if you could talk about your experience implementing that with some of your customers.
What are the alternative and preferred methods for "fuzzy matching" disjointed sets of data?
As I understand it, Predictive Analytics involves a blending of internal data (such as financial sales / price / margin, customer satisfaction, CRM prospects and pipeline) with third party data (public financial statements, descriptive data sets, demographics, etc.). This requires some means of determining when two items in separate systems are the same thing. That requires some degree of inference ... similar name, same postal code, same industry = 90% chance of match. How is that best accomplished?
Hi @timothybrennan ,
thanks for your question. You are making some very fair points. What we typically see is that companies have already onboarded data science teams over the last few years, which means most of the time the analytics know-how is already in-house. At the desk next to the data scientist sits the planner/forecaster (from sales, finance, supply, HR, ...). The planner regularly asks: 'Hey data scientist, do you know the future better than I do?', and the data scientist says: 'No, but my machines can give you the most objective view on the future you can have, by looking at the past'. 'Cool, can you send me this objective view?' asks the planner. The planner then receives an Excel file with some numbers by category, product or SKU.
Analytics here are not embedded in the process, and the two teams are not aligned. The planner receives a manually generated forecast, a black box with no clarity on how the data scientist came up with these numbers. He uses it as a starting point but changes the numbers in the input field because he believes, for one reason or another, that they are over- or underestimated.
You are right, Anaplan enables process design, and this also applies to the analytics process. In an ideal world, accuracy is measured (in Anaplan), there is a daily, weekly, or monthly interface with an analytical construct (in Python, R, C++, ...), and new insights are updated regularly in Anaplan as a baseline, an (objective) starting point. This baseline should include internal and external data to maximize its accuracy. Afterwards, planners can make changes, but changes (up or down) should be recorded and a reason should be added. At the end of the process, you look at the 'forecast value added' (the extra accuracy the planner added by changing the baseline). If this is consistently a positive number (which means the planner adds accuracy to the forecast), you should understand which data or insights this planner has that your machine does not include, and start including those factors.
That's how we have very successfully embedded analytics in planning use cases.
1. provide a transparent baseline (and integrate this with an automated feed from your analytics engine)
2. include internal and external data
3. track changes and understand what the forecast value added is
4. optimize the algorithm
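To make step 3 concrete, here is a minimal sketch of how 'forecast value added' could be computed. It assumes accuracy is measured with MAPE (the thread does not name a metric), and the function names and sample numbers are made up for illustration.

```python
from statistics import mean


def mape(actuals, forecasts):
    """Mean absolute percentage error. Assumes no actual value is zero."""
    return mean(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts))


def forecast_value_added(actuals, baseline, adjusted):
    """Error of the machine baseline minus error of the planner-adjusted forecast.

    A positive result means the planner's overrides made the forecast more accurate.
    """
    return mape(actuals, baseline) - mape(actuals, adjusted)


# Made-up numbers for three periods.
actuals = [100, 120, 80]
baseline = [90, 110, 100]   # statistical baseline loaded into the planning model
adjusted = [95, 118, 90]    # baseline after the planner's overrides

fva = forecast_value_added(actuals, baseline, adjusted)
print(f"Forecast value added: {fva:.1%}")
```

If the result stays positive period after period, that is the signal described above: the planner knows something the algorithm does not, and that factor is a candidate for inclusion in the model.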
Hi @JaredDolich
Thanks for your question.
On the first point: the statistical models in Anaplan are only single regression models. From a statistical point of view, you need a more complex optimizer program to really be able to include multiple regression models. The difference between the two is simple: single regression models can take only one explanatory variable into account, as in y = ax + b (+ e), where the single explanatory variable is the actual sales in this case. Multiple regression models (the more traditional regression models used in econometrics) can include several explanatory factors, as in y = a·x1 + b·x2 + c·x3 + d (+ e). The world is too complex to model with single regression models, so I personally try to avoid these simple regression models. They can work if you have very seasonal, linearly growing products, but how often is that the case? Besides that, we also know that 85% of business performance is explained by external data, so excluding external data from predictive solutions is not ideal. So yes, I would go for a Python or R solution.
On the second point: there are different possibilities, which are described on the community page in more detail: REST APIs, Anaplan Connect, or ETL solutions (like Informatica) can help to integrate your Python script with Anaplan. Note that you always need a place, a server, to 'execute' the algorithm. In 95% of the projects I do, there is no need for a self-service update of the predictive model. This means that scheduling the import and the update of the algorithms overnight (you can program this easily) is sufficient. For the other 5%, we create a URL that is linked to the server: when you click on the URL, the process (export data from Anaplan, run the algorithm and send the forecast back to Anaplan) is triggered.
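As a rough sketch, the export, run and send-back loop could look like the following. Everything here is a hypothetical stand-in: `export_actuals` and `import_forecast` represent whatever integration you use (REST API, Anaplan Connect or an ETL tool) and make no real Anaplan calls, and the moving average is only a placeholder for the actual algorithm.

```python
def export_actuals():
    """Hypothetical stand-in for exporting history from the planning model."""
    return {
        "SKU-1": [10.0, 12.0, 14.0, 16.0],
        "SKU-2": [5.0, 5.0, 6.0, 8.0],
    }


def run_algorithm(history, window=3):
    """Placeholder algorithm: average of the last `window` observations per item."""
    return {
        sku: sum(values[-window:]) / len(values[-window:])
        for sku, values in history.items()
    }


def import_forecast(forecast, target):
    """Hypothetical stand-in for loading the forecast back into the planning model."""
    target.update(forecast)
    return len(forecast)


def refresh_forecast(target):
    """One full cycle: export data, run the algorithm, send the forecast back."""
    history = export_actuals()
    forecast = run_algorithm(history)
    return import_forecast(forecast, target)


planning_model = {}  # in-memory stand-in for the target module
rows_loaded = refresh_forecast(planning_model)
print(rows_loaded, planning_model)
```

A nightly scheduler on the server would call `refresh_forecast` for the 95% case; for the on-demand case, the same function would sit behind the URL the planner clicks.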
Hope this helps, if not, let me know!
Hi Chris,
Cool to have a question from you! Thanks for this!
I believe you're speaking about the different algorithms when you speak about types of predictive analytics? From my point of view the answer is clear: the most used time-series models make sense to start with. We can use predictive analytics for all kinds of use cases in Anaplan. In HR planning, churn prediction is very popular; in a commercial area, the optimal price you can set for a certain customer/contract is very interesting to understand. But the most common use case is what we call 'time-series' predictive analytics. Here there is a 'time' element that plays a role (e.g. it's important that the model recognizes that Feb comes after Jan and 2019 comes before 2020). These models can be used across different metrics (e.g. volume, prices, sales, margin, cost, EBIT, trucks, syringes, FTEs, ...). The best-performing one is impossible to define, as it's different for every data set, but having the 5-10 most used ones natively integrated in Anaplan (like the optimizer) would make sense. I am thinking about: ARIMA, ARIMAX, multiple regressions, vector autoregression, time-series gradient boosting, LSTM, recurrent neural networks, ...
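To illustrate why the 'time' element matters, here is a minimal sketch of the simplest time-series model, a first-order autoregression, fitted with ordinary least squares in plain Python. It is far simpler than the models listed above, and the series is made up so that each value is exactly 2 + 0.5 times the previous one.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = intercept + phi * y[t-1]."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    intercept = mean_curr - phi * mean_prev
    return intercept, phi


def forecast_next(series, steps=1):
    """Roll the fitted model forward `steps` periods from the last observation."""
    intercept, phi = fit_ar1(series)
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = intercept + phi * last
        forecasts.append(last)
    return forecasts


# Made-up monthly series generated by y[t] = 2 + 0.5 * y[t-1].
monthly = [10.0, 7.0, 5.5, 4.75, 4.375]

intercept, phi = fit_ar1(monthly)
rev_intercept, rev_phi = fit_ar1(list(reversed(monthly)))
next_value = forecast_next(monthly)[0]
print(intercept, phi, next_value)
print(rev_intercept, rev_phi)
```

Fitting the same numbers in reversed order yields a different model, which is the point: the ordering of the periods is part of the information the algorithm uses.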
Hope this helps!
Hi @fabien.junod ,
great to hear you are eager to get started with an easy predictive model.
You can get started very easily, but if you want to grow, some basic knowledge of a coding language (Python, R, ...) will be helpful. So yes, some technical know-how would be needed, ideally in one of the two most used data science languages.
Understanding the drivers is definitely possible, and that's where a good model can be differentiated from a bad one. The easiest example in predictive analytics is the multiple regression model.
Say you try to predict volumes (y) and you have four data points you want to use: 1) actual volume (a), 2) promotions (b), 3) pipeline/order book (c) and 4) the GDP of the country you operate in (d). If you create a multiple regression model you get something like: y = av + bw + cx + dz + error term. The drivers' sensitivity is then explained by the coefficients v, w, x and z. Imagine that x is 0.5: if your 'c' or order book value increases by 1, your volumes (y) will increase by 0.5 (1 times 0.5).
This easily understandable approach applies to all predictive analytics models, so build it the right way and you will understand which drivers are the important ones.
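The order book example can be checked with a few lines of Python. This is only a sketch: the coefficient values other than the 0.5 for the order book, and all the driver values, are made up for illustration rather than fitted on real data.

```python
# Hypothetical coefficients of a fitted multiple regression model.
# Only the order book coefficient (0.5) comes from the example above.
COEFFICIENTS = {
    "actual_volume": 0.6,
    "promotions": 1.2,
    "order_book": 0.5,
    "gdp": 0.1,
}


def predict_volume(drivers, coefficients=COEFFICIENTS):
    """Predicted volume: sum of each driver value times its coefficient."""
    return sum(coefficients[name] * value for name, value in drivers.items())


base = {"actual_volume": 100.0, "promotions": 10.0, "order_book": 40.0, "gdp": 200.0}

# Raise the order book by one unit and keep every other driver unchanged.
bumped = dict(base, order_book=base["order_book"] + 1)

uplift = predict_volume(bumped) - predict_volume(base)
print(f"Volume uplift from +1 order book: {uplift:.2f}")
```

Repeating the same one-unit bump for each driver in turn gives a quick sensitivity table, which is often the most digestible way to show planners what the model reacts to.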
Feel free to clarify your question if this answer would not be enough.
thanks,
Hi @hendersonmj ,
thanks for your question, good point! One of the most important topics (and, by the way, the one data scientists spend most of their time on) is cleansing data. Once you start merging data from different data sources, you see this issue more than when you stay within one data set. It depends a bit on the magnitude of the data and on how complex your model is, but part of the script of the algorithm is exactly about this process. There are also packages in R and Python which can help you with such data cleaning. I always tell my team: a data format that is not aligned between data sources is not an issue; it's a different story when the format is not consistent within a source. Data scientists will write rules to merge, transform, split, ... data to be able to match formats. Every time the algorithm is triggered, the computer goes through this recipe and performs the same cleaning rules. It would be too good to be true if an algorithm were only about data mining and pattern recognition.
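As an illustration of such matching rules, here is a minimal sketch that uses only Python's standard library (`difflib.SequenceMatcher`) and combines name similarity, postal code and industry into one score, in the spirit of the question. The weights, the 0.8 threshold, the suffix list and the sample records are assumptions made up for this example; the score is a heuristic, not a calibrated match probability.

```python
from difflib import SequenceMatcher

# Assumed list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "corp", "gmbh", "ag"}


def normalize(name):
    """Lowercase, strip punctuation and drop common legal suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(w for w in cleaned.split() if w not in LEGAL_SUFFIXES)


def match_score(a, b):
    """Weighted score between 0 and 1; the weights are illustrative assumptions."""
    name_sim = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    same_postal = a["postal_code"] == b["postal_code"]
    same_industry = a["industry"] == b["industry"]
    return 0.6 * name_sim + 0.25 * same_postal + 0.15 * same_industry


def best_match(record, candidates, threshold=0.8):
    """Return the highest-scoring candidate, or None if it is below the threshold."""
    scored = [(match_score(record, c), c) for c in candidates]
    score, candidate = max(scored, key=lambda pair: pair[0])
    return candidate if score >= threshold else None


crm_record = {"name": "Acme Retail Group Ltd", "postal_code": "8001", "industry": "retail"}
third_party = [
    {"id": "A", "name": "ACME Retail Grp", "postal_code": "8001", "industry": "retail"},
    {"id": "B", "name": "Apex Holdings", "postal_code": "3000", "industry": "banking"},
]

matched = best_match(crm_record, third_party)
print(matched["id"] if matched else "no confident match")
```

Because the same rules run every time the algorithm is triggered, borderline pairs just under the threshold are worth routing to a person for review rather than silently dropping.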
I hope this helps!
Hi @nickvandesype,
Thanks for hosting this interesting session.
Would you be able to describe a few examples of predictive analytics applied to a commercial use case?
If possible, could you also confirm the recommended approach/options for implementing it in an existing large and complex project?
I am a big fan of small POCs, would this sort of approach work? Is there a free prebuilt template that I could refer to?
My understanding is that it would be a challenge for someone with no previous experience/skills in this area to be able to explore/start delivering something. Particularly if the time to invest is very limited.
Is it correct? Any particular tips?
Thanks a lot,
Alessio
Sorry for the delayed answer.
In commercial use, predictive analytics can be applied to different use cases. Take territory and quota management: with more accurate, data-based predictions, you can set better quotas and assign them to the different territories in a more insightful way. But of course, algorithms are also helpful in sales forecasting and pricing use cases. I did a few projects where predictive analytics was used to understand the dynamic price elasticity of products in different regions for different customer segments. By knowing this, you can drive more effective promotions.
There is no single approach I can recommend. Every project is different and every client infrastructure is different, so it is hard to come up with a recommended approach. But starting with a PoC is definitely a good start. This is what I also do to give people an idea of how well such an approach can work. There are many online courses you can follow to get started with it. What is sure is that being a Master Anaplanner does not mean you know anything about predictive analytics. You need a different skill set, a different toolbox to understand and a different way of thinking. I believe that our role as Master Anaplanners is to collaborate very closely with the data scientists in our organization and to help them make their insights (via predictive analytics) more digestible for our planners, our end users.
Hi Nick
Great to have AMA for Predictive Analytics. I really can't wait to start using it in Anaplan!
Just to expand a little on Chris's question: do you think it makes sense to also enable some sort of scripting language inside Predictive Analytics in Anaplan? I refer to the options that are available, for example, in tools like SAS Enterprise Miner or RapidMiner, where you can not only use the predefined blocks with algorithms that are enabled by default but also create a custom algorithm using well-known programming languages like Python or R.
I'm asking because data science is an art, and playing around with all the attributes of the algorithm is inevitable and crucial if you want to achieve the best predictions. On the other hand, I don't know how much detail is too much for business analysts, who may be more interested in live forecasting, since refitting the model to the data usually takes a lot of time...
Looking forward to reading your opinion.