Predictive Insights (PI) - Use Cases and Model Types

NatalieC
edited June 2023 in Best Practices

Predictive Insights (PI) supports multiple types of account scoring models. Each model type requires different input data depending on the use case. Configure PI by filtering and adjusting the accounts in the Customer and Prospect datasets.

Use this page to understand the most common use cases supported by PI and what input data is required.

Important Notes:

This article provides additional guidance on the recommended input data for your specific use case. It does not change the general technical data requirements that apply to every model.

To build a model, Predictive Insights requires a minimum of 100 unique Customers and 500 unique Prospects. For more details, see the technical data requirements in Anapedia.

When describing “Customers” and “Prospects” in these cases we are referring to the accounts that have purchased or not yet purchased according to the goal of your use case, regardless of whether they may be customers of your business in general. Use the “Is Customer” line item in the Anaplan input data module to code accounts according to the desired outcome.
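As a rough illustration of the minimums above, the following sketch counts unique accounts per “Is Customer” value before data is loaded. It is not part of PI or Anaplan: the record layout and the field names account_id and is_customer are assumptions made for this example only.

```python
# Illustrative only: the record layout and field names are assumptions,
# not the Anaplan input module schema.
MIN_CUSTOMERS = 100   # minimum unique Customers stated in this article
MIN_PROSPECTS = 500   # minimum unique Prospects stated in this article


def count_unique(accounts):
    """Count unique account IDs for each "Is Customer" value."""
    customers = {a["account_id"] for a in accounts if a["is_customer"]}
    prospects = {a["account_id"] for a in accounts if not a["is_customer"]}
    return len(customers), len(prospects)


def meets_minimums(accounts):
    """Return True when both data sets reach the stated minimums."""
    n_customers, n_prospects = count_unique(accounts)
    return n_customers >= MIN_CUSTOMERS and n_prospects >= MIN_PROSPECTS


# Synthetic data: 120 customers (one row duplicated) and 600 prospects.
sample = [{"account_id": f"C{i}", "is_customer": True} for i in range(120)]
sample.append({"account_id": "C0", "is_customer": True})  # duplicate row
sample += [{"account_id": f"P{i}", "is_customer": False} for i in range(600)]

print(count_unique(sample))    # (120, 600): the duplicate is counted once
print(meets_minimums(sample))  # True
```

Counting IDs through a set means duplicate rows do not inflate the totals, which matters because the requirement is stated in terms of unique accounts.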

Customer Acquisition

The most common use case is new customer acquisition, also called “greenfield expansion”. The purpose of this type of model is to score the entire existing non-customer Prospect data set, plus any future list buys or strategic account lists. We recommend building this type of model before expanding to more complex use cases, especially if there is not already a general account scoring model in place.

One consideration is that this type of model may not yield the best results for sub-segments of the Prospect database. If the business has distinct customer segments that buy different types of products, then multiple models or different types of models may provide more accurate results. However, this model type will be applicable for most general account scoring use cases.

Data Requirements: Use the “Is Customer” line item to code accounts as follows.

“Customer Data Set”:

All Customer Accounts that have recently purchased from you.

What is considered “recent” may depend on your business’s sales cycle.

“Prospect Data Set”:

All early-stage prospects, non-customer accounts.

Notes:

Consider excluding churned customers, lost and late-stage opportunities from either list.
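The Customer Acquisition coding above could be prepared along these lines before loading the “Is Customer” line item. This is a minimal sketch under stated assumptions: the status values, the field names, and the 730-day recency window are all hypothetical, and leaving older customers out of both sets is one possible reading of “recently purchased”.

```python
from datetime import date, timedelta

# Assumed values for illustration; tune the window to your sales cycle.
RECENT_WINDOW_DAYS = 730
EXCLUDED_STATUSES = {"churned", "lost_opportunity", "late_stage_opportunity"}


def code_acquisition(accounts, today):
    """Return {account_id: is_customer} for a Customer Acquisition model."""
    cutoff = today - timedelta(days=RECENT_WINDOW_DAYS)
    coded = {}
    for acct in accounts:
        if acct["status"] in EXCLUDED_STATUSES:
            continue  # excluded from both data sets
        last_purchase = acct.get("last_purchase")
        if last_purchase is not None:
            if last_purchase >= cutoff:
                coded[acct["account_id"]] = True  # recent customer
            # Assumption: older customers are left out rather than
            # treated as prospects.
        elif acct["status"] == "early_stage_prospect":
            coded[acct["account_id"]] = False
    return coded


accounts = [
    {"account_id": "A1", "status": "customer", "last_purchase": date(2023, 1, 15)},
    {"account_id": "A2", "status": "customer", "last_purchase": date(2019, 3, 1)},
    {"account_id": "A3", "status": "early_stage_prospect", "last_purchase": None},
    {"account_id": "A4", "status": "churned", "last_purchase": date(2022, 5, 1)},
    {"account_id": "A5", "status": "late_stage_opportunity", "last_purchase": None},
]

coded = code_acquisition(accounts, today=date(2023, 6, 1))
print(coded)  # {'A1': True, 'A3': False}
```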

High Value and Upsell

Another common goal is to target accounts that are most likely to yield the highest annual contract value (ACV) or another metric that corresponds to account value, such as deal size or estimated annual recurring revenue (ARR). Use the most relevant first-party data to identify high-ACV accounts or accounts that have been upsold, for example from an entry tier to a premium tier. If the business does not already have a definition of “High ACV,” we recommend a preliminary analysis, for example looking at the distribution of annual value across current customers. The choice of metric may affect the performance of the model. Examples of potential metrics include estimated customer lifetime value (LTV), ARR, or another metric relevant to your business.

Note that this type of model applies only to generating opportunities from potential high-ACV accounts. Smaller accounts may still be very likely to become customers, but they would score poorly in a High-ACV model.

Data Requirements: Use the “Is Customer” line item to code accounts as follows.

“Customer Data Set”:

High-ACV: High Value Customer Accounts that have recently purchased from you.

Upsell: High Value Customer Accounts that grew from lower value accounts over time.

“Prospect Data Set”:

High-ACV: All early-stage prospects, non-customer accounts.

Upsell: Existing Customer Accounts that are below the high-ACV threshold.

Notes:

Consider excluding churned customers, lost and late-stage opportunities from either list.
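One way to run the preliminary analysis and then code the two data sets is sketched below. The 80th-percentile cut-off, the nearest-rank method, and the field names acv and first_year_acv are assumptions chosen for illustration; your definition of “High ACV” may differ.

```python
import math


def percentile_threshold(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of n)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def code_high_acv(customers, prospects, threshold):
    """High-ACV model: customers at or above the threshold vs. prospects."""
    coded = {c["account_id"]: True for c in customers if c["acv"] >= threshold}
    coded.update({p["account_id"]: False for p in prospects})
    return coded


def code_upsell(customers, threshold):
    """Upsell model: customers that grew into high value vs. those below it."""
    coded = {}
    for c in customers:
        if c["acv"] >= threshold and c["first_year_acv"] < threshold:
            coded[c["account_id"]] = True   # grew from a lower value
        elif c["acv"] < threshold:
            coded[c["account_id"]] = False  # still below the threshold
        # Customers that started at high value are left out of this model.
    return coded


customers = [
    {"account_id": "C1", "acv": 100, "first_year_acv": 100},
    {"account_id": "C2", "acv": 90, "first_year_acv": 20},
    {"account_id": "C3", "acv": 40, "first_year_acv": 40},
    {"account_id": "C4", "acv": 30, "first_year_acv": 10},
    {"account_id": "C5", "acv": 20, "first_year_acv": 20},
]
prospects = [{"account_id": "P1"}, {"account_id": "P2"}]

threshold = percentile_threshold([c["acv"] for c in customers], 80)
print(threshold)                                      # 90
print(code_high_acv(customers, prospects, threshold))
print(code_upsell(customers, threshold))
```

In this sketch, customers below the threshold are left out of the High-ACV model entirely, since they are neither High Value Customer Accounts nor non-customer prospects.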

Product-Specific and Cross-Sell

Often businesses sell multiple types of products and services into multiple customer segments. Therefore, either a product-specific Customer Acquisition model or a Cross-Sell model may be appropriate. Either case requires multiple PI models, one per product that is sold.

For either case, use the most relevant first-party data to identify Customer purchase history. The ideal cross-sell data set requires the identification of a subset of customers that cross-purchased a second product after first purchasing other products. If the historical data does not exist, you can still build a Product-Specific model that should capture the profile of all customers that purchased that product, but it may not be as well calibrated for the cross-sell use case.

Product-specific models are used to score accounts based on how similar they are to customers currently using a product, regardless of whether that was the first purchase or if the account was later cross-sold. If the goal is to sell a product to greenfield accounts, then you should include these non-customer accounts in the input data set. If the goal is instead to sell to existing customers, the input data set should be limited to customer accounts, including those that have not yet purchased the specific product.

As with the High-ACV model, Cross-Sell and Product-Specific models should only be used for the purpose they were built for. These types of models may produce low scores for accounts that are still a good fit for the business overall, just not for the specific product.

Data Requirements: Use the “Is Customer” line item to code accounts as follows.

“Customer Data Set”:

Product-specific: Current customers using a specific product.

Cross-Sell: Current customers that purchased a specific product after purchasing another product, i.e., customers that expanded to the given product.

“Prospect Data Set”:

Product-specific: All current non-customer prospect accounts and/or customers that have not yet bought the given product, solution, or LOB.

Cross-Sell: All current customers that have not yet bought the given product, solution, or LOB.

Notes:

Consider excluding churned or lapsed customers and lost or late-stage opportunities from either list.
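The difference between the Product-Specific and Cross-Sell codings can be sketched from a per-account purchase history. The structure below (a chronological list of products per account) and the product names are hypothetical, used only to show how the same history yields different “Is Customer” values for each model type.

```python
def code_product_specific(purchase_history, product, non_customers=()):
    """Product-specific: any customer using the product is coded True."""
    coded = {acct: product in products
             for acct, products in purchase_history.items()}
    for acct in non_customers:  # optional greenfield prospects
        coded[acct] = False
    return coded


def code_cross_sell(purchase_history, product):
    """Cross-sell: True only if the product was bought after another one."""
    coded = {}
    for acct, products in purchase_history.items():
        if product not in products:
            coded[acct] = False  # customer that has not bought it yet
        elif products.index(product) > 0:
            coded[acct] = True   # expanded to the product later
        # Accounts whose first purchase was this product are left out.
    return coded


# Chronological purchases per customer account (hypothetical names).
history = {
    "A1": ["Planning", "Analytics"],
    "A2": ["Analytics"],
    "A3": ["Planning"],
    "A4": ["Planning", "Reporting"],
}

print(code_product_specific(history, "Analytics", non_customers=["N1"]))
print(code_cross_sell(history, "Analytics"))
```

Account A2 shows the distinction: it counts as a Customer for the Product-Specific model but is omitted from the Cross-Sell model because the product was its first purchase.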

Regional and Other Segmentation

Finally, for any of the cases described above, it may be useful to produce multiple PI estimates across different geographies and segments. This is especially true if different regions or segments have different customer profiles. Similarly, if one region or segment dominates the Customer data set but not the Prospect data set, results may be skewed in regions or segments outside of that primary market. In these cases, your database may contain enough Customer records in each region and segment to produce standalone models.

Any of the above model types could be optimized for a given region or segment by filtering both the Customer and Prospect data sets. As with other model types, as the target Customer base becomes more specific, the generalizability of the model decreases, so the model should only be used to score accounts in the region and sub-segment for which it was designed.

Data Requirements: Use the “Is Customer” line item to code accounts as follows.

Specific to the use case, see model types described above.

Be sure to apply the same region or segment filters to both the Customer and the Prospect Data Set.

Notes:

Consider excluding churned or lapsed customers and lost or late-stage opportunities from either list.

Be sure to exclude accounts outside of the region or segment of interest.
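The sketch below illustrates applying one region filter to both data sets and checking which regions hold enough records for a standalone model. It assumes one row per account, a hypothetical region field, and that the general minimums of 100 Customers and 500 Prospects apply to each standalone model.

```python
from collections import Counter


def filter_segment(accounts, region):
    """Apply the same region filter to Customers and Prospects alike."""
    return [a for a in accounts if a["region"] == region]


def regions_with_enough_data(accounts, min_customers=100, min_prospects=500):
    """List regions whose own counts reach the minimums (one row per account)."""
    customers = Counter(a["region"] for a in accounts if a["is_customer"])
    prospects = Counter(a["region"] for a in accounts if not a["is_customer"])
    return sorted(
        r for r in customers
        if customers[r] >= min_customers and prospects[r] >= min_prospects
    )


def make_rows(region, n_customers, n_prospects):
    """Build synthetic rows for the example."""
    rows = [{"region": region, "is_customer": True} for _ in range(n_customers)]
    rows += [{"region": region, "is_customer": False} for _ in range(n_prospects)]
    return rows


sample = make_rows("EMEA", 150, 700) + make_rows("APAC", 40, 900)

print(regions_with_enough_data(sample))     # ['EMEA']
print(len(filter_segment(sample, "APAC")))  # 940
```

Because the filter runs on the combined list, Customers and Prospects are always restricted by the same rule, and accounts outside the region of interest are excluded from both.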