We all know that Anaplan allows business users to make fact-based decisions faster than ever before, but we consistently hear from customers that their data quality is not up to par. “Garbage in, garbage out” and “dirty data” are phrases you’ve probably heard. I hate to be so blunt, but data will never be perfect. At Spaulding Ridge, we view data quality not as an end result but as an ongoing process. If you manage your data quality consistently and regularly, you will find that the overall cleanliness of the data improves, allowing you to better understand where gaps exist and to rely confidently on your data to make sound decisions. Here’s the good news: Anaplan allows you to manage data quality without having to spend months fixing source systems! We’ve taken advantage of this to develop processes and techniques in Anaplan to manage data quality, an approach we call Load, Validate, and Correct (LVC).
A key step in reducing data errors is ensuring the data fed to Anaplan is loaded correctly from the start. Follow these steps for each data import:
Use a data hub. Staging data in a data hub lets multiple Anaplan models use the same data and ensures that any data corrections are propagated to all Anaplan models.
Create a unique identifier (UID) list. These keys will dimension your data.
Import data based on code. Populate the code list with the unique keys in your data (e.g., customer IDs, product SKU numbers, cost center codes).
Import list first. Then, import data and properties of that list into a module that is dimensionalized by the UID list.
Ensure data formats are correct (e.g. text, date, number, decimals).
All imports (and processes) should produce all green checkmarks to help ensure there are no discrepancies later. Don’t settle for consistent warnings and errors in your process that you think do not matter. Someday, they might!
Rename or delete actions immediately after creating and running them. This helps keep the actions as clean as possible. Always rename using a consistent naming practice.
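The import hygiene above can be sketched as a simple pre-load check. This is a minimal illustration, not Anaplan functionality: the field names (`customer_id`, `amount`, `date`) and the flat row format are hypothetical stand-ins for whatever your source extract contains.

```python
from datetime import datetime

def validate_rows(rows, uid_field="customer_id"):
    """Flag rows that would fail a clean, all-green import:
    missing or duplicate UIDs and badly formatted values.
    Field names here are hypothetical examples."""
    errors = []
    seen = set()
    for i, row in enumerate(rows, start=1):
        # UID checks: every row needs a unique, non-blank code.
        uid = (row.get(uid_field) or "").strip()
        if not uid:
            errors.append((i, "missing UID"))
        elif uid in seen:
            errors.append((i, "duplicate UID " + uid))
        seen.add(uid)
        # Format checks: 'amount' must parse as a number,
        # 'date' must be ISO-formatted (YYYY-MM-DD).
        try:
            float(row.get("amount", ""))
        except ValueError:
            errors.append((i, "bad number"))
        try:
            datetime.strptime(row.get("date", ""), "%Y-%m-%d")
        except ValueError:
            errors.append((i, "bad date"))
    return errors
```

Running a check like this before each load makes it easier to keep imports at all green checkmarks rather than tolerating recurring warnings.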
After data has been imported into a data hub, how do we go about validating it? And what information can we provide to business users to assist in this important process? Both questions are worth addressing.
A Few Important Methods for Validating Data:
Use exports of flat data to confirm that incoming data to Anaplan is complete and accurate.
Exception reports are a helpful form of documentation. They help business users understand how and where incomplete data will impact calculations within the model, and how much it limits the usefulness of the final results.
Data validation dashboards help you monitor data and proactively find issues. Build dashboards that let users explore the data, with ranges and graphical representations that make gaps easy to spot.
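As a sketch of the exception-report idea, the snippet below groups record codes by which required field is blank, producing the kind of summary a business user could review. The field names (`region`, `product`, `code`) are hypothetical examples, not anything prescribed by Anaplan.

```python
def exception_report(records, required=("region", "product")):
    """For each required field, list the record codes where that
    field is blank. Returns only fields that have exceptions.
    Field names here are hypothetical examples."""
    report = {field: [] for field in required}
    for rec in records:
        for field in required:
            if not (rec.get(field) or "").strip():
                report[field].append(rec.get("code", "<no code>"))
    # Keep only fields that actually have missing values.
    return {f: codes for f, codes in report.items() if codes}
```

A report like this tells users exactly which records to chase down, rather than leaving them to discover gaps when a calculation looks wrong.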
What are Best Practices When Approaching the Above Methods for Data Validations?
Business users should always be in control of auditing and validating data. Anaplan model builders should develop the dashboards and processes to manage quality, but ultimately, the business users need to review and validate the data they are using to make decisions.
Treat data quality as a user story in any new implementations or major builds. Dedicate time to review data quality and develop data-quality processes.
Confirming that the data within a data hub and model ties out to previously reported data is critical to the success of any build and implementation. An early focus on data validation shapes how thorough the final data is and how effectively end users can work with the integrated tool. End data, including information displayed on dashboards as well as exported data, is only as useful as the original data is accurate, so place a focus on validation from day one.
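The tie-out step can be expressed as a simple reconciliation between loaded totals and previously reported figures. This is a minimal sketch under assumed inputs (totals keyed by period, a small rounding tolerance); real tie-outs would run at whatever grain your reporting requires.

```python
def tie_out(hub_totals, reported_totals, tolerance=0.01):
    """Compare totals loaded into the data hub against previously
    reported figures; return the keys that do not tie out, with
    the (loaded, reported) pair for each mismatch."""
    mismatches = {}
    for key, reported in reported_totals.items():
        loaded = hub_totals.get(key, 0.0)
        if abs(loaded - reported) > tolerance:
            mismatches[key] = (loaded, reported)
    return mismatches
```

An empty result means the model ties out; anything else is a concrete list of discrepancies to investigate before go-live.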
One of the beauties of Anaplan is that you don't have to go back to source systems to correct data. Anaplan shines a light on data and exposes data-quality issues. Correcting the data should include the following techniques:
Create override line items where appropriate so users can override incorrect data values. Track these overrides in modules to understand the volume of overrides and reasons for them.
For consistent data issues, either create new modules (to map/correct other data) or build ETL processes in your data integration processes to correct data before it’s loaded (e.g., add leading 0’s, remove extraneous special characters).
Create a backlog of data quality issues to share back with owners of source systems. Have a monthly or quarterly meeting to review data quality issues and understand any changes in source systems to correct these issues or any risks of new issues as those systems are changed.
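For the consistent, mechanical issues mentioned above, a correction step in your integration pipeline might look like the following. The specific rules (strip special characters, pad to a six-character code) are hypothetical examples of the kind of normalization a source system might require.

```python
import re

def clean_code(raw, width=6):
    """Normalize a source-system code before load: remove
    extraneous special characters, then pad with leading zeros.
    The 6-character width is a hypothetical example."""
    # Keep only letters and digits.
    stripped = re.sub(r"[^0-9A-Za-z]", "", raw or "")
    # Restore leading zeros that spreadsheets often drop.
    return stripped.zfill(width)
```

Applying fixes like this in the ETL layer keeps the correction in one place, so every model fed by the data hub sees the same cleaned codes.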
It is never too late to start building data-quality management processes into your Anaplan solutions. Remember LVC (Load, Validate, Correct), and data quality issues will not slow you down.
For more information or to discuss your organization’s current challenges around data quality, contact us at firstname.lastname@example.org.
Alyssa Fraelick, Associate, Spaulding Ridge, LLC. Alyssa Fraelick is an experienced Anaplan model builder. A lover of data, Alyssa strives to provide the most valuable solutions to her clients to help bring them to the forefront of Connected Planning.