Data Hubs: Purpose and peak performance

You may have heard about a model called a Data Hub, but perhaps you aren’t confident that you understand the fundamentals, primary functions, or considerations when architecting one. There are three main advantages to incorporating a Data Hub:

  1. Single source of truth: Stores all transactional data from the source system.
  2. Data validations: Ensures all data is correct and valid before the data gets to the spoke model(s).
  3. Performance: It is always faster to load data from a model rather than a file.

Additionally, the administrator can ensure the correct granularity of data in the spoke model(s) when using a Data Hub. For example, the source system may only contain transactional data at the daily level, but the planners may need the data aggregated to the month. The Data Hub can summarize the data and export only the data needed.

The following information is designed to further define a Data Hub and support you in your journey of building your own.

Definition of the Data Hub

First, we need to define what a Data Hub is. This can be split into four sections:

  1. Use cases: The Data Hub should be the first model built, whether you have a single use case or multiple use cases. The data should be automatically refreshed from the source system, often an Enterprise Data Warehouse (EDW), on a schedule, whether nightly, weekly, or monthly. All modules and views that create hierarchies or lists should be stored in the Data Hub, which gives your models one version of the truth and reduces the duplication of data.
  2. Model connectivity: Anaplan Connect, one of our third-party vendors (Informatica Cloud, Dell Boomi, Mulesoft, or SnapLogic), or our REST API can be used to automate loading data to the Data Hub from the source system, as well as transferring data from the Data Hub to the spoke model(s). Additionally, transactional data should not be loaded directly into the spoke model, especially if there is a large volume of data.
  3. Functions: Often, simple ETL (Extract, Transform, and Load) functions can be used within your Data Hub to transform the data for the spoke model(s). This is helpful when consolidating data from multiple sources that use different “codes”, where you need a mapping module to ensure the data is mapped correctly.
  4. Team: The Data Hub should be managed by a designated team of experts who understand what data is stored in the Data Hub (to ensure duplication doesn’t happen), as well as how and when the data gets loaded.
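
The mapping idea in item 3 can be illustrated outside Anaplan. The following is a minimal Python sketch, not Anaplan formula syntax; the source system names, the codes, and the SOURCE_TO_HUB_CODE table are all hypothetical and only show how differing source codes might be resolved to one common code, with unmapped records held back for review.

```python
# Hypothetical mapping table: (source system, source code) -> common Data Hub code.
SOURCE_TO_HUB_CODE = {
    ("ERP_A", "CC-100"): "0100",
    ("ERP_B", "100"): "0100",
    ("ERP_B", "200"): "0200",
}


def map_to_hub_code(source_system, source_code):
    """Return the common code, or None when no mapping exists."""
    return SOURCE_TO_HUB_CODE.get((source_system, source_code))


records = [
    {"source": "ERP_A", "code": "CC-100", "amount": 250.0},
    {"source": "ERP_B", "code": "100", "amount": 100.0},
    {"source": "ERP_B", "code": "999", "amount": 75.0},  # no mapping defined
]

mapped, unmapped = [], []
for record in records:
    hub_code = map_to_hub_code(record["source"], record["code"])
    if hub_code is None:
        unmapped.append(record)  # hold back for review instead of passing it on
    else:
        mapped.append({"code": hub_code, "amount": record["amount"]})
```

Keeping the unmapped records in their own collection mirrors the article's point that data should be validated before it reaches the spoke model(s).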

Anaplan architecture with a Data Hub

There are several ways your Anaplan architecture could look, depending on the number of workspaces you currently have and the type of security your company requires. The following are illustrations of common architectures.

Master hub model: across workspaces

Master Hub Model 1.jpg

The most common, and recommended, architecture is when the Data Hub is in its own workspace. Not only does this have the advantage of not interfering with the other models, but it also adds an additional security layer, with a segregation of duties. In this view, the Anaplan Workspace Admin(s) can limit the access to the Data Hub workspace to only the people who require it.

Master hub model: within a workspace

Master Hub Model 2.jpg

The simplest depiction is where your Data Hub is within the same workspace as your spoke models. While this can be accomplished, it is not best practice because there is no segregation of duties and heavy loads from the source system can cause performance issues. Additionally, when adding users, the Anaplan Workspace Administrator (Admin) would need to ensure that spoke model users don’t have access to the Data Hub and that Data Hub users don’t have access to the spoke models.

Multiple Data Hubs


Finally, the Data Hub doesn’t necessarily have to be the only model in the workspace. You can have additional Data Hubs, if needed.

Factors to consider when implementing a Data Hub

There are six main elements to think about when architecting a Data Hub:

  1. User stories
  2. Source systems
  3. Lists
  4. Modules
  5. Data validation
  6. Exporting data to spoke model(s)

User stories

One of the cornerstones of The Anaplan Way is data (process, model, and deployment being the others), which is critical to all implementations. You will need to know what data is needed for a certain use case. Consider the following common data questions that need to be answered in order to be successful:

  • What granularity of the data is needed?
  • How much history is needed? How much history do you have?
  • Does the source system only have transactional data, but the use case needs the data at the month level? Can the source system do the aggregation for you?

After all data questions have been answered, shift your focus to the source system and consider the following:

  • Consider the source system. Where is the data coming from? What is the source system, and is it a trusted environment? Is it Excel? Typically, you should stay away from Excel as the “source” because Excel cannot be audited.
  • Define the data source owners. Who has access to this data? Who is preparing it? Are they part of the project? These are often-overlooked questions that are critical to success. Ideally, the data source owners need to be part of the project from the start to understand the file specifications and prepare the initial load of the data, as well as towards the end of the project to do a final load of the data.
  • Define file specifications. How many files will be needed? Typically, you will need master data, as well as transactional data. Instead of having one file with all of this data, determine if the data can be split between different files (one for transactional, one for the unique members of the master data). It will be better for Anaplan (for performance reasons) to split these to reduce warnings during the data load process.
  • Analyze the data. Understand what makes each record unique (date/period and transactional amounts should not be part of this), and make sure the data owners don’t give you everything (Select * From Employee) when you only need five columns. Remember, it is better to ask for additional columns midway through the project than to get all columns at the beginning and only use a few of them.
  • Consider custom codes in the source system. Find more on this in the transactional lists section. This is a great trick for transactional data. After you have analyzed the data to understand what makes each record/row unique, concatenate the “codes” of the metadata into one transactional code, but remember, you will need to be under the 60-character threshold.
  • Define the schedule. When is the data available? Is the data on a certain schedule? What is the schedule required with this use case?
  • Determine the ETL medium to be used. Will Anaplan Connect be sufficient, will one of our third-party vendors be used, or will a more custom application be needed, such as one built on the REST API? Does your company already have this experience in-house, or will training be required? These will need to be factored into all data stories.

Transactional lists


Usually, the largest lists are those containing transactional data. There can be millions of transactional IDs with several list properties defined. First, properties should not be defined on a transactional list, or on any list, with the exception of Display Name, because properties count toward workspace memory. Second, instead of loading metadata to list properties (for example, Cost Center and Account as properties), try to incorporate them into the code. If the transactional data defines a transactional amount at the intersection of Cost Center and Account for a particular month, use the code of the Cost Center and the code of the Account concatenated together (0100_57000). Not only will this decrease your list size, but it will also create a healthier model.
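
As an illustration of the concatenated code, here is a minimal Python sketch (not Anaplan syntax). The function name build_transaction_code and the "_" delimiter argument are assumptions made for the example; the 60-character limit and the 0100_57000 example come from the text above. Date/period and amounts are deliberately left out of the code.

```python
MAX_CODE_LENGTH = 60  # character threshold mentioned in the article


def build_transaction_code(*parts, delimiter="_"):
    """Concatenate metadata codes (e.g. Cost Center, Account) into one code."""
    code = delimiter.join(parts)
    if len(code) > MAX_CODE_LENGTH:
        raise ValueError(f"Code {code!r} exceeds {MAX_CODE_LENGTH} characters")
    return code


# Cost Center 0100 and Account 57000 from the example above.
code = build_transaction_code("0100", "57000")
```

Failing loudly on an over-long code is one possible design choice; the source system could equally shorten or hash the parts, which is outside the scope of this sketch.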

In the below example, the model builder did not create a custom code, but rather used a combination of properties to make the record unique, which included the date/period, as well as the transactional amount. Notice the original number of records vs. the number of records after a custom code was created.

Transactional 1.png

Incorporating the date/time period and the transactional amount into the record key inflated the list size, and the inflation grew with the number of years that were loaded. This not only made the model bigger, but also caused poor model opening performance. See the Worked example section for a simple example that explains this further.

Learn more about sparsity in the two-part series The Truth about Sparsity: Part 1 and The Truth About Sparsity: Part 2.

Flat lists

Similar to transactional lists, flat lists are not part of a hierarchy and are a series of records grouped in a list, like Products, Companies, Cost Centers, or Employees. These are your “legends” or “anchors” for all metadata about each unique record. Again, the only property that should be defined is a Display Name, if needed. It is best practice, from a model builder’s perspective, to suffix the name with “Flat” or “- Flat”. This helps identify whether the list is part of a hierarchy or a flat list (Employee – Flat, Cost Center – Flat, Product – Flat). These lists can be used for data validation, which will be described later in this article.
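
One way to picture validation against a flat list is a simple membership check. The Python sketch below is only an assumption about what such a check could look like; the list contents and variable names are hypothetical, and in Anaplan the equivalent check would be built with modules and line items rather than code like this.

```python
# Hypothetical members of a "Cost Center - Flat" list, keyed by code.
cost_center_flat = {"0100", "0200", "0300"}

# Hypothetical cost center codes arriving in a transactional load.
incoming_codes = ["0100", "0200", "0450", "0100"]

# One Boolean per incoming record: does the code exist in the flat list?
is_valid = [code in cost_center_flat for code in incoming_codes]

# Distinct codes in the load that have no matching member in the flat list.
unknown_codes = sorted(set(incoming_codes) - cost_center_flat)
```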


Modules

Ideally, you should have three types of modules in the Data Hub:

  1. Transactional: A Transactional module stores the transactional data by the time series, whether that be by day, week, month, quarter, or year. The only line items should be transactional data. No other line items should be defined. Additionally, to keep the size down, make sure the summary method on the line items is set to None, as there is no reason to sum the data within the module.
  2. System: System (SYS) modules, or the “S” in DISCO, do not have time associated with them and should only be dimensionalized by the same list (Employee Flat, Cost Center Flat, Product Flat). These modules store the metadata or attributes about the list item that don’t change over time, for example the employee’s start date. Another example of a SYS module would be any kind of mapping that is required, whether it be a SYS Time Filter module or a mapping from one source system to another.
  3. Export modules: If the data from the source system is being loaded at a lower granularity than needed in the spoke model(s), export modules can aggregate the data to the specified need (month, quarter, or year level), which will lead to more efficient data load performance to the spoke model(s). Additionally, it is better to only load the granularity of data needed instead of loading all data to the spoke model, but only using a portion of it.
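
The aggregation an export module performs can be sketched in Python as follows. This is an illustration only, with hypothetical transaction codes, dates, and amounts; in Anaplan the roll-up would come from the module's time dimension rather than from code.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily transactions: (transaction code, day, amount).
daily_rows = [
    ("0100_57000", date(2024, 1, 3), 120.0),
    ("0100_57000", date(2024, 1, 17), 80.0),
    ("0100_57000", date(2024, 2, 5), 50.0),
    ("0200_57000", date(2024, 1, 9), 40.0),
]

monthly_totals = defaultdict(float)
for code, day, amount in daily_rows:
    month_key = (code, day.strftime("%Y-%m"))  # collapse the day to its month
    monthly_totals[month_key] += amount

# Only the month-level rows would be exported to the spoke model.
export_rows = sorted(
    (code, month, total) for (code, month), total in monthly_totals.items()
)
```

Four daily rows become three monthly rows here; on real volumes the reduction is what makes the load to the spoke model(s) lighter.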

Loading data vs. using formulas in SYS modules

If you can devise a custom code where all of the attributes of the data are accounted for, you can greatly increase the performance of your data load, especially on very large data volumes. It is actually faster to use formulas to derive the data from the custom code than it is to load the data. Why? A couple of reasons. First, when data is loaded, the load is triggering the change log, and every change is being recorded in the model history. Second, loading data to another module is an additional action. If you didn’t need that action, you would save processing time.

In the example below, the exact same data was loaded four different ways:

  • Import properties to a list: A list was created with all attributes, including the transactional data, and was loaded to list properties (not best practice and against DISCO).
  • Import to list, attribute, and trans: A list was created, the transactional data was loaded to a transactional module, and all of the attributes were loaded to a SYS Attribute module.
  • Import to list, trans, calculate attribute: A list was created, the transactional data was loaded to a transactional module, but the SYS Attribute module was calculated using two different methods:
    • One line item: Using FINDITEM() with several functions parsing out the information from within the FINDITEM(). For example, FINDITEM(Cost Center, RIGHT(LEFT(Trans Details.Code, '2nd Group'), 3)).
    • Multiple line items: Parsing of the member spread across multiple line items and using FINDITEM() with only the list and code as the parameter. First, you do the parsing to get the correct piece of the code (one line item), and then the FINDITEM() of that code (2nd line item).
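
The "multiple line items" approach can be pictured as two small steps: parse first, then look up. The Python sketch below is a loose analogy, not Anaplan formula syntax; the list contents are hypothetical, and the dictionary lookup only stands in for what FINDITEM() does with a list and a code.

```python
# Hypothetical Cost Center and Account lists, keyed by code.
cost_centers = {"0100": "Marketing", "0200": "Sales"}
accounts = {"57000": "Travel", "61000": "Rent"}

transaction_code = "0100_57000"

# Step 1 (first "line item"): parse each piece out of the code.
cost_center_code, account_code = transaction_code.split("_")

# Step 2 (second "line item"): look up the list member for each piece,
# loosely analogous to FINDITEM(list, code); None when there is no match.
cost_center = cost_centers.get(cost_center_code)
account = accounts.get(account_code)
```

Keeping the parse and the lookup as separate steps is the structure the article describes; each step stays simple and can be evaluated independently.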

Load performance

Load Performance 1.png

Notice, the best performing data load was the last one, Import to List, Trans, Calculate Attribute (multiple line items), where the parsing out of the data was spread over multiple line items. This is due to the fact that the data load was able to take advantage of Anaplan’s multithreading capabilities. The worst performing data load occurred when data was loaded to the Attribute module because, due to the sheer size of the data, a save had to be performed.

Exporting to spoke models

One of the most important concepts to remember when exporting data is to use a view from a module. Lists should not be exported because you lose control over what you export. It is either all or nothing. By using views, you can employ a filter (should always be a Boolean) to render exactly which data needs to be exported. If you need more than one filter, combine both into one line item and use that line as the filter. You will have much better performance if you are only using one Boolean line item as a filter vs. having multiple filters defined.
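
To illustrate combining several conditions into one Boolean filter, here is a small Python sketch. The row structure and the condition names (is_current_period, has_amount) are hypothetical; in Anaplan the combined value would be a single Boolean line item used as the view filter.

```python
# Hypothetical rows of an export view with two separate Boolean conditions.
rows = [
    {"code": "0100_57000", "is_current_period": True, "has_amount": True},
    {"code": "0100_61000", "is_current_period": True, "has_amount": False},
    {"code": "0200_57000", "is_current_period": False, "has_amount": True},
]

# Combine the conditions into one Boolean, then filter on that single value.
for row in rows:
    row["export_filter"] = row["is_current_period"] and row["has_amount"]

exported_codes = [row["code"] for row in rows if row["export_filter"]]
```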

Another important concept to remember is to only export detailed information, as there is no reason to export parent information (quarter, year, etc.). Not only will you get warnings when exporting parent information, but the performance of the export will decrease because the system will have to create a debug log. The goal is to make sure a debug log is not created, all green checks, so if there ever is an issue, you will know it truly is an issue that needs attention.

Line items in the Data Hub formatted as text should not be exported as text, but actually as list formatted line items in the spoke model (text->list formatted line item). The goal is to reduce the number of text formatted line items in the spoke model.

Some say they need to do validation in the spoke model, therefore they need to import the data as text. Actually, this is false, because the validation should have already been done in the Data Hub, so there should be no need to do the validation again.

Lastly, you should think about what really needs to be exported. Do you really need to export historical data that hasn’t been changed? Instead, just export the newly loaded data, or delta data. This can be accomplished by using one of two methods:

  1. From the source system, request IT to only send the updated information, not the full load every time. Additionally, request IT to create a column in the source file with a hardcoded value of “TRUE.” This tells Anaplan which rows are new or have been updated and can be used as a filter for an export. Before the source data is imported, make sure the first action within the process clears out the previous true records (set this up via a view using a filter where the view only shows members with a value of true).
  2. Utilize the current period function to only export the current period data. In the SYS Time Filter module, create a line item named Current Period with the formula CURRENTPERIODSTART(). In the export views, filter the data on this line item.
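
The first method (a TRUE flag on new or changed rows, cleared at the start of each load) can be sketched in Python as below. The data structures and names are hypothetical and stand in for the Data Hub module, the delta file, and the export view.

```python
# Hypothetical rows already in the Data Hub, each with the "updated" flag
# that the source file sets to TRUE for new or changed records.
hub_rows = {
    "0100_57000": {"amount": 200.0, "updated": True},  # flagged by a previous load
    "0200_57000": {"amount": 40.0, "updated": False},
}

# Hypothetical delta file from the source system.
incoming_rows = [
    {"code": "0200_57000", "amount": 45.0, "updated": True},  # changed
    {"code": "0300_57000", "amount": 10.0, "updated": True},  # new
]

# First action in the process: clear the flags left over from the previous load.
for row in hub_rows.values():
    row["updated"] = False

# Second action: load the delta file.
for row in incoming_rows:
    hub_rows[row["code"]] = {"amount": row["amount"], "updated": row["updated"]}

# Export view: only flagged rows go to the spoke model.
delta_export = sorted(code for code, row in hub_rows.items() if row["updated"])
```

Without the clearing step, 0100_57000 would still carry its old flag and be exported again even though it did not change.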

Tips and tricks

A few tips and tricks to be aware of include the following:

  • Hierarchies should not be in the Data Hub
  • Analytical modules should not be in the Data Hub
  • Do not delete and reload lists
  • Data Validations Model

Why should hierarchies not be in the Data Hub? To answer that question, you need to understand why hierarchies are used in the first place. Essentially, hierarchies are only needed to aggregate data for analytical purposes, and since users will not normally login to the Data Hub, the lists essentially take up space. With that said, it is perfectly okay to create the hierarchies for testing purposes to ensure your actions from the meta modules are building the hierarchies correctly, but as soon as the actions are working correctly and have been verified, you can remove the list structures from the Data Hub. A case can be made that certain implementations may need the hierarchies created in the Data Hub for validation purposes of several sources. If this is the case in your implementation, just be sure to only use the hierarchies for validation purposes.

In addition to the above, there are two more reasons to not have hierarchies built in the Data Hub—cluttered data, and spoke models that pull data from the lists.  

Data Hubs need to be clean and clutter free to ensure optimal performance, which also makes it easier for the administrators to understand exactly what data is stored in the Data Hub. Additionally, when you have lists—especially hierarchical lists—spoke model builders will sometimes build their lists from the lists within the Data Hub instead of from a view. It is best practice to always build lists from views from within a module so the action can benefit from filters (there are no filters when importing from lists).

Analytical modules should not be in the Data Hub since end users don’t normally access the Data Hub. There really isn’t a reason to have products by versions by time in the Data Hub; that belongs in the spoke model. Remember, the Data Hub should only be used to store data from the source system(s).

Within your nightly data load process, do not delete and reload data, including the list structures. If you have a proper code, you shouldn’t need to do this. Additionally, not only does this impact the overall performance of the process (adding an additional action to delete the list, which then deletes all data associated with that list), but the process is essentially filling up the change log with the exact same data that it had before the delete. When a certain threshold is surpassed, the model will require a save, thus taking up even more time. Ultimately, you are forcing the model to re-aggregate all of the data, instead of just the new data.
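
The difference between "delete and reload" and updating by code can be shown with a small Python sketch. The upsert function and the data are hypothetical; the point is only that, with a stable code, unchanged records are never touched.

```python
def upsert(existing, incoming):
    """Add new codes and update changed values; leave everything else alone.

    Returns the codes that were actually touched.
    """
    touched = []
    for code, amount in incoming.items():
        if existing.get(code) != amount:
            existing[code] = amount
            touched.append(code)
    return touched


# Hypothetical Data Hub contents and nightly file, keyed by transaction code.
hub_data = {"0100_57000": 200.0, "0200_57000": 40.0}
nightly_file = {"0100_57000": 200.0, "0200_57000": 45.0, "0300_57000": 10.0}

changed_codes = upsert(hub_data, nightly_file)
```

Only two of the three incoming records register as changes. A delete-and-reload would instead rewrite all of them, which is the behavior the paragraph above warns against.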

Lastly, if you know you will have to do a lot of transformations on your data (consolidating multiple source systems or your data is not clean), think about creating a Data Validations model.  This model’s sole purpose would be to clean the data and then feed the data to the Data Hub, thus keeping the transformations to a minimum in the Data Hub as well as keeping the Data Hub clean.

Worked example

Use Case: Transaction data is by store and SKU and month

Bad way

  • The code for the Transaction list is a three-part code Store_SKU_Month
  • Attributes for Store, SKU and Month are imported as Text and matched against the Store list, SKU list and Time period respectively
  • An additional line item is needed for the Store and SKU code (for export).

This is the screenshot of the bad way:

Worked Example Bad 1.png

Notice the repetition of the attributes. STR07 and SKU031 are repeated each month.

Good way

  • Two data files
    • Unique combinations of Store and SKU (two-part code)
    • Store SKU code by month for the quantity
  • The transaction details are stored in a module dimensioned by Transactions
  • The Store and SKU attributes are calculated using the “_” delimiter
  • The quantity is stored in a module dimensioned Transactions and by month
  • The additional line item is needed for the Store and SKU code (for export). This is a subsidiary view in the module as it is not dimensionalized by Time.
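
The split into two data sets can be sketched in Python as follows. The source rows are hypothetical (STR07 and SKU031 are borrowed from the example above), and the variable names are assumptions made for the illustration.

```python
# Hypothetical source rows: store, SKU, month, quantity.
source_rows = [
    ("STR07", "SKU031", "2024-01", 12),
    ("STR07", "SKU031", "2024-02", 9),
    ("STR07", "SKU044", "2024-01", 4),
]

# File 1: unique Store/SKU combinations (two-part code, no month).
unique_codes = sorted({f"{store}_{sku}" for store, sku, _, _ in source_rows})

# File 2: quantity by two-part code and month.
quantity_by_month = {
    (f"{store}_{sku}", month): qty for store, sku, month, qty in source_rows
}

# Store and SKU are derived from the code using the "_" delimiter.
attributes = {code: tuple(code.split("_")) for code in unique_codes}
```

Three source rows yield only two list members, and the Store and SKU attributes are held once per member instead of once per month.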

These are the screenshots of the good way:

Worked Example Good 1.png
Worked Example Good 2.png

The breakdown below shows the model in terms of list size, line items, and the associated member usage of the various structures. The improvement comes mainly from two things: lists themselves account for approximately 500 bytes for each member, and the bad way repeats the attributes for every month in the transaction data (as mentioned above).
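
To get a feel for the list overhead alone, the arithmetic can be sketched in Python. The store, SKU, and month counts are hypothetical, the per-member figure above is taken as roughly 500 bytes, and the sketch assumes every store sells every SKU in every month. It covers list members only, not the cells in the modules.

```python
BYTES_PER_LIST_MEMBER = 500  # assumed per-member overhead, per the figure above

stores, skus, months = 50, 1_000, 24  # hypothetical volumes

# Bad way: month is part of the code, so every month adds list members.
bad_members = stores * skus * months
# Good way: one member per Store/SKU combination; month is a dimension instead.
good_members = stores * skus

bad_list_bytes = bad_members * BYTES_PER_LIST_MEMBER
good_list_bytes = good_members * BYTES_PER_LIST_MEMBER

# Fraction of list overhead removed by taking the month out of the code.
reduction = 1 - good_list_bytes / bad_list_bytes
```

Under these assumptions the list shrinks by a factor equal to the number of months loaded, which is why the gap widens as more history is added.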

DataHub Testing.jpg

Hopefully, this article has shed some light on Data Hubs, how they should be used, and what you can do to ensure they perform at their peak level. Remember, analyze the data to understand what makes the row unique and use that as the code. Every list should have a code—every list!


The content in this article has not been evaluated for all Anaplan implementations and may not be recommended for your specific situation.
Please consult your internal administrators prior to applying any of the ideas or steps in this article.

Hi @rob_marshall  - question on list hierarchies. For a store network have the Store, Region, Division. So would have three lists and then a module dimensioned on Store with Region / Division as line items that would be updated via import from source and then this would flow into the spoke models as a hierarchied list starting at the top and going down?

Hey @andrewtye,

Yes, ideally Store, Region, and Division would be flat lists in the data hub, and the transactional data would come into a module whose "transactional" key/list is a concatenation of the members that make the row unique.


Then, you would have a "properties" module defining all master data for that unique row.


Lastly, you would have views defined on the properties module to create the hierarchy in the spoke model.

Hope this helps,


Thanks @rob_marshall !

Hi @rob_marshall

This is a fantastic article, thanks for sharing.

When you did your analysis regarding "Loading data vs using Formulas", was file type accounted for?

I know zipped files are loaded into Anaplan much more quickly than normal csv/txt if they are large and just wondered if this was included in processing time.

Also, it would be good to have some clarity over "upload" time and "processing" time, I'm not sure if there is any distinction between them in an upload process but I would assume so, and I presume this article focuses on the processing time?



@CallumW ,

Thanks for reading it, I am glad you liked it.  Regarding the file types, this was not part of the article nor the actual loading of a file into Anaplan, more of loading the data to the line items when the file had already been uploaded vs. using a formula to determine the data (from the code).  Does that help and clarify?



Appreciate the quick reply @rob_marshall .

I guess my question would then be whether there is any processing benefit in using different file types when loading into line items, or is there a process that converts to "Anaplan language" upon upload and therefore before loading into line items, thus no benefit?

i.e. the only benefit of loading different file types would be seen on the upload stage before Anaplan loads into line items?

Apologies for the confusion!


In asking around (Ben Speight), there is no difference in uploading a csv file vs. a text file and there is no difference in the reading of the data from the file and importing them to line items.  Now, if you are using Anaplan Connect, I believe the data gets zipped on the way into Anaplan but if you are using a browser, the data does not get compressed.

Hope this helps,


Appreciate the follow-up, @rob_marshall !

@rob_marshall I love this post.

I just finished loading a year's worth of transactions and tested both ways, the bad way and the good way.

The good way was 82% smaller and 90%+ faster. AMAZING!

Hope you include this best practice in the level 3 cert training.

@JaredDolich  - That is fantastic.  It is amazing what the differences are and how easy it is to get them.

This was such a helpful article. Thank you!

Any opinions on best practices when the source system is an Anaplan spoke, rather than an external system? For example, the annual budget is developed in Anaplan using a Budget spoke. While there are transaction modules in the Budget spoke for user data entry, there are multiple calculations performed on that data to arrive at the final budget (ex: headcount based expenses). The end result of the Budget spoke is a dimensionalized module, not a transactional module. There are other reporting spokes that need summarized, dimensionalized budget data (ex: for variance analysis).

Since the data in the Budget spoke is already dimensionalized, it doesn't make sense to go to great lengths to flatten it to get it into the HUB, only to need it dimensionalized in other reporting spokes. Is this an example in which it makes sense to have a module with more than just the time dimension in the Hub? Or maybe I'm missing something big here. I'm not looking for a perfect answer, just more of a general guidance around when a spoke serves as a data source. Perhaps it doesn't flow through the hub at all?

@nicole.johnson ,


Great question...There is no need to flatten in the spoke (Budget), just bring in the data that is needed (aggregated at the correct level) to the target spoke model.  Also, I would not advise taking the data back to the HUB from the Budget model to the spoke model (Budget -> Hub -> Spoke), just get the latest and greatest data directly from the Budget model.  If you do go to the HUB first, there is a chance the spoke misses an important update in the Budget model that hasn't made it to the Hub yet.


Hope this helps,



@rob_marshall , @nicole.johnson 

I do agree that there is no need to flatten the data. The less data you move around, the better for performance.

On the point about pushing the data through the HUB, I just want to bring up one more thing: it's always good to have a single point of truth for your data. You may use the data in the future to feed another model (possibly one that is not yet in the environment), and then it will be easier to simply use the HUB rather than redesign the data flow, I guess. Moreover, if you schedule your imports between models carefully, you won't end up with an issue of missing updates.


@PiotrWeremczuk ,


If there is a pre-defined schedule, you are correct.  With that said, oftentimes folks need to pull the data on an ad hoc basis using the same actions/process for a quick update.  If you have to wait on the data going to the Hub, it may be missed or not there yet.



@PiotrWeremczuk @rob_marshall 

Wouldn't it be a good idea to allow both? setting up an action that pulls meta data from both the HUB (automated processes, and Single Source of Truth (SSoT)) and what is currently live/most up to date according to the related Spoke model?



If you want to make it so the user can't edit, you just set up a hidden value (they can edit/update) and point the end point and you are good to go.






It is an idea, but on the other hand, this way you generate more traffic between models, which impacts performance. I wouldn't go that way unless users are really pushing for it. Having the processes scheduled during the night prevents performance degradation since no one is usually using the models at that time.

Also, with the setup you've proposed, you should take into account that it may cause chaos if users try to reconcile data between models when someone has not run the process yet. From a support perspective, it may be quite messy.


I was referring to Meta data only (top level summaries, or limited lists, eg. instead of the million+ item list, the next level up or 1 cell) and giving the end user the ability to run the light process.

Alternatively, the user may not be able to update from the Hub (as it is meta data that is pulled automatically in lock step with the rest of the model), and only pull the meta data from the Spoke model so they know how many changes are coming. This should be significantly lighter load as it is as small as 1 line item with 1 cell or at least under 100 cells total.

@rob_marshall , Great article with good examples. Have you come across a scenario where, as per your example, the details are loaded into DAT01 Transactional Details and the data is then loaded into DAT02 Transactional Data, but the source system drops one of the unique combinations altogether and the respective data is removed from the database (hard delete)? Basically, the data is created on day one of the month, and on day two the record is removed. Therefore, the data hub and spoke model still hold the value for the unique combination, and the value does not get updated from the source because the unique combination no longer exists.

@Asslam ,


If I am understanding you correctly, I have not come across that scenario where a piece of the data (unique row) is gone/dropped from one day to the next.  Just to be clear, in the above, DAT01 and DAT02 are using the same list, but the modules are dimensionalized differently as DAT02 uses Time and DAT01 does not.


Are you saying in your example, SKU or Product is no longer needed from one day to the next?




Are you using ALM? If you push an update and don't have production flagged correctly that could be the cause.

@rob_marshall , what a great article! We've already tested a few of your recommendations and are already seeing significant performance gains - so thank you!  


Currently, we have about 15 flat transactional modules in our Data Hub which we load monthly however they don't all share the same dimensions.  So, in order to implement your approach above we will need to create 15 unique lists (one for each load) and 15 Properties modules? I was wondering if there is a more efficient way to go about this?





Actually, having 15 different unique lists is not a bad thing as long as you set them up correctly, along with the corresponding "Properties" modules.  Doing this can help with potential validation issues because you can now figure out which source needs fixing.

If you would like to get on a call, DM me and we can talk further about this.

Hope this helps,




@rob_marshall , thanks for your message. Yes, the SKU or Product is not needed from day 1 to the next; it might be needed for a future load, but not for a particular month. We came across a scenario where the record was cleared out on day 2. Basically, the whole record is dropped from the source application. This means that on Day 1 we have a value for Jan for a combination of SKU or Product, and on Day 2 the value is not zeroed; rather, no record comes through for that SKU or Product combination. Therefore, the Jan value does not get updated to zero. We figured out a solution using a data variation list (two list items: current load / previous load) in the data module to load two versions of the data, clearing out all items in the previous load (second load) to manage this scenario. We are still following the good way, with a couple of extra load actions.


Thanks again this article really helped. 

@Asslam ,


So, again, if I am understanding you correctly, this is one of the main tenets of the article: load the data into a transactional module (dimensionalized by Time) instead of deleting the list and reloading it and adding Time as part of the code/key.  In your case, if you have data for Jan 1, but not Jan 2, then nothing should be loaded into Jan 2 and thus it would be zero.  Please correct me if I am wrong, but I think the issue above is about the deletion and/or clearing of a module.  If so, you can change the clearing of data within the action definition.


Please let me know if this makes sense,



@rob_marshall - This article is very insightful. Thank you for putting this out here on the community. I'm curious if you're doing any exception reporting and data validation in your models using the Worked Method above? I'm assuming that any of the data exception reporting would be occurring in DAT01 Transactional Details?




Absolutely, and that is a great point. In an ideal world, you should be doing all validations in the Data Hub, within the modules that actually warrant it, so definitely in DAT01. In theory, you only want validated data to be imported into your models, so doing all of the data validation in the Data Hub is the correct method. If the data has issues or needs further review, then that data should not be imported to your downstream model until it has been verified.

Hope this helps,


Hi @rob_marshall !

Thank you for such a great article, we follow most of the principles as well.

One thing that always comes up in our discussions is how we ensure totals in the data hub are the same as totals in the end models. We would call it a reconciliation process.

We would normally set up a responsible person who would check totals in both places, or we would set up an auto notification which will report if totals are different.

Do you have any recommendations in that area, how do you handle that?



I'm sure @rob_marshall has some best practices around that. One idea you might consider is using the NewUX since it allows you to set up pages that can be pointing to different models (and in different workspaces). I believe there is an idea out there to allow multiple models on the same page.

Anyway, one idea is to have your import process bring over the totals (not the transactions) to the destination module for comparison. Super easy technique and fast.

@AleksandraE ,


I agree with @JaredDolich in that if you need to do this reconciliation piece, then just pull over the totals from the Data Hub to the spoke model and do a comparison. Really, there shouldn't be a need to bring the data in at the lowest granularity just to do a "checksum". Also, since there shouldn't be hierarchies in the data hub, it should be a straight total of your transactions or accounts for the current month or period. No need to bring in previous totals if they aren't changing.


Hope this helps,



@rob_marshall @JaredDolich thank you both for the prompt feedback. We follow the same approach. The only concern is that if you have 10 sources, you have to maintain 10 extra high-level recon imports to make sure the other 10 low-level imports are working properly. So the concern is: how do I make sure my recon imports are also working correctly? I hope you understand what I mean!

I also do not see any other option at this point in time.




@AleksandraE ,

I don't know exactly what you have, but could you have a list of the Data Sources and a module by Time (monthly, quarterly, or yearly) where the high-level transaction total from the data hub gets loaded? On the Data Hub side, you could have the same Data Source list, where you sum the data into a module, create a view, and then use that view for the import to the spoke/target model.


Hi @rob_marshall !

In my example, I load data from 10 end models into the data hub, consolidate the data in the hub, and send it to another system.

If I want to include a recon process in this flow, I have to set up 30 imports:

- 10 imports, one from each end model, to a list at the low level

- 10 imports, one from each end model, to a module at the low level

- 10 imports, one from each end model, at the high level for recon, directly to the recon module

I will always have to set up an additional recon import for each source, which creates an extra opportunity for errors.

But clearly there are not many other options inside Anaplan.

Hope this clarifies what I mean 🙂





It is very possible I am missing something, and if you would like to talk about it, please send me an email to work this out. What I was trying to convey in my previous post:

  • In the data hub, create a module by Time and create 10 line items, one for each source.  Create a formula for each one to pull the totals.
  • Create a view of this data and use this as an import to the spoke model.  This way, it will be one action for all data sources.
  • On the spoke model side, have a list of all 10 data sources.  Create a view of this Data Sources list by Time and import the data from the data hub to this module.
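The steps above can be sketched in plain Python (this is not Anaplan syntax, just the logic). The function names `totals_by_source` and `reconcile`, the `tolerance` parameter, and the sample sources "ERP" and "CRM" are hypothetical. The point is that one summary keyed by data source and period covers every source in a single comparison, rather than one recon import per source.

```python
from collections import defaultdict


def totals_by_source(transactions):
    """Sum transaction amounts per (source, period)."""
    totals = defaultdict(float)
    for source, period, amount in transactions:
        totals[(source, period)] += amount
    return dict(totals)


def reconcile(hub_totals, spoke_totals, tolerance=0.005):
    """Return the (source, period) cells whose totals differ, as (hub, spoke)."""
    mismatches = {}
    for cell in set(hub_totals) | set(spoke_totals):
        hub_value = hub_totals.get(cell, 0.0)
        spoke_value = spoke_totals.get(cell, 0.0)
        if abs(hub_value - spoke_value) > tolerance:
            mismatches[cell] = (hub_value, spoke_value)
    return mismatches


# Hypothetical data: the CRM load in the spoke is short by 5.
hub = [("ERP", "Jan", 100.0), ("ERP", "Jan", 50.0), ("CRM", "Jan", 30.0)]
spoke = [("ERP", "Jan", 150.0), ("CRM", "Jan", 25.0)]

mismatches = reconcile(totals_by_source(hub), totals_by_source(spoke))
print(mismatches)
```

Only the cells that disagree are returned, so an empty result means every source reconciles for every period that was checked.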

Again, please reach out if you would like and we can jump on a quick call.



@rob_marshall  thanks for the detailed article


What would be the best practice on using filtered view from data hub vs creating subsets in the spoke models?

I've seen both but curious what is the recommended usage for subsets.





I am a bit confused by your question, but let me try to answer it. In the data hub, you really shouldn't have subsets because you can use filtered views as the source for spoke models. With that said, using a properties module in the data hub to define the subsets in the spoke model is absolutely OK. Remember, the data hub should be as basic as possible (flat lists, little to no analytical modules, no hierarchies, etc.), but that doesn't mean it can't supply that information to the spoke models via filters from "property" or "attribute" modules.


Does that help?





Thanks for the answer. My question was geared towards which option to choose: creating a subset in the spoke model, or using a filtered view from a properties module...



Yeah, I totally misunderstood your question, and to be honest, I am still not understanding the question with 100% certainty. I am thinking the question is creating a subset vs. using a boolean in a properties module, is that correct? If so, it depends. Great answer, huh? But it depends on what you are using it for. A subset is great for getting totals and for rendering a selected number of list members. You can do the same with a boolean in a properties module, but the total could be off because it is only filtering the list members, not the overall total. With that said, if you need to check whether a member is in a subset for a formula, it is best to use a properties module.


Here is a good trick, so you have the best of both worlds. Many folks will do a FINDITEM() to see if a member is within the subset on the master list properties module (think Employee is the master list and I have a subset of Active Employees). On the Employees properties module, create a line item formatted as a boolean, hardcoded to TRUE. Then, in the Applies To for that line item, change it to the subset. With this done, you can use this line item for formulas and the subset for reporting (dashboards or apps).


Hopefully this was the question you asked.  If not, we can try again.



Hi @rob_marshall , got a question on transaction details.

I have several attributes in Dat01 (sku, store, account etc..). What would be the best way to dimensionalize Dat02 data in the spoke model with these attributes?




I am not exactly sure I understand your question. If you have DAT01, dimensionalized by the transactional list (a concatenated code of your attributes and Time), which holds your data that changes over time, and another module, DAT02, which is also dimensionalized by your transactional list (but this time without Time), you can go just about anywhere from here. Just because you have this data at this level in your Hub model doesn't mean you have to bring it over at the exact same granularity to your Spoke model.


So, it is hard for me to understand exactly what you need in your spoke model. If you need a module which is dimensionalized by Account and SKU by Time, you can do that from DAT01 and DAT02 (see the "Good Way" from above).


Hope this helps, if not, we can give it another go.




I have my properties module defining the transactional key/list plus other attributes, as below.


The data module is dimensionalized with the same transaction list and Time.


So if I want to bring the data into the spoke model dimensionalized by Grouping2, do I need to set up a hierarchy in the spoke model with the transactional list and Grouping2?

Thanks for the help.



Ahh...No, you don't...Let's walk through two different scenarios.


Scenario 1: My spoke module has Grouping2 by Time. The great thing about Anaplan and the import is that it can aggregate upon import. So, in this case, in DAT01 (the one with Time), you would create a line item named Grouping2 and change the Applies To to remove Time, basically setting Time to Not Applicable. This creates a subsidiary view, but it is needed.



The formula would point to your DAT02 (the properties module without Time) to get the correct Groupings.  Now you can create a view.  In DAT01, create a view with the transactional list and Time in the rows, with your line items on the column axis.





In the spoke model, understand what you want your module to be (let's say SKU, Grouping, and Time). In the spoke model, go into the target module and map the Grouping2 line item (from the import) to the Grouping2 list (in the target module), Time to Time, and the value or monies to the line item in the target module. Again, the import will automatically aggregate on the fly.
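As a rough illustration of what "aggregate upon import" means in Scenario 1, here is a plain Python sketch (not Anaplan). The dictionaries `dat02_grouping` and `dat01_values`, the function `import_with_aggregation`, and the keys T1 to T3 are hypothetical stand-ins for the DAT02 properties, the DAT01 values, and the import mapping.

```python
from collections import defaultdict

# DAT02 stand-in: one Grouping2 value per transaction key (no Time).
dat02_grouping = {"T1": "GroupA", "T2": "GroupA", "T3": "GroupB"}

# DAT01 stand-in: values by transaction key and period.
dat01_values = {
    ("T1", "Jan"): 10,
    ("T2", "Jan"): 5,
    ("T3", "Jan"): 7,
    ("T1", "Feb"): 4,
}


def import_with_aggregation(values, grouping):
    """Map each transaction to its grouping and sum rows that collide."""
    target = defaultdict(int)
    for (key, period), amount in values.items():
        target[(grouping[key], period)] += amount
    return dict(target)


spoke_module = import_with_aggregation(dat01_values, dat02_grouping)
print(spoke_module)
```

T1 and T2 both map to GroupA, so their January values land in the same target cell and are summed, which is the behaviour the import gives you without a hierarchy in the spoke model.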


Scenario 2:

In the spoke model, I have a hierarchy with SKU rolling up to Location, and I would like to show this in a module by Time and Grouping2. Since you will most likely have the same SKUs across multiple locations, the SKU level will need to be a numbered list. It depends on your naming convention, but I would use the code of the Location concatenated with the code of the SKU, with a delimiter between them. This logic should be in DAT02 in the Hub, and you would follow the same exercise as in Scenario 1, but this time, instead of this line item being list formatted, it should be the code of the hierarchy. Prior to the import to the spoke model, you will need to create the hierarchy in the spoke model from views (L1 Location and L2 SKU) from DAT02 in the Hub. Create a module with the hierarchy, the Grouping2 list, and Time, with a line item named Value. Then import the data.
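The concatenated code in Scenario 2 can be sketched in plain Python as follows. The function `hierarchy_code`, the underscore delimiter, and the sample locations and SKUs are assumptions for illustration; use whatever delimiter fits your naming convention.

```python
def hierarchy_code(location_code, sku_code, delimiter="_"):
    """Build a code for the SKU level that is unique per location."""
    return f"{location_code}{delimiter}{sku_code}"


# Hypothetical rows: SKU1 is sold in two locations.
rows = [("NYC", "SKU1"), ("LA", "SKU1"), ("NYC", "SKU2")]

codes = [hierarchy_code(location, sku) for location, sku in rows]

# The SKU code alone repeats, but the concatenated code does not.
assert len(set(codes)) == len(codes)

# L1 view: the distinct locations. L2 view: each code with its parent.
l1_locations = sorted({location for location, _ in rows})
l2_parents = {hierarchy_code(location, sku): location for location, sku in rows}

print(codes)
print(l1_locations)
print(l2_parents)
```

The two derived structures correspond to the L1 Location and L2 SKU views used to build the hierarchy before the data import.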


I hope that clears things up.  If not, please DM me and we can get this solved.






I had a chance to try it out today and it worked really nicely. Thanks for the detailed explanation.



Good deal!

Hi Rob, 


I just read through your interesting article above. In the case of two available workspaces and one data hub, what would you recommend as best practice in terms of performance: putting the data hub in one workspace and the working models in the other? SoD compliance is already handled at a very detailed level here.


Thx for your opinion. 






Yes, put the data hub in its own workspace and the spoke models in a different workspace. This will help with segregation of duties and also when loading the data hub with large transactional lists.

Great article @rob_marshall! I'm currently working through our models to rebuild with best practices, including the Data Hub because we currently don't use one at all (EEK! I know). How would you handle the following scenario? 


We have a list of bonds that is manually tracked by another department in Excel. They provide us the spreadsheet and it is uploaded to our budget model once a year for budget planning. As I rebuild, the user would like to add items to the bond list manually on a dashboard moving forward along with the associated data, as there are only a handful added each year, and formatting the provided file for upload is problematic. 


Moving forward, I want the bond list to be preserved in the Hub after new items are added and the budget is published, but it must also be available in the budget spoke model in real time for budget planning. Where would you build the "add new list item" functionality?


If it is in the Budget model, how would we send it backwards into the Hub for preservation since that is not encouraged, but there is no other source to pull it from? I'm concerned about building it into the Hub directly because users shouldn't be encouraged to go in there to do stuff, and an extra step would be needed to send each update back to the budget model for real-time analysis.



Good morning and thank you for the kind words.  As with everything, rules or best practices don't work 100% of the time and this might be one of those cases.  If the bonds are not coming from a true source and the Budget spoke model is in effect the "system of record" for Bonds, then I would have the users add the bonds in the spoke model and then have an action populating the "Bond Flat" list in the Data Hub.  You are correct in that users really shouldn't be in the Data Hub and also correct in that the data should flow in one direction (left to right or Hub to spoke), but this might be a case where that rule should be broken.


One thing to be careful about: please make sure the Bonds have a unique code. This is where having users "master" the data becomes tricky, and why a true source system would be best.



Thanks @rob_marshall ! I appreciate your insight, I do agree that I have found one of the rare cases where data will need to flow backwards to the Hub, until the department can be persuaded to move their data to a more permanent system. And thank you for the reminder about the unique code! I may try to use the spoke model's numbered list unique code to my advantage with NAME(ITEM('Bond#')) in the export module perhaps? Good food for thought. 



Please don't use the unique code of the numbered list, as that number means nothing, and if the list is a Production Data list and you create a copy of the model, those IDs get reset. Instead, try to figure out something else to make it unique.


