Hi
1. Why do we create a concatenated list to load transactional data into the Data Hub? Instead, we could create individual lists as dimensions and load the data.
2. Why do we create hierarchy lists as flat lists in Data Hub?
Thanks
Ibrahim
@1454884
Excellent question. I'll answer these directly, but first I HIGHLY recommend you read this Data Hub best practice article by @rob_marshall . In this article @rob_marshall answers both your questions and a lot more that you're likely to be considering as you pursue Data Hub processes. You might also consider reading @DavidSmith's article on the truth about sparsity. Even if you have to dimensionalize time, which may cause a bunch of empty cells, your performance will likely be better AND your model size smaller than if you don't dimensionalize time.
Hope this helps. Try to read @rob_marshall's and @DavidSmith's articles. I promise you'll walk away with a completely new perspective.
Firstly, @JaredDolich has provided a first-rate response, but I'm sensing that the explanation may be a little too long and somewhat too technical.
The primary reason comes from the nature of multidimensional data.
Imagine that every data point in a module is duplicated by the number of items in any list added as a dimension. A module can therefore become massive very quickly, as each dimension added multiplies the number of possible combinations by a factor equal to the number of items in the new dimension.
The problem with this is that, for the vast majority of possible combinations of data points across multiple dimensions, data will never exist. Unlike other applications, Anaplan will retain these cells in memory even if no data is added.
This is what we mean when we refer to sparsity: there are too many empty data reference points. To minimise sparsity, we create a unique reference for each actual combination, building a list containing only those combinations of dimensions that will ever hold data. We achieve this by concatenating all the relevant codes to create a unique master code for that single reference. We complement this by creating list properties, which are themselves formatted as the required lists, and populating them with the individual list items for those dimensions.
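The approach above can be sketched outside Anaplan to see the effect. This is a minimal illustration with hypothetical product, customer, and month codes: one concatenated list item is created per real transaction, rather than one cell per possible combination of dimensions.

```python
# Transactional records as (product_code, customer_code, month, amount).
# Hypothetical sample data for illustration only.
records = [
    ("P001", "C10", "Jan-24", 500.0),
    ("P001", "C11", "Jan-24", 250.0),
    ("P002", "C10", "Feb-24", 125.0),
]

# Build the concatenated master code: one list item per actual combination.
# A delimiter such as "_" keeps the source codes recoverable.
concatenated_list = {}
for product, customer, month, amount in records:
    key = f"{product}_{customer}_{month}"
    concatenated_list[key] = {
        "Product": product,    # list-formatted property
        "Customer": customer,  # list-formatted property
        "Month": month,        # list-formatted property
        "Amount": amount,      # the transactional value
    }

# Only 3 list items exist, versus 2 products x 2 customers x 2 months = 8
# cells if each dimension were applied to the module directly - and the
# gap widens dramatically with realistically sized lists.
print(len(concatenated_list))  # 3
```

The list-formatted properties are what later allow the spoke model to map each transaction back to its real dimensions.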
Hierarchies are not created in the Data Hub because we do not run any modelling in the hub, so the functionality they afford us is redundant. Hierarchies consume more memory than flat lists due to the need to aggregate up them, and as they are not required they should not be created.
I hope this adds to @JaredDolich's explanation and provides you with more context.
Hello @tflenker and @abhaymalhotra. This post directly answers the question we raised about concatenating the key for transactional data in the DH. Also, read @rob_marshall's article about the Data Hub (particularly the sections under:
Hello team, I am logging into Anaplan through SSO. I would like to create a batch script to import a file into Anaplan. For the spoke model, I would like to run a process via a batch script to import data from the Data Hub. Can anyone provide a script for both conditions?
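As a starting point, here is a minimal sketch assuming Anaplan Connect is installed. Because SSO users typically cannot use basic authentication, it assumes certificate-based authentication instead. All IDs, file names, and action names are placeholders, and exact flag names can vary between Anaplan Connect versions, so check the documentation for your release.

```shell
#!/bin/sh
# Sketch only - placeholder IDs and action names; verify flags against
# your Anaplan Connect version before use.

# --- Data Hub: upload a file, then run the import action ---
./AnaplanClient.sh \
  -certificate "path/to/cert.pem" \
  -workspace "WORKSPACE_ID" \
  -model "DATA_HUB_MODEL_ID" \
  -file "Transactions.csv" -put "/data/Transactions.csv" \
  -import "Import Transactions from CSV" -execute

# --- Spoke model: run the process that pulls from the Data Hub ---
./AnaplanClient.sh \
  -certificate "path/to/cert.pem" \
  -workspace "WORKSPACE_ID" \
  -model "SPOKE_MODEL_ID" \
  -process "Load from Data Hub" -execute
```

Wrapping each call in a scheduled batch job (cron on Linux, Task Scheduler on Windows) then automates both loads.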
Is there a way to import data with months in columns (the usual Excel table view)? Currently I have to transform the data so there is a row for each period in the upload file, but I am wondering if there is a better way to import the data as-is with the months in the columns. Please let me know what you think. Thank you.
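If you do need to reshape the file outside Anaplan, the row-per-period transform described above can be automated rather than done by hand. This is a hedged sketch using pandas with hypothetical column names ("Account", month headers):

```python
import pandas as pd

# Hypothetical "months in columns" table, as it would look in Excel.
wide = pd.DataFrame({
    "Account": ["Sales", "Cost"],
    "Jan-24": [100, 40],
    "Feb-24": [110, 45],
})

# melt() unpivots each month column into its own (Period, Value) row,
# producing one row per account per period - the upload layout.
long = wide.melt(id_vars=["Account"], var_name="Period", value_name="Value")

# 'long' now has 4 rows: Sales/Jan-24, Cost/Jan-24, Sales/Feb-24, Cost/Feb-24.
long.to_csv("upload.csv", index=False)
```

That said, it may also be worth checking the import mapping options in your module's import definition, as Anaplan imports can often map column headers to a dimension directly.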
Hello, I am receiving an 'Anaplan Upload Failed' status description when testing my integration with a BigQuery dataset. The integration imports data from BQ into our Anaplan model. No other details are given in the error log. I suspect that CloudWorks is not even picking up the file, but I am not sure what we did wrong in the set…