Anaplan data integration

Hi Team,

We have integrated SAP with Anaplan, and data flows are scheduled from SAP to the Anaplan data hub. However, we keep getting memory warnings for the workspace, which I am sure is related to workspace memory. From an architecture standpoint, how does Anaplan store data in the data hub? When we send 2-3 MB of data from SAP, how does it multiply within the data hub? Are there ways to optimize the dimensions of the data hub?

Also, from a training perspective for Anaplan customers, what is the best training to attend from a technical architecture, data hub, and modelling standpoint? When I log into the training academy I see several courses such as Model Builder Level 1, Model Builder Level 2, etc. I would appreciate any guidance on which training is best for getting a handle on the data hub and architecture, so that future rollouts are done in an optimal way without further memory consumption.

Thank you

Srivaths

Answers

  • @Srivaths123 

     

    Can you tell me the size of your data hub model?

    [Screenshot: Saibharadwaj_0-1597108346253.png]

    The screenshot above helps in understanding the size of the model.

     

    There are several ways to optimize it, but seeing the model helps (to check whether it was built following best practices). Alternatively, provide some additional details about where you specifically want to optimize: identify the list or module taking up the most space and share the data linked to that module or list, and then I can give better input.

     

    For training, if you are a customer, I would say start with Anaplan Essentials for Customers and then progress to Level 1 and Level 2 Model Building.

     

    For the data hub, the training is covered in Level 2 Model Building; an entire sprint is dedicated to the data hub, so you can refer to that.

     

    I recently came across this Best Practice article on the Data Hub, which I haven't read yet (https://community.anaplan.com/t5/Best-Practices/Data-Hubs-Purpose-and-Peak-Performance/ta-p/48866). Please have a look at it.

     

    Thanks & Regards,

    Sai Bharadwaj

    linkedin.com/in/sai-bharadwaj

     

     


  • Thank you so much for your inputs.

    The data hub size is around 115 GB per workspace.

  • Oh man! It seems like quite a big project. How many such workspaces are present?

    V.Sai Bharadwaj

    Connect on LinkedIn

  • @Srivaths123 

     

    Oops! That’s way too big. I hope your data hub is not built multidimensionally. How many data hubs do you have when you say "data hub per workspace"?

     

    Hi,

    We have multiple workspaces, one for prod and one for dev. Currently the prod workspace is at maximum memory and showing memory warnings.

     

    Thank you
    Srivaths

  • @Srivaths123 

     

    It may be worth looking at. Honestly, I haven't seen data hubs going beyond the 30-40 GB mark, but I trust there must be a huge amount of transactional data stored in your data hub.

     

     

  • Thank you, Misbah. We have master data as well as transaction data coming from the source.

     

    Thank you
    Srivaths

  • @Srivaths123 You may want to look at ways to optimize the Data Hub.

    A few things to consider:

    1) How much historical data is actually needed for planning purposes? Identify those modules and use the Time Range feature (Model Calendar) to reduce the transaction data module size (see the rough sizing sketch after this list).

    2) Check whether the modules with the highest cell counts can be limited using list subsets.

    3) Look at the possibility of creating lists based on the transaction data and using those lists in the modules to reduce sparsity.
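
    As a rough illustration of points 1 and 2: workspace consumption is driven by cell count, which is the product of every dimension applied to a module, at roughly 8 bytes per cell (the figure commonly quoted for Anaplan workspace sizing). The sketch below is only illustrative; the module shape, dimension sizes, and line item count are assumptions, not your actual data.

    ```python
    # Rough workspace-size estimate for a transaction module in the data hub.
    # Assumption: ~8 bytes per cell; all dimension sizes below are made up.
    BYTES_PER_CELL = 8

    def module_size_gb(*dimension_sizes: int, line_items: int = 1) -> float:
        """Cell count is the product of every dimension applied to the module."""
        cells = line_items
        for size in dimension_sizes:
            cells *= size
        return cells * BYTES_PER_CELL / 1024**3

    # 500k SKU/customer combinations x 5 years of weeks (260) x 4 line items
    full_history = module_size_gb(500_000, 260, line_items=4)

    # Same module after a 2-year Time Range (104 weeks) and a list subset
    # keeping only the 200k combinations that actually trade.
    trimmed = module_size_gb(200_000, 104, line_items=4)

    print(f"Full history: {full_history:.1f} GB")   # ~3.9 GB
    print(f"Trimmed     : {trimmed:.1f} GB")        # ~0.6 GB
    ```

    Scaling that same arithmetic across many transaction modules is how a few megabytes of flat-file input from SAP can grow into tens of gigabytes of workspace.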

     

    Regards

    Ashish

     

  • Thank you Ashish.

    How do concurrent jobs load into the data hub during integration? Is it an issue if concurrent jobs load into the data hub (into the same table or different tables), and how does it behave? We have several objects integrating into the Anaplan data hub and I want to make sure I schedule them correctly. Since some jobs can run concurrently, is that an issue, or is Anaplan capable of handling concurrent jobs into the data hub model?

     

    Appreciate your inputs.

     

    Thank you

    Srivaths

  • @Srivaths123 Performance of the data load depends upon how optimal your data hub model design is. I would highly recommend going over the best practices for designing a data hub: https://community.anaplan.com/t5/Best-Practices/Data-Hubs-Purpose-and-Peak-Performance/ta-p/48866

     

    Generally, I design my data hub loads following best practice: schedule jobs to run at off-peak hours, split large volumes of data into smaller chunks, and load the data in sequence. This has given me better performance for data loads.

    Another practice you can follow is to load only incremental data, which reduces the volume of data processed and gives better performance. A rough sketch of the sequential, chunked approach is shown below.
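
    This is a minimal sketch of that sequential, chunked pattern. The run_anaplan_import() helper is a hypothetical stand-in for whatever mechanism actually triggers the import (Anaplan Connect, CloudWorks, or the REST Integration API), and the file names and chunk size are assumptions for illustration.

    ```python
    import csv
    from pathlib import Path

    CHUNK_ROWS = 100_000  # assumed chunk size; tune to your data volumes

    def split_into_chunks(source_file: Path, chunk_rows: int = CHUNK_ROWS):
        """Split a large extract into smaller CSV chunks that load faster individually."""
        with source_file.open(newline="") as f:
            reader = csv.reader(f)
            header = next(reader)
            chunk, chunk_paths = [], []
            for i, row in enumerate(reader, start=1):
                chunk.append(row)
                if i % chunk_rows == 0:
                    chunk_paths.append(_write_chunk(source_file, header, chunk, len(chunk_paths)))
                    chunk = []
            if chunk:
                chunk_paths.append(_write_chunk(source_file, header, chunk, len(chunk_paths)))
        return chunk_paths

    def _write_chunk(source_file: Path, header, rows, index: int) -> Path:
        out = source_file.with_name(f"{source_file.stem}_part{index}.csv")
        with out.open("w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(rows)
        return out

    def run_anaplan_import(file_path: Path) -> None:
        """Placeholder: call Anaplan Connect, CloudWorks, or the REST API here and
        wait for the import task to finish before returning."""
        print(f"Loading {file_path} ...")

    # Load each source object one after another, never concurrently, so the
    # data hub model only ever processes a single import at a time.
    for extract in [Path("sap_transactions.csv"), Path("sap_customers.csv")]:
        for chunk_path in split_into_chunks(extract):
            run_anaplan_import(chunk_path)
    ```

    The point of the loop is that the next chunk only starts once the previous import has returned, so the data hub model is never asked to process two loads at the same time.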

     

    Regards

    Ashish

  • @Srivaths123 We are also thinking of connecting SAP to Anaplan. How did you do it? Did you use middleware or a direct JDBC connection? Thanks in advance.