OEG Best Practice: Inside the Hyperblock

AnaplanOEG
edited February 2023 in Best Practices

We recommend this article to experienced Model Builders and Solution Architects.

What are blocks?

The Hyperblock is an in-memory calculation engine that can index and understand the dependencies between the model objects and calculations, based on the connections they share. When a user enters or changes a value, the Hyperblock understands which calculations need to be updated and in what sequence. In addition, the Hyperblock can perform calculations in parallel, meaning large, complex calculations can be performed within seconds.

The Hyperblock is built up from individual blocks, each of which contains the cells for one combination of custom lists, time periods, and/or versions. Each block can contain up to 2,147,483,647 (2^31 -1) cells at the most detailed level.
Cell count limit on line item blocks - Anaplan Technical Documentation.
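As a quick, purely illustrative sanity check (the dimension names and member counts below are hypothetical, and this is not an Anaplan feature), you can estimate whether the most detailed block of a proposed line item stays under that limit by multiplying the detailed member counts of the lists in its Applies To:

```python
# Illustrative only: check a proposed line item's most detailed block against
# the 2,147,483,647 (2**31 - 1) cell limit. The member counts are hypothetical;
# substitute the detailed member counts of your own lists.
CELL_LIMIT = 2**31 - 1

detailed_cells = 50_000 * 10_000 * 5   # e.g. products x customers x regions (detailed members)
status = "OK" if detailed_cells <= CELL_LIMIT else "exceeds the block cell limit"
print(f"{detailed_cells:,} cells: {status}")
```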

How blocks are calculated – General Lists

Within the Hyperblock, all calculations are done in blocks, at the line item level. Think of blocks as levels in a hierarchy: all detailed (lowest level) list members together form one block, while each summary/aggregated member forms its own individual block.

For example, if you have a line item dimensioned by a list consisting of seven levels, all members at the most detailed level (L7) together form a single block, each member in L6 forms its own block, each member in L5 forms its own block, and so on up to the very top-level member.

Level 7 block: Total blocks = 1

MarkWarren_0-1648723563074.png

Level 6 block: Total blocks = 131

MarkWarren_1-1648723563084.png

Level 2 block: Total blocks = 9

MarkWarren_2-1648723563088.png

As you go up the hierarchy, you can count the total number of blocks created by the line item.

MarkWarren_3-1648723563091.png
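To make the counting rule concrete, here is a minimal sketch (Python, purely illustrative; the helper function is hypothetical, not anything Anaplan exposes) of how a single general list contributes blocks to a line item:

```python
# Illustrative only: block count contributed by one general (custom) list.
# All detailed (leaf) members share a single block, and every aggregated
# member above the leaf level gets a block of its own.
def blocks_for_list(detailed_members: int, aggregated_members: int) -> int:
    # The number of detailed members does not change the count: however many
    # leaves there are, they all live in the same block.
    return 1 + aggregated_members

# Using the L7 list from the worked example later in this article:
# 631 members in total, 406 at the detailed level -> 225 aggregated members.
print(blocks_for_list(detailed_members=406, aggregated_members=631 - 406))  # 226 blocks
```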

When you combine multiple lists in a line item (for example, an Applies To of L7 Cost Center and L4 Geographies), it works in exactly the same manner: the detailed members of both L7 Cost Center and L4 Geographies together create one block, and then every aggregated member in each list creates its own block.

MarkWarren_4-1648723563093.png

How blocks are calculated – Native Time/Native Versions

Native Time and Native Versions are different from custom lists in that every detailed member is a block (Actual, Forecast, Budget, 2021 Jan, 2022 Feb, etc.), and so is every aggregated member (Quarter, Half Year, Full Year, All Periods). One of the reasons for this is to avoid circular references with time-based calculations, and to support Opening and Closing balances.

MarkWarren_5-1648723563095.png

This also allows the use of certain Time functions (NEXT, PREVIOUS, LAG, LEAD, OFFSET) as well as certain Version functions (NEXTVERSION, PREVIOUSVERSION).

To calculate the number of blocks, each member in the timescale must be accounted for.  For example, let’s say we have the following module:

  • Timescale: 3 years at the Week level
  • Versions: 4
  • List: L7 with 631 total members, but only 406 at the detailed level
  • Line items: 20, with summary turned on

Let’s break this down by lists:

  • Time: 214 blocks (71 periods per year × 3 years, plus 1 for All Periods)

MarkWarren_6-1648723563097.png

  • Versions: 4 versions = 4 blocks
  • L7:

MarkWarren_7-1648723563100.png

Each line item in this module will have 193,456 blocks. If you have 20 line items with the same dimensionality, you will have created 3,869,120 blocks.
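For anyone who wants to retrace those totals, the arithmetic is shown below (a sketch only; it simply reproduces the numbers above, with the per-dimension block counts multiplied together because a block exists for every combination):

```python
# Reproducing the worked example: native Time and native Versions contribute
# one block per member (detailed and aggregated alike), while the L7 custom
# list contributes one detail block plus one block per aggregated member.
time_blocks    = 71 * 3 + 1        # 71 periods per year x 3 years, + 1 for All Periods = 214
version_blocks = 4                 # 4 native versions = 4 blocks
l7_blocks      = 1 + (631 - 406)   # 1 detail block + 225 aggregated members = 226

blocks_per_line_item = time_blocks * version_blocks * l7_blocks
print(f"{blocks_per_line_item:,}")        # 193,456 blocks per line item
print(f"{blocks_per_line_item * 20:,}")   # 3,869,120 blocks across the 20 line items
```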

MarkWarren_8-1648723563104.png

If we use the same dimensionality but use a custom list to represent Time, the number of blocks will decrease. When using a custom Time list, all detailed members (in this case, the weeks) are contained in a single block.

MarkWarren_9-1648723563107.png

 

MarkWarren_10-1648723563110.png

Again, using the same dimensionality but substituting a custom Versions list (more information on custom versions) for Native Versions, while still using Native Time, the number of blocks decreases even more.

MarkWarren_11-1648723563114.png

And finally, if we use a custom list for both Versions and Time, the total number of blocks decreases even more, but the blocks themselves become very large because we are storing exactly the same amount of data.

MarkWarren_12-1648723563118.png

So, is having fewer blocks always the goal, and does it always correlate with a more performant model?

Hyperblock performance

The Hyperblock can perform calculations in parallel at the block level; it will calculate its blocks using multi-threading to create as many parallel calculations as possible based on cell count.
Note: Exceptions are the functions that run single-threaded: RANK, RANKCUMULATE, ISFIRSTOCCURRENCE, and CUMULATE with 3 parameters (using a list).
The more cells a block has, the more threading will be applied, up to the physical limitations of the CPU. If blocks have very complex or large calculations, those threads have more work to do and therefore take longer. This means that if we create blocks with too high a cell count and a complex formula, performance may be affected. However, the Hyperblock takes care of this in most scenarios by performing a selective calculation on only the cells affected by the change, instead of all the cells in the line item (and so on through the DAG, the Directed Acyclic Graph).
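As a toy illustration of what a dependency graph buys you (a generic sketch in Python with hypothetical line item names, not Anaplan's actual engine), an engine that knows which items reference which can recalculate only the items downstream of a change, in dependency order:

```python
# Toy illustration: a directed acyclic graph of dependencies lets an engine
# recalculate only the items downstream of a change, in dependency order,
# instead of recalculating everything.
from graphlib import TopologicalSorter

# item -> the items it depends on (its sources); names are hypothetical
depends_on = {
    "Revenue": [],
    "Costs": [],
    "Margin": ["Revenue", "Costs"],
    "Margin %": ["Margin", "Revenue"],
    "Headcount": [],                      # unrelated to the change below
}

def downstream_of(changed, depends_on):
    """Everything that directly or indirectly references a changed item."""
    affected = set(changed)
    grew = True
    while grew:
        grew = False
        for item, sources in depends_on.items():
            if item not in affected and any(s in affected for s in sources):
                affected.add(item)
                grew = True
    return affected

affected = downstream_of({"Costs"}, depends_on)
order = [i for i in TopologicalSorter(depends_on).static_order() if i in affected]
print(order)  # e.g. ['Costs', 'Margin', 'Margin %'] -- 'Headcount' is never touched
```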

The threading is based on cell count, so blocks with a very small cell count will not have as many threads applied to them. A block with a single cell, such as a top-level summary on a single dimension, would have no threading applied at all. If that cell must do a lot of work summarising values over a large dimension, it could take a long time to complete (because no threading is applied). This is why we have Planual rule 1.05-07: Avoid Top Level for large flat lists.

A general rule with Anaplan is that the duration of a calculation is proportional to the number of blocks. Put simply, the more work we have, the longer it takes. Within those blocks, some will take a lot longer to calculate than others (the larger blocks), but on the whole, when averaged out, this rule applies. The same holds for how complex or inter-connected the model is: the greater the number of connections among line items, the longer the time spent working through the DAG (the method Anaplan uses to determine a calculation chain).

Summary block counts

Line item summaries can dramatically increase the cell count of a line item and the block count.

Adding a summary to a line item can increase the work the Hyperblock needs to do, which leads us to Planual rule 2.03-01: Turn Summary options off by default.

Here is a quick example to illustrate why adding a summary generates so many blocks.
A module is dimensioned by Account and City lists and Versions. The lowest levels of City and Account form the biggest block, and there are separate blocks for the aggregated levels at Country, Region, and Channel in the two hierarchies, and finally for the two top levels of the hierarchies. As seen in this example, having a summary method applied increases the block count 12-fold for each native version defined.

MarkWarren_13-1648723563121.png

There are 11 summary blocks alongside the main Account/City block. When we add a version, we duplicate that many blocks for each version, so a second version doubles the number of summary blocks.
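A small arithmetic sketch of that effect (using only the numbers quoted above) shows how quickly summary blocks grow as native versions are added:

```python
# Minimal arithmetic from the example above: the Account/City line item has
# 1 main (leaf-level) block plus 11 summary blocks per version. Every native
# version repeats that whole set, so summary blocks grow linearly with versions.
main_blocks_per_version = 1
summary_blocks_per_version = 11

for versions in (1, 2, 3, 4):
    total = versions * (main_blocks_per_version + summary_blocks_per_version)
    summaries = versions * summary_blocks_per_version
    print(f"{versions} version(s): {total} blocks, of which {summaries} are summary blocks")
```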

A more detailed example of the impact of Versions on block count, many of which will be summary blocks, comes from the article To Version or Not to Version?

Should I worry about blocks?

No, the Hyperblock will handle calculations efficiently in most scenarios and create an efficient size and number of blocks. The building of a model should follow the PLANS methodology, with a lot of emphasis on the N: Necessary. Try to calculate as little as possible to achieve the desired outcome. To do this you will need to use the S in DISCO to create reusable System modules that avoid repeated calculations and potentially duplicated blocks. Simplify calculations so each block calculates quicker.

We often suggest splitting out complex calculations into new line items to reduce complexity. You will quite rightly argue that this creates more blocks, and more blocks mean more work to do. This is correct, but the benefit of simpler calculations is that the extra volume of blocks gets processed quicker than fewer, more complex blocks. Not all blocks are created equal!

Do worry about summary blocks though, and get into the habit of turning summaries off when you create a line item. This way they only get added when you know they are needed. Try to avoid top levels on lists where possible; by that we mean don't just add one by default to all lists, but consider whether it is actually needed.

A reason why we shouldn’t focus on just the block count

The thing to focus on is how the calculation performs when the complex formula is split out. If it takes less time, does it matter that more blocks are created? There’s always a balance to be struck between size and performance. As always, test to see if it has a benefit, and test in isolation where possible. Again, for an example see the article To Version or Not to Version?

That said, we should be cautious about the number of blocks or cells that some calculations have to work over, especially calculations that have to execute sequentially. POST, CUMULATE, LAG, LEAD, and OFFSET are examples of functions that operate over a timeline in a specific order, so having a lot of time periods means a lot of blocks to execute that calculation over.
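To see why the number of periods matters so much for these functions, here is a toy running-total sketch (Python, with hypothetical values; CUMULATE itself is an Anaplan function, not the code below). Each period's result needs the previous period's result, so the work is inherently ordered and grows with the length of the timeline:

```python
# Toy illustration (not Anaplan syntax): a CUMULATE-style running total.
# Each period's result depends on the previous period's result, so the work
# is inherently sequential and grows with the number of periods in the timescale.
def cumulate(values_by_period):
    result, running = [], 0.0
    for value in values_by_period:   # must walk the timeline in order
        running += value             # period n needs the result of period n-1
        result.append(running)
    return result

weekly_sales = [120.0, 95.0, 130.0, 110.0]   # hypothetical values, one per week
print(cumulate(weekly_sales))                # [120.0, 215.0, 345.0, 455.0]
```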

If summaries are also applied, they additionally execute across that time range after the cells calculate. To keep excess summary blocks in check, try to keep time ranges as small as possible, calculate over fewer time periods or at a coarser granularity (months rather than weeks), and avoid unnecessary historical data.

Another aspect to consider is dimension order: try to keep dimensions in similar orders between source and target. This aligns the block indexes so that data is read in a predictable sequence when the processor pre-fetches it into cache. Using the default, system-applied order of Applies To is the best way to do this. This is shown in more detail in the article Dimension Order - Anaplan Community.
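The cache effect behind this is easy to demonstrate outside Anaplan. The sketch below (NumPy, offered only as an analogy, not as Anaplan internals) reads the same data once in its stored order and once against it; the strided traversal is noticeably slower because the processor cannot prefetch effectively:

```python
# Analogy only: when data is laid out in a fixed order in memory, reading it in
# that same order is faster because the CPU prefetches contiguous data into cache.
import time
import numpy as np

a = np.zeros((4000, 4000))           # stored row-major (C order) by default

start = time.perf_counter()
for row in a:                        # traverses memory contiguously
    row.sum()
contiguous = time.perf_counter() - start

start = time.perf_counter()
for col in a.T:                      # each "column" strides across memory
    col.sum()
strided = time.perf_counter() - start

print(f"contiguous read: {contiguous:.3f}s, strided read: {strided:.3f}s")
```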

The key points are to understand what the calculation is doing, how it operates, and how many cells it will calculate over, so that you can make the right decision on what is absolutely necessary and ensure the minimum number of calculations is done. PLANS - This Is How We Model - Anaplan Community

Focus Points

  • Avoid unnecessary summaries
    • Turn off when creating line items
  • Reduce summaries
    • Smaller time-range
    • Fewer Periods
    • Flat lists
  • Use System modules to avoid repetition
    • Do calculations once, reference many times
  • Reduce complexity
    • Split out complexity, more but simpler blocks
  • Be careful with time series functions
    • Smaller time-range
    • Fewer Periods
    • Months not Weeks

Got feedback on this content? Let us know in the comments below.

Author Mark Warren.
Contributing author Rob Marshall.

Comments

  • PiotrWeremczuk
    edited November 2022

    Thanks Mark, that's very interesting, especially the part on Time related blocks... I've never thought about it this way but in fact using time functions may lead to some crazy 'daisy-chains' (as explained here: Planual Explained - Day 23 - Anaplan Community). Maybe it's not the most important thing in the article, but for me it's a big watch-out and another disadvantage of using built-in Time dimension.

  • rob_marshall
    edited November 2022

    @PiotrWeremczuk 

     

    I would be very careful in not using the built-in Time dimension as that removes all time related functions as well as increases the probability of circular references.  Really, the only time you should not use the built-in Time functionality is for reporting where the requirements state they need to see the data differently OR their FY year does not match up with native time.  With that said, all calcs still should use native time, and then you use a mapping module to display the results.

     

    Rob

  • MarkWarren
    edited November 2022

As Rob says, Time functions are very useful and should be used when needed. What we're saying is to be mindful of the number of periods they're used over. Your link of time functions to daisy chaining is incorrect; what that refers to is essentially the duplication of line items, copying the same value between line items in a sequence, instead of them all referring to the same single source.

  • andrewtye
    edited November 2022

    Took me a couple of swings at it...!

And avoiding unnecessary historical time periods... one person's unnecessary is another's critical.

  • Hayk
    edited November 2022

    Thanks for great insights. It's easier to understand and apply some best practices, when you know more about the "engine".

     

    Would like to see more articles like this!

  • rob_marshall
    edited November 2022

    @Hayk 

     

    Do you have some examples or topics you would like us to explore?  Trying to get some ideas.

     

    Rob

  • IDNovikov
    edited November 2022

    Thank you for the great article! It was super-helpful.

    I only wanted to clarify one important thing about multithreading.
    As it was written, the engine performs calculations in parallel at the block level. Blocks are formed by line items, versions, time and different levels of hierarchies... Are all blocks calculated one by one or some blocks can be calculated in parallel?

    For example, it seems to be logical to calculate all blocks that are formed by different time items (months/years) or different versions in parallel if there are no dependent calculations. But from the article it seems like Anaplan calculates blocks one by one. So there is no multithreading for blocks. Am I right?


    Does that mean that if we calculate something on days for 2 years, it might be faster to use custom time scale (for days)? (This looks like the same case that was described for the versions in the following article: link)

    I understand that we should use Anaplan time scale because of many purposes but it is more of a theoretical question that bothers me a lot.

     

    Thanks,

    Igor

  • rob_marshall
    edited November 2022

    @IDNovikov 

     

    Multi-threading happens within the block as well.

  • IDNovikov
    edited November 2022

    @rob_marshall Thank you for your reply!

    By this you mean that some blocks (that don't depend on each other) can be calculated in parallel, did I get correct? 

     

    Have 1 question more:

    Do users from Anaplan user list also form blocks? E.g. if there is user list and no other lists in the line item, user list includes 10 users. How many blocks will be created? 10 or 1?

    It is important for those cases when some calculations should be user-specific.

  • rob_marshall
    edited November 2022

    @IDNovikov 

     

    Yes, blocks are calculated in parallel.  Remember, blocks are defined at the line item level, not the module level.  Each line item has blocks associated with it and the number of parallel threads is determined by several things:

    • Function used...Rank, RankCumulate, Cumulate (using 3 parameters), and IsFirstOccurrence are single-threaded
    • If the target is less than 10,000 cells, the formula will be single-threaded

    And yes, if you are using the Users list in the Applies To for a line item, it will have 1 block since the Users list (currently) does not have a top level member.

     

    Hope this helps.

  • IDNovikov
    edited November 2022

    @rob_marshall , thank you for the explanation!

     

    I still can't figure out one thing, and I would really appreciate it if you could help me deal with it:

    If blocks are calculated in parallel then why is there the difference between custom and native versions performance described in To-version-or-not-to-version article? 
    There were 1053000 blocks for native versions and 21060 for fake versions. @DavidSmith wrote that each big block (created by fake versions) was split into 100 sub tasks but the small block (created by native versions) was split into 2 sub tasks. Thus, there are 2106000 subtasks for native versions and 2106000 for fake ones. So, the number of sub tasks is the same. After reading this research I believed that blocks are calculated one by one. Otherwise, that should not be such a big difference in performance.
    What could cause the difference in performance if different blocks can be calculated in parallel? 

  • rob_marshall
    edited November 2022

    @IDNovikov

    Things to consider:

    • blocks are defined at the line item level, not the module or model level. If blocks were not calculated in parallel, opening a model would take much longer because the model is doing a Full Calculation (calculating everything from building the lists, building the modules, building the line items, loading in stored data, and kicking off all formulas)
    • Don't confuse sub-tasks with the number of blocks. The number of sub-tasks is dependent on the cell count of the block.
    • With Native versions, while the number of blocks is more, the actual size of the block is smaller, thus fewer sub-tasks per block. At the leaf level, the fake versions uses 100 sub-tasks because the block cell count is larger. The native version uses 2 sub-tasks, 50 times.
    • Sub-tasks are "chunks" of the blocks that can be split out allowing for multiple sub-tasks to process. The larger the cell count, the more sub-tasks. Does that mean it is going to be faster? Not always because there is more to process. Can it be faster? Yes, that is why David, @MarkWarren , and I say it depends. And it is also why people should not fear sparsity (to a certain extent).
    • David's article is an extreme case showing 50 native versions in order to make a point
    • The picture in David's article about sub-tasks is one block only, at the leaf level. Not all blocks will have the same number of sub-tasks as the cell count decreases at the aggregate levels.

    Hope that helps.