New Line Item Text Format Type - SmallText

First, I should note that I do not know how the back end of Anaplan is built, so I am unsure how feasible or easy this idea would be to implement.

The idea borrows a concept from SQL Server. SQL Server has the following data types for storing integer values: bigint, int, smallint, and tinyint. The chart below shows the range of values and the storage requirement for each data type.

[Image: int data type sql.png, a chart of the SQL Server integer data types with their value ranges and storage sizes]

The concept here would be the same: make a new line item text format, SmallText, that limits text input to 60 characters in order to reduce memory/storage requirements and increase model performance compared with the traditional text format. Anaplan already enforces this limit for list member codes (codes cannot exceed 60 characters), so it would seem that the platform code for that feature, which is already in production, could be leveraged for the new SmallText line item format. I am sure that is an oversimplification of the engineering required, but hopefully the idea is clear.

While minimizing text formatted line items and their calculations is a best practice in Anaplan model building, some models have the need for using smaller text strings at scale.

Assumptions

Limiting the number of characters a text-formatted line item can accept will have a direct impact on the memory/storage requirements of that line item, thus increasing model performance.

Benefits

Performance increase across the entire platform for all models where line items can be converted from Text to SmallText.

Faster models
Quicker save state times
Decreased model sizes
Etc.

Comments

jprince

I've had requests on limiting the amount of text end users can enter into Anaplan; this can be a useful tool for some of our customers.

ben_speight

We currently allocate 8 bytes of workspace per text-formatted cell to hold a reference, and employ techniques like deduplication (multiple appearances of "ABC" in a detail line item can share the same representation, as it is immutable) and reuse (e.g. "A" + "" or UPPER("A") -> the same "A" supplied to it) internally to keep the memory used by the representations themselves to a reasonable level. Just adding limits would help Anaplan's resource management a bit, and prevent very long text cells from causing model performance issues, but if all text was already within bounds it would not change the performance characteristics.
We would need to think about how to deal with text that exceeds the limit. For detail data, either silent truncation or a data input / import error could result. If we introduce/change the limit on an existing line item, would we first check that all values were within limits? For calculated data, it would be difficult to do anything other than silently truncate values that exceed bounds - and users would not be able to discern whether or not this had happened unless we added additional support.
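To make the over-limit choices above concrete, here is a minimal Python sketch of the two behaviours (silent truncation versus an input error), plus the kind of pre-check that could run before a limit is applied to an existing line item. This is purely illustrative and not how Anaplan works internally: the names `SMALL_TEXT_LIMIT`, `TextLimitError`, `apply_limit`, and `all_within_limit` are hypothetical, and length is counted in Python characters, which may not match however the platform would count.

```python
from typing import Iterable, Literal

SMALL_TEXT_LIMIT = 60  # the limit proposed in the original post


class TextLimitError(ValueError):
    """Hypothetical error for text that exceeds the limit."""


def apply_limit(
    value: str,
    limit: int = SMALL_TEXT_LIMIT,
    policy: Literal["truncate", "reject"] = "reject",
) -> str:
    """Return value unchanged if it fits; otherwise truncate or raise."""
    if len(value) <= limit:
        return value
    if policy == "truncate":
        # Silent truncation: the caller cannot tell that data was lost.
        return value[:limit]
    raise TextLimitError(f"text has {len(value)} characters, limit is {limit}")


def all_within_limit(values: Iterable[str], limit: int = SMALL_TEXT_LIMIT) -> bool:
    """Pre-check before introducing or lowering a limit on existing data."""
    return all(len(v) <= limit for v in values)
```

The "reject" policy only makes sense where there is someone to report the error to (data entry or import); for calculated values the sketch would have to fall back to "truncate", which is exactly the visibility problem raised above.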
If we went further (and I have understood the proposal correctly), we could technically store the representations inline instead of referencing them indirectly. This would help resource allocation, but instead of 8 bytes per cell Anaplan would need to count 2 * (1 + max length) bytes per cell, as we could not achieve the optimisations mentioned earlier. Some things like calculations could get an improvement in performance, but every kind of text calculation/function would need to be re-implemented to get that improvement, and conversions between non-limited and limited text could easily negate it.
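As a back-of-envelope illustration of those per-cell figures, the Python sketch below compares the 8-byte reference with the inline formula 2 * (1 + max length) for a 60-character limit. It only reproduces the arithmetic quoted above; the function names and the one-million-cell example are assumptions, and the 8-byte figure excludes the memory held by the shared representations themselves.

```python
REFERENCE_BYTES_PER_CELL = 8  # per-cell reference size quoted above


def inline_bytes_per_cell(max_length: int) -> int:
    """Per-cell count if text is stored inline: 2 * (1 + max length)."""
    return 2 * (1 + max_length)


def counted_bytes(cells: int, bytes_per_cell: int) -> int:
    """Total bytes counted for a line item with the given cell count."""
    return cells * bytes_per_cell


def distinct_representations(values: list[str]) -> int:
    """With deduplication, equal strings share a single representation."""
    return len(set(values))


if __name__ == "__main__":
    cells = 1_000_000  # hypothetical line item size
    by_reference = counted_bytes(cells, REFERENCE_BYTES_PER_CELL)
    inline = counted_bytes(cells, inline_bytes_per_cell(60))
    print(by_reference)           # 8000000
    print(inline)                 # 122000000
    print(inline / by_reference)  # 15.25

    # Four cells, but only two stored representations when deduplicated.
    print(distinct_representations(["ABC", "ABC", "ABC", "XYZ"]))  # 2
```

With a 60-character limit, each inline cell counts as 122 bytes, roughly 15 times the size of a reference, and inline storage gives up the sharing shown in the last line.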
Supporting lower-precision/range numeric data would require much less work and add more value.
However, if the required behaviour can be pinned down, there are benefits in just imposing a limit - for example, where data will be fed into external systems that themselves impose similar limits.

MarkWarren

From my analysis of models across the platform, most text is below 60 characters anyway.
Now I can't measure the length of every cell, so this is an average of text lengths in cells.
But in most models it is the volume of text rather than the length that is the problem, so we, as a team and as a community, need to look at strategies for avoiding using text, showing ways to model that don't require it.

erobbins

@ben_speight, thank you for the well-thought-out response from the engineering perspective. This proposal assumes that text is already within the proposed bounds, so I would like to steer the discussion toward your idea of storing text inline instead of referencing it indirectly. If this idea (and any others like it) has the potential to increase calculation performance, it would be good to know to what extent for models with low, medium, and high amounts of text and text line item calculations.

Even if the text calculations/functions would need to be re-implemented to get the improvement, I think it would potentially be worth it given how costly text is to models at scale. For all of the use cases that I can think of, I would not see a need to convert between limited and non-limited text, which would lock in the performance improvements.
