Ways to identify duplicates for list import other than RANK or ISFIRSTOCCURRENCE?

tharuns98 · January 2022

Just wondering: is there a way to find duplicates when the cell count is more than 50 million? I know it is possible in the Polaris engine. Is there any other way to do it in the Classic engine?

M.Kierepka · January 2022

I'd say the real problem is that you frequently need to import more than 50 million items into a list. Consider rethinking your process and whether importing that many items is always necessary. It's also worth checking the source of the data: you may be able to filter out duplicates there.
For an infrequent import (especially an admin-only one), if it doesn't cause incorrect results, I'd simply not worry about the warning.
If you really need a solution in Anaplan, the only one I can think of is to add subsets to the list based on some property of the item name (such as its first character or its length). That should let you split the items into buckets of fewer than 50 million items each and then use ISFIRSTOCCURRENCE() on each bucket.
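To illustrate why the bucketing idea works, here is a minimal Python sketch of the logic (not Anaplan code). The bucketing rule (first character plus name length) and the function names are assumptions made for the example. The key point is that identical names always fall into the same bucket, so a first-occurrence check per bucket gives the same answer as one check over the whole list.

```python
from collections import defaultdict


def bucket_key(name: str) -> str:
    """Hypothetical bucketing rule: first character plus name length."""
    first = name[0] if name else ""
    return f"{first}|{len(name)}"


def flag_first_occurrences(names):
    """Return one boolean per name: True where the name appears for the first time.

    Each bucket keeps its own 'seen' set, mimicking a separate
    first-occurrence check per subset. Identical names always share
    a bucket, so the per-bucket result equals the global result.
    """
    seen_by_bucket = defaultdict(set)
    flags = []
    for name in names:
        seen = seen_by_bucket[bucket_key(name)]
        flags.append(name not in seen)  # False marks a duplicate
        seen.add(name)
    return flags


names = ["A100", "B200", "A100", "A10", "B200", "C3"]
flags = flag_first_occurrences(names)
duplicates = [n for n, is_first in zip(names, flags) if not is_first]
```

In practice the rule would need to be chosen so that every bucket stays under the 50 million limit; a single property such as the first character may not split the data evenly enough.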

M.Kierepka · January 2022

It depends on the situation, the format of the value being analyzed, and the type of result you need. For example, if you just want to know whether there are duplicates among values formatted as list items, and which items are duplicated, you can add a dummy line item with the formula 1 and then aggregate it with [SUM:] by the list(s) you expect multiple values for. A result of 1 for an item means no duplicates, a result greater than 1 tells you how many times it occurs, and 0 means it does not occur at all.
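The counting approach can be sketched in Python as follows. This only shows the logic (a constant 1 per record, summed per item), not Anaplan syntax, and the function names and sample values are made up for the example.

```python
from collections import Counter


def occurrence_counts(values):
    """Sum a constant 1 per record, grouped by item."""
    return Counter(values)


def duplicated_items(values):
    """Return only the items that occur more than once, with their counts."""
    counts = occurrence_counts(values)
    return {item: n for item, n in counts.items() if n > 1}


records = ["P1", "P2", "P1", "P3", "P1"]
counts = occurrence_counts(records)
dupes = duplicated_items(records)
```

Here `counts["P2"]` is 1 (no duplicates), `counts["P1"]` is 3 (three occurrences), and an item that never appears, such as `counts["P9"]`, gives 0.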

tharuns98 · January 2022

What can be done when creating the list to avoid the warnings on import?
