Ways to identify duplicates for list import other than RANK or ISFIRSTOCCURRENCE?

tharuns98
New Contributor

Is there a way to find duplicates when the cell count is more than 50 million? I know it is possible in the Polaris engine. Is there any other way to do it in the Classic engine?

 

M.Kierepka
Certified Master Anaplanner

Re: Ways to identify duplicates for list import other than RANK or ISFIRSTOCCURRENCE?

It depends on the situation, the format of the value being analyzed, and the type of result you need. For example, if you just want to know whether there are duplicates among values formatted as list items, and which items are duplicated, you can add a dummy line item with the formula "1" and then aggregate it with [SUM:] by the list(s) you expect multiple values for. A result of 1 for an item means no duplicates, a result greater than 1 tells you how many occurrences there are, and 0 means the item does not occur at all.
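
To make the counting idea concrete, here is a minimal sketch in Python rather than Anaplan formula syntax. It only illustrates the logic of giving every row a dummy 1 and summing by key; the sample codes and function names are made up for the example.

```python
from collections import Counter

def occurrence_counts(values):
    # Same idea as a dummy "1" on every row, summed by the key:
    # each key ends up with the number of rows that carry it.
    return Counter(values)

def duplicated(values):
    # Keep only the keys that occur more than once.
    return {key: n for key, n in occurrence_counts(values).items() if n > 1}

# Hypothetical sample data
rows = ["A001", "B002", "A001", "C003", "A001"]
counts = occurrence_counts(rows)
dups = duplicated(rows)
```

Here `counts["B002"]` is 1 (no duplicate), `counts["A001"]` is 3 (duplicated, with the number of occurrences), and a key that never appears counts as 0.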
tharuns98
New Contributor

Re: Ways to identify duplicates for list import other than RANK or ISFIRSTOCCURRENCE?

What can be done when creating the list to avoid the warnings on import?

M.Kierepka
Certified Master Anaplanner

Re: Ways to identify duplicates for list import other than RANK or ISFIRSTOCCURRENCE?

I'd say the real problem is that you need to frequently import more than 50 million items into a list. Consider rethinking your process and whether it is always necessary to import so many items. It's also worth checking the source of this data to see whether duplicates can be filtered out there.
For an infrequent import (especially if it's admin-only), if it doesn't cause incorrect results, I'd just not worry about this warning.
If you really need a solution in Anaplan, the only one I can think of is to add subsets to this list based on some property of the items (such as the first character or the length of the item name). That should allow you to split the items into buckets, each with fewer than 50 million items, and then use ISFIRSTOCCURRENCE() within each bucket.
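
As an illustration of filtering at the source, here is a rough Python sketch that drops repeated keys from a CSV extract before it is imported. The column name `code` and the sample rows are assumptions for the example, and keeping every key in an in-memory set may not suit every environment at this volume.

```python
import csv
import io

def dedupe_rows(reader, key_column):
    # Yield only the first row seen for each key, preserving input order.
    seen = set()
    for row in reader:
        key = row[key_column]
        if key not in seen:
            seen.add(key)
            yield row

# Hypothetical extract; in practice this would be the source file
source = io.StringIO("code,name\nA001,Widget\nB002,Gadget\nA001,Widget\n")
unique_rows = list(dedupe_rows(csv.DictReader(source), "code"))
```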
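
The bucketing idea can be sketched in Python (not Anaplan syntax) as follows. It assumes the first character of the item name is the bucketing rule, which is just one of the options mentioned above. The approach relies on identical names always landing in the same bucket, so a first-occurrence check per bucket gives the same result as one check over the whole list.

```python
from collections import defaultdict

def bucket_key(name):
    # Hypothetical bucketing rule: first character of the item name
    return name[:1]

def first_occurrence_flags(names):
    # Split the items into buckets, remembering each item's original position.
    buckets = defaultdict(list)
    for idx, name in enumerate(names):
        buckets[bucket_key(name)].append((idx, name))

    # Flag the first occurrence of each name, one bucket at a time.
    flags = [False] * len(names)
    for members in buckets.values():
        seen = set()
        for idx, name in members:
            if name not in seen:
                seen.add(name)
                flags[idx] = True
    return flags

# Hypothetical sample data
names = ["apple", "banana", "apple", "avocado", "banana"]
flags = first_occurrence_flags(names)
```

Items whose flag is False are the duplicates. Each bucket is processed on its own, so no single check has to cover the full list.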

Accepted solution