keeping last duplicate in numbered list

Highlighted
Super Contributor

keeping last duplicate in numbered list

hi community,

transactions is a numbered list and module has 3 line items, 2 are text and 1 is number formatted

Kanishq17_0-1591285714262.png

so the ask is to find duplicate combination of customer, prod and value in the list and in case of match , keep the last item and delete the first item 

Kanishq17_1-1591285880012.png

here, 1 and 4 are duplicate , 3 and 6 are duplicate , but we need to keep 4 and 6 and delete 1 and 3 respectively,

 

any help will be appreciated

9 REPLIES 9
Highlighted
Certified Master Anaplanner

Re: keeping last duplicate in numbered list

Guessing the answer would be no, its always worth asking if we could bring the data into Anaplan the way that we want it. 

 

If not, then I think there are a couple of ways to achieve this. Here is one of those examples:

 

in your 'transaction det' module I would create a line that concatenates the cust, prod, and value. Then we will create a new list and import the first occurrence of each into that list. Then we want to get a line item that has our new list. In my example I am doing this in my Concat List line item:

jasonblinn_0-1591287342907.png

jasonblinn_1-1591287374789.png

 

From there we need to find the maximum number of occurrences that each unique concatenation has. 

 

I have done this in 2 steps:

Cumulate: CUMULATE(1, FALSE, Numbered List)  This formula is just counting the total number of items within the list and counting from 1 to however many items you have. 

jasonblinn_2-1591287630727.png

And secondly, we find the max number of the cumulate against our concatenated list dimension that we created earlier. 

jasonblinn_3-1591287700217.png

 

Now that we know the Max of each unique concatenation, then we can compare the cumulate line to the max line to determine what we should delete.

jasonblinn_4-1591287762796.png

There may be a more efficient solution out there in terms of calculations/being less process heavy, but the is one option that would get the job done. 

 

Jason

 

 

 

 

 

Highlighted
Super Contributor

Re: keeping last duplicate in numbered list

hi @jasonblinn ,

 

thanks for the solution

just one thing , so you created a separate list of all the unique concatenations and in the main module, you found max position of each unique concatenation , will these work in case of if here are more than 2 duplicates?

 

 

Highlighted
Certified Master Anaplanner

Re: keeping last duplicate in numbered list

Yes, since the cumulate will continually count (Up to 50 million) then when there are more than two transactions that have the same concatenation then it would remove all except the last one.

 

Using the same example, I just added lines 8-11 and copied the cust, prod and value from #1. As you can see, all of the transactions that match this data except for #11 have been flagged for delete (#1, #4, #8, #9, #10):

 

jasonblinn_0-1591292660320.png

 

Jason

 

Highlighted
Super Contributor

Re: keeping last duplicate in numbered list

hi @jasonblinn 

 

This is great,

 

Thanks a lot

 

 

Highlighted
Certified Master Anaplanner

Re: keeping last duplicate in numbered list

HI @Kanishq17 @jasonblinn 

 

Below is the quick solution for the same

 

Created the module same Transaction Det

 

NakulAggarwal_0-1591289894454.png

Additionally Created Three line item :-

 

First Line item to concatenate the data for which we want to make the check.

 

Second Boolean to check if it is the First Occurrence in the Data

 

Third Line item Not First Occurrence which you can use to delete the numbered list.

 

Below is the Blueprint of the same.

 

NakulAggarwal_1-1591290034397.png

Hope this resolves your issue

 

Thanks,

N

 

Thanks,
Nakul
Highlighted
Certified Master Anaplanner

Re: keeping last duplicate in numbered list

@NakulAggarwal 

 

I thought about doing the same, but I got stuck on the point that there could probably be more than 2 occurrences of the same concatenation. If there were 3 occurences then we would want to delete the first 2, and by using the first occurrence then we would only delete 1. At least this was my assumption. If we can assume that we are adding one record each time the import is run, then I agree that your solution is much faster and easier!! If it is possible that more than one record gets imported with multiple instances of the same concatenation then the delete action would need to get ran an undefined number of times to clear all of the not first occurrences until we get to where the is only one of each remaining, or you would need to use a different solution that could clear them all at once.

 

Jason

Highlighted
Super Contributor

Re: keeping last duplicate in numbered list

hi @NakulAggarwal 

 

so this is  giving the latest duplicate to be deleted i.e 4 and 6, but we want 1 and 3 to be deleted, this was the first thing i thought of , 

 

thanks 

Highlighted
Master Anaplanner/Community Boss

Re: keeping last duplicate in numbered list

HI @Kanishq17,

Very interesting post and would request you to provide what makes the difference of deleting the duplicates either first or last or in between from a business standpoint. Are we capturing the date and timestamp? I am just curious to know! Could you help?

Here's the solution that I have without creating a list. 

Create 3 line items. I have separated the logic for your understanding. You can shrink it based on your convenience as you might face size constrain because of the transaction list.

image.png

 Result:

image.png

 So with the help of Delete? you can achieve your result. Note: Rank function has a cell limit of 50 million.

I hope this solves your request!

Regards,
Kavin.
Highlighted
Super Contributor

Re: keeping last duplicate in numbered list

hi @kavinkumar 

 

Thanks for the solution,

Actually  it was given to us as a poc to do and in actual requirement, it was on base of these properties and date of loading the transactions, so yes date of loading is recorded and in case multiple duplicates are loaded in that same date, we had to to try to keep the last added item and delete early entries, not on time but on date basis