Removing Duplicates with custom conditions

Hi

I have a requirement to delete duplicates with some custom conditions. Consider the Element "occurrence" in below image for your reference, I need a formula to automate this boolean element. Conditions are as follows

1.First two elements with first data "11" has Second Data "0" and "1" , In such a case , we need to check the one with "0".

2. Next three elements have similar conditions , hence we need to check both elements with value "0".

3. Next three elements with "13" has all "0" s in Second Data , hence any two can be checked.

4. All the other elements have unique values for "First Data" and hence all these needs to be unchecked..

 

Assumptions

There is at most one element with a particular value of "First Data' having "second Data" as "1".

 

 

 

 

Screenshot.PNG

Any help will be appreciated

Thanks and Regards

Aakash Sachdeva

Best Answer

Answers

  • Hi Aakash,

     

    Try this,

     

    Step 1: Create a list called "First Data" and populate this list from column 1(First Data) of Module.\

                Create a import action and use it.

    3.png

    Step 2: Create below line items and use the formula given below

    1.png2.png

     

    Let me know if i understood incorrectly.

     

    Regards,

    Vignesh M.

  • @VIGNESH.M 

     

    Let me share my thoughts here.

     

    It might not work when there are more than 3 entries in first data column when the expectation is to take only first two into consideration - your approach will take n-1 entries into consideration for n entries.

     

    Please correct me if I got it wrong

    Thanks,

    Miz

  • @Misbah, the solution will work for more than 3 entries. My understanding is that, there shouldn't be any duplicate following custom condition and it's not like we should consider first 2 entries only. Please check the screen shot provided in the solution.

  • @VIGNESH.M 

     

    I think I was not able to put down my thoughts effectively, hence sharing the screenshot

     

    clipboard_image_0.png

    Mimicked your solution here.

    If you change Old East's (Row 2) value from 11 to 12, see what happens to the Boolean checks - it checks 3 boxes of First data 12 when it should have stopped at 2 of the boxes (Which was the expectation)

     

    Thanks,
    Misbah

     

    Hope 

  • @Misbah , here the request is remove duplicates, it's not like we have to take only first 2 entries data. 

    N-1 records should be deleted from N records, this is the expectation.

    @aakash , Can you confirm whether the solution work or not.

     

    ~Vignesh M

     

  • Hi @Misbah , @VIGNESH.M 

     

    I appreciate your efforts and Vignesh your approach will work, but In this case I am looking for just a formula based solution, I can implement this with import easily as you suggested . I would appreciate if anyone can provide any formula based solution. 

     

    Thanks for your help

    Aakash Sachdeva

  • @VIGNESH.M

     

    What threw me off was Aakash mentioning the condition that any two can be checked (Condition number 3).

     

    clipboard_image_0.png

     

    Anyway if Aakash believes that it exactly fits his requirement then case closed. Awesome & Good Job.

     

    Thanks,

    Misbah

  • @aakash ,

     

    If I am understanding this correctly, you can concatenate the fields you need and encapsulate them within the IsFirstOccurrence() formula as the first parameter.  This should get you the same result without using the Rank function.

     

    Hope this helps,

     

    Rob

  • Hi all

    @VIGNESH.M 

    Thank you so much for the efforts , both your solutions work perfectly.

    @Misbah 

    Thanks , the case you pointed out is working perfectly as I required , the requirement is to remove N-1 entries always.

    @rob_marshall 

    Thanks , ISFIRSTOCCURRENCE was the first thing that I thought about while implementing this, I have already tried concatenating the two columns and using ISFIRSTOCCURRENCE  but it fails for a specific case, e.g I have first data as 14 and second data as 0 , 14 is not repeated anywhere in the First Data, hence 14-0 should be unchecked. Next consider the combination as 15-0 and 15-1 , in this case both are occurring only one time, but the one with "0" should be checked. Hence I could not use that Function.

     

    Let me re explain the requirements in a generic manner that will help you understand my case more clearly.

    1. I need to remove duplicates so that one value should occur only once in "First Data" column.

    2. In case of repetition in first data column, consider second data, if any of them contains a non zero value(for same value of FirstData) , that one should stay and other one should be deleted. (only one of them will contain a non zero value at a time)

    3. If in second data , all contains only "0" s , then I need to keep one of them and remove others.

     

    All the requirements have been satisfied with @VIGNESH.M 's solution.

     

    Thanks and Regards

    Aakash Sachdeva