This forum is now read-only. Please use our new forums at discuss.codecademy.com.

1595 points
50faf88e46ce3ec6a6000c33_424995552
Submitted by
Ivan Chernikov
over 6 years ago

Effective way to remove duplicates from an original list while (backward) iterating

With help of the link I found rather effective solution (by both performance and memory utilization):

def my4_remove_duplicates(lst):
    lst.sort()
    i = len(lst) - 1
    while i > 0:  
        if lst[i] == lst[i - 1]:
            lst.pop(i)
        i -= 1
    return lst

8 votes

permalink

Here is my answer, I ended up using "not in"

def remove_duplicates(numbers):
    newlist = []
    for number in numbers:
       if number not in newlist:
           newlist.append(number)
    return newlist
remove_duplicates([1,2,3,4,5,5,5,6,3,2])

501 points
51b557b08c1ccc4ff601f672_495647278
Submitted by
Christopher Ciampoli
almost 6 years ago


4 votes

permalink

Alternatively, use sets. Sets can't contain duplicate elements.
Convert the original list to a set and back to a list ;)

def remove_duplicate(alist):
    return list(set(alist))

2333 points
E8a5814dd58c8afbc2af00091dfd6cfa?s=140&d=retro
Submitted by
Klaus Warzecha
over 6 years ago

1 Comment

50faf88e46ce3ec6a6000c33_424995552 Ivan Chernikov over 6 years ago

Thanks, Klaus. But it is too easy to use a built-in Python function/constructor set(). The more, the current Python course of CA does not teaches us any set types. I think we should find a solution ourselves as much as possible to get our brains switching on ;-). One of the aim of my example is to show a way modifying iterable object/list on the fly, not just to get one line solution. Look at the thread http://www.codecademy.com/ru/forum_questions/5055ac0237e190000201c873


1 vote

permalink

This is the refined (and wrong, as @jwoody has noticed :-) version of @jwoody

def refined_remove_duplicates(yourList):
    y = len(yourList) - 1
    while y > 0:
        num_of_duplicates = yourList.count(yourList[y])
        if num_of_duplicates > 1:
            yourList.remove(yourList[y]) """ this line would be correct if there was a removeall() function in the Python library, or if a programmer realized it  :-) """
            y -= num_of_duplicates
        else:
            y -= 1
    return yourList

To eliminate exhaustive computing with count() function (in case of presence of duplicates) we may do as follows (but there may be a memoization in the original @jwoody's version, by now I am not a competent though):

def refined2_remove_duplicates(yourList):
    y = len(yourList) - 1
    while y > 0:
        num_of_duplicates = yourList.count(yourList[y])
        while num_of_duplicates > 1:
            yourList.remove(yourList[y])
            num_of_duplicates -= 1
            y -= 1
        y -= 1
    return yourList

Below is the exact @jwoody version in that here it calls count() function in each while loop's iteration but without extra/inner while loop:

def refined_remove_duplicates(yourList):
    y = len(yourList) - 1
    while y > 0:
        if yourList.count(yourList[y]) > 1:
            yourList.remove(yourList[y])
        y -= 1
    return yourList 

1595 points
50faf88e46ce3ec6a6000c33_424995552
Submitted by
Ivan Chernikov
over 6 years ago

4 Comments

713cfb2d0ff47287d0afc30eaaab9275?s=140&d=retro jwoody over 6 years ago

I don't think this will work. With a list like this, myList = [1, 1, 2, 1, 3, 4, 1, 3, 2, 6]
I get this [1, 1, 4, 1, 3, 2, 6]

50faf88e46ce3ec6a6000c33_424995552 Ivan Chernikov over 6 years ago

Oh my God! :-) You are absolutely right! I have forgotten that method remove(x) removes only the first occurrence of x, not all of them. But even if it was so, my version would not work correctly, it would simply delete all uniques of amount > 1 :-) In any case there is no sense in inner while loop, that was that I wanted to accent to. Here is the correct example http://labs.codecademy.com/9zd/2#:workspace

713cfb2d0ff47287d0afc30eaaab9275?s=140&d=retro jwoody over 6 years ago

ok! That works. Thanks for the correction. I guess I'm over thinking the assignments. Not the first time I've done that.

50faf88e46ce3ec6a6000c33_424995552 Ivan Chernikov over 6 years ago

If you don't mind, I've made some correction to my post and to the example http://labs.codecademy.com/9zd/3#:workspace


0 votes

permalink

this is my version:

def remove_duplicates(yourList):
y = len(yourList) - 1
while y > 0:

while yourList.count(yourList[y]) > 1:
yourList.remove(yourList[y])
y -= 1
y -= 1
return yourList

309 points
713cfb2d0ff47287d0afc30eaaab9275?s=140&d=retro
Submitted by
jwoody
over 6 years ago

2 Comments

50faf88e46ce3ec6a6000c33_424995552 Ivan Chernikov over 6 years ago

The more duplicates are in yourList the less performance your algorithm produces. There are the count() method and inner while loop, which consume more time to compute. Think of the following http://labs.codecademy.com/9zd/2#:workspace

50faf88e46ce3ec6a6000c33_424995552 Ivan Chernikov over 6 years ago

As for calling count() in each iteration, perhaps there is no extra computing, because in each next iteration count() "looks through" not the whole yourList, but through the result of count() function of previous iteration. It is so called "memoization" http://en.wikipedia.org/wiki/Memoization. But I am not sure that this is the case.