Delete Icon
This forum is now read-only. Please use our new forums at discuss.codecademy.com.
0 points
Submitted by Ivan Chernikov
8 years ago

Effective way to remove duplicates from an original list while (backward) iterating

With help of the link I found rather effective solution (by both performance and memory utilization):

def my4_remove_duplicates(lst):
    lst.sort()
    i = len(lst) - 1
    while i > 0:  
        if lst[i] == lst[i - 1]:
            lst.pop(i)
        i -= 1
    return lst

Answer 51b52b7b7c82ca607c01d5dc

8 votes

Permalink

Here is my answer, I ended up using "not in"

def remove_duplicates(numbers):
    newlist = []
    for number in numbers:
       if number not in newlist:
           newlist.append(number)
    return newlist
remove_duplicates([1,2,3,4,5,5,5,6,3,2])
501 points
7 years ago

Answer 50c5cdadc8fd23f81e001884

4 votes

Permalink

Alternatively, use sets. Sets can’t contain duplicate elements. Convert the original list to a set and back to a list ;)

def remove_duplicate(alist):
    return list(set(alist))
2333 points
Submitted by Klaus Warzecha
8 years ago

1 comments

Ivan Chernikov 8 years ago

Thanks, Klaus. But it is too easy to use a built-in Python function/constructor set(). The more, the current Python course of CA does not teaches us any set types. I think we should find a solution ourselves as much as possible to get our brains switching on ;-). One of the aim of my example is to show a way modifying iterable object/list on the fly, not just to get one line solution. Look at the thread http://www.codecademy.com/ru/forum_questions/5055ac0237e190000201c873

Answer 50ebff9a89df20955f0025cb

1 vote

Permalink

This is the refined (and wrong, as @jwoody has noticed :-) version of @jwoody

def refined_remove_duplicates(yourList):
    y = len(yourList) - 1
    while y > 0:
        num_of_duplicates = yourList.count(yourList[y])
        if num_of_duplicates > 1:
            yourList.remove(yourList[y]) """ this line would be correct if there was a removeall() function in the Python library, or if a programmer realized it  :-) """
            y -= num_of_duplicates
        else:
            y -= 1
    return yourList

To eliminate exhaustive computing with count() function (in case of presence of duplicates) we may do as follows (but there may be a memoization in the original @jwoody’s version, by now I am not a competent though):

def refined2_remove_duplicates(yourList):
    y = len(yourList) - 1
    while y > 0:
        num_of_duplicates = yourList.count(yourList[y])
        while num_of_duplicates > 1:
            yourList.remove(yourList[y])
            num_of_duplicates -= 1
            y -= 1
        y -= 1
    return yourList

Below is the exact @jwoody version in that here it calls count() function in each while loop’s iteration but without extra/inner while loop:

def refined_remove_duplicates(yourList):
    y = len(yourList) - 1
    while y > 0:
        if yourList.count(yourList[y]) > 1:
            yourList.remove(yourList[y])
        y -= 1
    return yourList 
1595 points
Submitted by Ivan Chernikov
8 years ago

4 comments

jwoody 8 years ago

I don’t think this will work. With a list like this, myList = [1, 1, 2, 1, 3, 4, 1, 3, 2, 6] I get this [1, 1, 4, 1, 3, 2, 6]

Ivan Chernikov 8 years ago

Oh my God! :-) You are absolutely right! I have forgotten that method remove(x) removes only the first occurrence of x, not all of them. But even if it was so, my version would not work correctly, it would simply delete all uniques of amount > 1 :-) In any case there is no sense in inner while loop, that was that I wanted to accent to. Here is the correct example http://labs.codecademy.com/9zd/2#:workspace

jwoody 8 years ago

ok! That works. Thanks for the correction. I guess I’m over thinking the assignments. Not the first time I’ve done that.

Ivan Chernikov 8 years ago

If you don’t mind, I’ve made some correction to my post and to the example http://labs.codecademy.com/9zd/3#:workspace

Answer 50eaf8a18c562bdf62000127

0 votes

Permalink

this is my version:

def remove_duplicates(yourList): y = len(yourList) - 1 while y > 0:
while yourList.count(yourList[y]) > 1: yourList.remove(yourList[y]) y -= 1 y -= 1 return yourList

309 points
Submitted by jwoody
8 years ago

2 comments

Ivan Chernikov 8 years ago

The more duplicates are in yourList the less performance your algorithm produces. There are the count() method and inner while loop, which consume more time to compute. Think of the following http://labs.codecademy.com/9zd/2#:workspace

Ivan Chernikov 8 years ago

As for calling count() in each iteration, perhaps there is no extra computing, because in each next iteration count() “looks through” not the whole yourList, but through the result of count() function of previous iteration. It is so called “memoization” http://en.wikipedia.org/wiki/Memoization. But I am not sure that this is the case.