Tracking Repeats in a list

Question

hughesadam_87 54 Junior Poster

14 Years Ago

Hey guys,

Say I had the following list:

[1, 2, 2, 2, 3, 4, 4]

Let's assume that the list will always contain integers. Does Python have a built-in feature that takes a sorted list and returns how often each element occurs? For example:

1 -> 1
2 -> 3
3 -> 1
4 -> 2


woooee 814 Nearly a Posting Maven

14 Years Ago

Not that I know of. You would have to use a dictionary with the integer as a key, with the value being a counter integer that you would increment each time. Lists do have a count() method that returns the number of times something occurs in the list, but every call walks the entire list, so you would process the list once per value instead of once overall. You would also have to keep track of which counts have already been done so you don't do them twice, and there are probably other problems that make count() more trouble than it is worth for this particular case. The other solution would be to sort the list and count until you find a difference, then print, rinse and repeat.
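
A minimal sketch of the dictionary approach described above, written so that it should run on both Python 2 and Python 3. The helper name count_items is made up for illustration and is not part of the thread.

```python
def count_items(items):
    """Return a dict mapping each item to how many times it appears."""
    counts = {}
    for item in items:
        # get() supplies 0 the first time an item is seen
        counts[item] = counts.get(item, 0) + 1
    return counts


numbers = [1, 2, 2, 2, 3, 4, 4]
counts = count_items(numbers)
for value in sorted(counts):
    print("%s -> %d" % (value, counts[value]))
```

The list is walked only once, and sorted(counts) orders the keys for printing, so the input does not have to be sorted first.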

jlm699 320 Veteran Poster

14 Years Ago

This isn't builtin, but my first hit on a google search for 'python list frequency' was this:

>>> alist = [ '1', '1', '2', '1', '3', '4', '1', '3']
>>> [(a, alist.count(a)) for a in set(alist)]

onaclov2000

14 Years Ago

Sorry, not a Python guy, more of a Perl guy (gonna try to learn Python someday), but I expect that Python has regex support. Would it be possible to extract that data via a regex? I'm not sure whether you can get a match that tells you how many times it matched. You could just append each "new" word to a CSV that you search every time you come upon a word, and if you have already seen it, don't try to regex it again.

Just a thought.

Thank you

Edit:

my $counter2 = ($string =~ s/AAA/AAA/g);

It looks to me like that entire expression returns a value (I'll have to do some testing), so you could do something like the above and only do it on "new" words, like I was saying.

hughesadam_87 54 Junior Poster · Answer 1 · 2009-07-24T01:08:05+00:00

woooee wrote: "You would have to use a dictionary with the integer as a key, with the value being a counter integer that you would increment each time."

When you speak of the dictionary and say the value is a counter integer, can you elaborate on the counter integer values?

woooee 814 Nearly a Posting Maven · Answer 2 · 2009-07-24T01:41:52+00:00

[(a, alist.count(a)) for a in set(alist)]

A nice solution. Convert to a set to eliminate duplicates and then count each one. It still requires that the list be processed as many times as there are integers in the set, and would yield the results in hash table order, but the list of tuples could be easily sorted.
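
To illustrate the trade-off, here is a hedged sketch that puts the comprehension next to a single-pass tally. collections.Counter is not mentioned in the thread; it is a swapped-in standard-library alternative that exists in Python 2.7 and 3.1 onward, so the sketch assumes one of those versions or later. The variable names are illustrative.

```python
from collections import Counter

alist = ['1', '1', '2', '1', '3', '4', '1', '3']

# The comprehension from the thread, sorted by item so the order is predictable.
# alist.count(a) walks the whole list once for every distinct item.
pairs_by_item = sorted((a, alist.count(a)) for a in set(alist))

# Single-pass alternative: Counter tallies every item in one walk over the list.
tally = Counter(alist)
pairs_from_counter = sorted(tally.items())

print(pairs_by_item == pairs_from_counter)
```

Both routes give the same pairs; the difference is only in how many times the list is traversed.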

jlm699 320 Veteran Poster · Answer 3 · 2009-07-24T02:09:31+00:00

but the list of tuples could be easily sorted.

Yeah, that link I provided actually demonstrates a sorting method as such:

sorted(_, key=lambda x: -x[1])

EDIT: By the way, the '_' should be replaced by your variable name or the data itself if you were to implement this in a script. The '_' is really only useful in the interpreter, as a shortcut for the last return value.
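
As a small sketch of that sort in a script, with the '_' replaced by a variable: the name pairs and its contents are assumptions chosen to match the earlier example, not output copied from the thread.

```python
pairs = [('1', 4), ('2', 1), ('3', 2), ('4', 1)]

# Negating the count makes the highest frequency sort first.
by_frequency = sorted(pairs, key=lambda x: -x[1])

# Equivalent for numeric counts: keep the key positive and reverse the order.
by_frequency_reversed = sorted(pairs, key=lambda x: x[1], reverse=True)

print(by_frequency)
```

Because sorted is stable, items with the same count keep the order they had in pairs.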

hughesadam_87 54 Junior Poster · Answer 4 · 2009-07-24T02:09:59+00:00

Thanks Jim. I originally searched for "python list repeats" and got squat, but I probably should have thought of a better search term. I actually wrote a small piece of code to solve my problem, but had I known about this, I could have saved some time.

This isn't builtin, but my first hit on a google search for 'python list frequency' was this:

>>> alist = [ '1', '1', '2', '1', '3', '4', '1', '3']
>>> [(a, alist.count(a)) for a in set(alist)]

end quote.

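# Assumes the list is sorted, so equal values sit next to each other.
# Renamed from 'list', which shadows the built-in type.
values = [1, 2, 3, 3, 3, 4, 5.2, 5.2, 6, 7, 7, 8]

previous = None   # sentinel that never equals a list entry

for i in range(len(values)):
    current = values[i]

    if current != previous:   # first occurrence of a new value
        counter = 1

        # count the remaining occurrences further down the list
        for j in range(i + 1, len(values)):
            if current == values[j]:
                counter += 1

        print('%s matches %d' % (current, counter))
        previous = current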
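
For comparison, a hedged sketch of the same "count until you find a difference" idea using itertools.groupby from the standard library. This is a swapped-in alternative, not the code posted in the thread, and it relies on the list already being sorted.

```python
from itertools import groupby

values = [1, 2, 3, 3, 3, 4, 5.2, 5.2, 6, 7, 7, 8]

# groupby yields one (key, group) pair per run of equal neighbours,
# so the list must already be sorted for each run to be complete.
run_lengths = [(key, sum(1 for _ in group)) for key, group in groupby(values)]

for value, counter in run_lengths:
    print('%s matches %d' % (value, counter))
```

Each element is visited once here, whereas the nested loops rescan the tail of the list for every new value.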

hughesadam_87 54 Junior Poster · Answer 5 · 2009-07-24T02:59:59+00:00

OOPS one more thing:

This method, with the sets, produces output like this:

(Name, frequency)
(Name, frequency)

How can I tell Python to write this out without the parentheses?

jlm699 320 Veteran Poster · Answer 6 · 2009-07-24T03:07:59+00:00

for element in my_set:
    print ' '.join(element)

This hinges on each element being a tuple. It will join the items of each element together with a space in between...

hughesadam_87 54 Junior Poster · Answer 7 · 2009-07-24T03:25:39+00:00

for element in my_set:
    print ' '.join(element)
This hinges on each element being a tuple. It will join the items of each element together with a space in between...

Let's say I used the code you posted and got this list:

my_list = [(1, 1), (2, 1), (3, 1), (4, 2)]

I then try:

for entry in my_list:
     print ' '.join(str(entry))

Which output:

( 1 ,   1 )
( 2 ,   1 )
( 3 ,   1 )
( 4 ,   2 )

I still can't get it to leave out the parentheses.

jlm699 320 Veteran Poster · Answer 8 · 2009-07-24T08:28:54+00:00

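Don't convert the whole tuple to a string with str. If the tuple holds strings, you should then be good to go; if it holds numbers, as yours does, convert each item with str instead, because join only accepts strings.

The join method requires an iterable. Notice all the extra spaces in your output. This is because instead of joining each element of the tuple with spaces, you've made it join each character in the string with spaces.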

Look:

>>> t = ('Two', 'words')
>>> t
('Two', 'words')
>>> str(t)
"('Two', 'words')"
>>> ' '.join(str(t))
"( ' T w o ' ,   ' w o r d s ' )"
>>> ' '.join(t)
'Two words'
>>>

Get it?
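
One caveat for the list of integer pairs shown earlier: ' '.join(entry) raises a TypeError when the tuple holds numbers, because join only accepts strings. A minimal sketch, assuming the my_list from the question, converts each item separately; the name lines is illustrative.

```python
my_list = [(1, 1), (2, 1), (3, 1), (4, 2)]

lines = []
for entry in my_list:
    # str(item) converts one number at a time; str(entry) would turn the
    # whole tuple, parentheses and comma included, into a single string.
    lines.append(' '.join(str(item) for item in entry))

print('\n'.join(lines))
```

This prints one "name frequency" pair per line with no parentheses or commas.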