Hi, I'm learning Python at the moment and came across a question that gives me a large CSV file with the names of companies and how much they are earning, and asks me to find the top 10 companies. I originally did this:

import urllib
import csv
temp_url = urllib.urlopen("http://app.lms.unimelb.edu.au/bbcswebdav/courses/600151_2008_1/datasets/finance/asx_2007.csv")
data = csv.reader(temp_url)
header = data.next()
max_company=''
max_earnings=0.0
num = 0
for row in data:
    entry = row[6]
    num += 1
    if float(entry) > max_earnings:
        max_earnings = float(entry)
        max_company = row[0]

However, it seems to only give me the top company. I also tried to use a while loop, but it didn't turn out right. Is there a way to iterate through every row again, without the top company, another 9 times? Please help! Thanks.

I'm a little confused about the use of the "num" variable.

I would probably do this:

import urllib
import csv
temp_url = urllib.urlopen("http://app.lms.unimelb.edu.au/bbcswebdav/courses/600151_2008_1/datasets/finance/asx_2007.csv")
data = csv.reader(temp_url)
header = data.next()
earnings = [(row[0], float(row[6])) for row in data]
earnings.sort(key = lambda x: x[1])
earnings.reverse()
earnings = earnings[:10]

Line 6 generates a list of the companies and their earnings. Line 7 sorts these using the earnings as the key. Line 8 puts the highest earnings first. And line 9 trims the list to the top 10.
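For readers on Python 3, the same sort-and-slice idea might look roughly like the sketch below. It is not the code from this thread: it assumes Python 3 (`next(reader)` instead of `data.next()`), folds the sort and reverse into a single `sorted(..., reverse=True)` call, and runs on a small made-up CSV string (`SAMPLE_CSV`, a hypothetical stand-in for asx_2007.csv) so that it works without the download. The function name `top_earners` and its parameters are also just illustrative.

```python
import csv
import io

# Hypothetical sample data standing in for asx_2007.csv.
# Assumption carried over from the thread: name in column 0, earnings in column 6.
SAMPLE_CSV = """company,c1,c2,c3,c4,c5,earnings
Alpha,0,0,0,0,0,12.5
Bravo,0,0,0,0,0,98.0
Charlie,0,0,0,0,0,45.25
Delta,0,0,0,0,0,7.0
"""


def top_earners(csv_file, n=10, name_col=0, earnings_col=6):
    """Return the n (company, earnings) pairs with the highest earnings."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    earnings = [(row[name_col], float(row[earnings_col])) for row in reader]
    # Sort by earnings, highest first, then keep only the first n entries.
    return sorted(earnings, key=lambda pair: pair[1], reverse=True)[:n]


top_two = top_earners(io.StringIO(SAMPLE_CSV), n=2)
print(top_two)
```

With the real file you would pass the opened file object (or the downloaded text wrapped in `io.StringIO`) instead of the sample string.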

HTH,
Jeff

Hey, thanks for the help. Also, can you tell me if I can do it like this?

import urllib
import csv
temp_url = urllib.urlopen("http://app.lms.unimelb.edu.au/bbcswebdav/courses/600151_2008_1/datasets/finance/asx_2007.csv")
data = csv.reader(temp_url)
header = data.next()
max_company=''
max_earnings=0.0
num = 0
count = 0
while num < 10:
    for row in data:
        count += 1
        entry = row[6]
        if float(entry) > max_earnings:
            max_earnings = float(entry)
            max_company = row[0]
            
    num+=1
    print max_company

It still doesn't work, but if I can find a way to exclude the line of data that is the max company, then when it reiterates it should print the second largest, and so on. Thanks again!
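Two things seem to stop the loop above from working. A csv.reader is consumed after one pass, so from the second pass of the while loop onward the inner for loop sees no rows. And max_earnings is never reset, so even with fresh rows nothing could beat the first winner. A minimal Python 3 sketch of the "exclude the previous winners and scan again" idea follows. It assumes the rows have already been read into (company, earnings) pairs, and the function name `top_n_by_repeated_passes` and the sample data are hypothetical.

```python
def top_n_by_repeated_passes(rows, n=10):
    """Find the top n rows by scanning the data n times,
    skipping rows that were already picked on an earlier pass."""
    rows = list(rows)        # keep the rows in a list so they can be re-scanned
    chosen = set()           # indices of rows picked so far
    results = []
    for _ in range(min(n, len(rows))):
        best_index = None    # reset the running maximum on every pass
        for i, (company, earnings) in enumerate(rows):
            if i in chosen:
                continue     # exclude earlier winners
            if best_index is None or earnings > rows[best_index][1]:
                best_index = i
        chosen.add(best_index)
        results.append(rows[best_index])
    return results


# Made-up data for illustration only.
sample_rows = [("Alpha", 12.5), ("Bravo", 98.0), ("Charlie", 45.25), ("Delta", 7.0)]
top_three = top_n_by_repeated_passes(sample_rows, n=3)
print(top_three)
```

This scans the data once per requested company, which is fine for an exercise on loops but slower than sorting once.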

I don't really recommend that iterative approach, because all it accomplishes is printing the top ten to the screen (if you can get it to work!), whereas the real prize is to have the top ten in a list somewhere so that you can print it, sort it, etc.

Jeff

Hey, the thing is we are learning loops and iteration right now, and I think we're supposed to use iteration to do it if possible. Does anyone know how to exclude a line from the CSV file? I think it should be something similar to excluding the header.

Oh, well if you must iterate, then here's the basic idea:

* create an empty list.
* run through the data by rows.
* if your list has fewer than 10 items, or the current row's earnings are greater than the smallest earnings in your list:
--- add the current row's earnings and name of company to the list.
--- sort the list.
--- trim the list to 10 items
* Voila!

I'll leave the coding to you.

Jeff
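For anyone who wants to check their own attempt, the steps Jeff outlines could be sketched in Python 3 roughly as follows. This is one possible reading, not Jeff's code: it assumes the rows are already (company, earnings) pairs, it also accepts a row whenever the list holds fewer than n items (otherwise an empty list has no smallest entry to compare against), and the name `top_n_iterative` and the sample data are made up.

```python
def top_n_iterative(rows, n=10):
    """Keep a running list of the n highest earners while looping once over rows."""
    top = []  # (company, earnings) pairs, highest earnings first
    for company, earnings in rows:
        # top[-1] is the smallest entry because the list is kept sorted.
        if len(top) < n or earnings > top[-1][1]:
            top.append((company, earnings))
            top.sort(key=lambda pair: pair[1], reverse=True)
            del top[n:]  # trim back down to at most n items
    return top


# Made-up data for illustration only.
sample_rows = [("Alpha", 12.5), ("Bravo", 98.0), ("Charlie", 45.25), ("Delta", 7.0)]
best_two = top_n_iterative(sample_rows, n=2)
print(best_two)
```

The result is a list rather than printed output, so it can be printed, re-sorted, or reused later.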
