I want to iterate through the rows of two CSV files and compare values. For every row in file 1 that has the same value in column A (ID) as a row in file 2, I want to check whether the value in column C (End) of the file 2 row is larger than that of the file 1 row. If it is larger, I want to write that whole row from file 2 to a new output file.
Example files
File 1
ID,Begin,End
2563,15,16
2580,27,30
2580,67,90
File 2
ID,Begin,End
2578,54,70
2580,102,104
2580,48,100
Output File
ID,Begin,End
2580,48,90
In the example, only one row in File 2 met all conditions. I’ve written a Python script to do this with two for-loops, one nested inside the other. However, while the inner for-loop properly iterates through all rows in file 2, the outer for-loop does not seem to iterate: it looks at the first row and never moves on.
import string, sys, os
import csv

file_1 = csv.reader(open('file_1.csv', 'rb'))
file_2 = csv.reader(open('file_2.csv', 'rb'))

for ID1, begin1, end1 in file_1:
    for ID2, begin2, end2 in file_2:
        print 'ID1 = ', ID1
        print ' end2 = ', end2
        if ID1 == ID2:
            if end2 > end1: print 'out'
        else: print 'nope'
This script produces the following output, in which ID1 is always the string "ID" (the first row of each file contains the column names) and the other rows of file 1 never show up. I assume that end1 is likewise always the string "End", so the end2 > end1 condition is never properly tested.
ID1 = ID
end2 = End
ID1 = ID
end2 = 70
nope
ID1 = ID
end2 = 104
nope
ID1 = ID
end2 = 100
nope
Any help would be appreciated. Thanks
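A likely explanation for the output above: a csv.reader object is a one-pass iterator. The inner loop consumes all of file_2 while the outer loop is still on the header row of file_1, so on every later row of file_1 the inner loop has nothing left to read and the print statements inside it never run. The values are also read as strings, so end2 > end1 compares text rather than numbers. The following is a minimal sketch of one way around both issues, written in Python 3 syntax and assuming the End column always holds integers. The function name rows_with_larger_end and the inline io.StringIO data are illustrative stand-ins, not part of the original script.

```python
import csv
import io


def rows_with_larger_end(file_1_lines, file_2_lines):
    """Return file 2 rows that share an ID with a file 1 row and have a larger End."""
    reader_1 = csv.reader(file_1_lines)
    reader_2 = csv.reader(file_2_lines)
    next(reader_1)           # skip the header row of file 1
    header = next(reader_2)  # keep the header row of file 2 for the output

    # A csv.reader can be consumed only once, so store the rows of file 1
    # in a list that can be scanned again for every row of file 2.
    rows_1 = list(reader_1)

    matches = [header]
    for row in reader_2:
        id_2, begin_2, end_2 = row
        # Compare End as integers; as strings, '100' > '90' is False.
        if any(id_1 == id_2 and int(end_2) > int(end_1)
               for id_1, begin_1, end_1 in rows_1):
            matches.append(row)
    return matches


# Inline copies of the example files (an assumption for this sketch); with
# real files, pass file objects opened with newline='' instead.
file_1 = io.StringIO("ID,Begin,End\n2563,15,16\n2580,27,30\n2580,67,90\n")
file_2 = io.StringIO("ID,Begin,End\n2578,54,70\n2580,102,104\n2580,48,100\n")

result = rows_with_larger_end(file_1, file_2)

# Write to an in-memory buffer here; an opened output file works the same way.
out = io.StringIO()
csv.writer(out).writerows(result)
print(out.getvalue())
```

This implements only the two conditions stated in the text (matching ID, larger End), and each file 2 row is written at most once. On the example data it keeps both 2580 rows of file 2, which is more than the single row shown in the example output, so the example presumably relies on a condition that the text does not spell out.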