Hi Everyone,
I have a program that takes an HTML file as an argument, parses it, and outputs the data to a CSV file. It does this with no problem. However, I need it to take more than one HTML file, parse each of them, and put all the collected data into one CSV file.
I have tried reusing the code that creates the CSV file, replacing .write with .append, but this throws an error.
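For reference, here is a minimal sketch of appending to a file in Python. File objects have no .append method, which is the likely cause of the error; appending is selected by the mode passed to open(), and the data is still written with .write. The helper name append_text and the file name out.csv are made up for illustration and are not part of my program.

```python
import os
import tempfile


def append_text(path, text):
    """Add text to the end of a file, creating the file if it does not exist."""
    # Mode 'a' places every write at the end of the file.
    # Mode 'w' (or 'w+b') would truncate the file on each open instead.
    with open(path, 'a') as f:
        f.write(text)


# Hypothetical output file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), 'out.csv')
append_text(path, 'a,b\n')
append_text(path, 'c,d\n')

with open(path) as f:
    contents = f.read()
print(repr(contents))  # 'a,b\nc,d\n'
```

Both rows survive because the second call does not truncate what the first one wrote.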
The following is the code for reading the HTML file and writing the CSV file:
if __name__ == "__main__":
    try: # Put getopt in place for future usage.
        opts, args = getopt.getopt(sys.argv[1:],None)
    except getopt.GetoptError:
        print usage(sys.argv[0]) # print help information and exit:
        sys.exit(2)
    if len(args) == 0:
        print usage(sys.argv[0]) # print help information and exit:
        sys.exit(2)
    html_files = glob.glob(args[0])
    for htmlfilename in html_files:
        outputfilename = os.path.splitext(htmlfilename)[0]+'.csv'
        parser = html2csv()
        print 'Reading %s, writing %s...' % (htmlfilename, outputfilename)
        try:
            htmlfile = open(htmlfilename, 'rb')
            csvfile = open( outputfilename, 'w+b')
            data = htmlfile.read(8192)
            while data:
                parser.feed( data )
                csvfile.write( parser.getCSV() )
                sys.stdout.write('%d CSV rows written.\r' % parser.rowCount)
                data = htmlfile.read(8192)
            csvfile.write( parser.getCSV(True) )
            csvfile.close()
            htmlfile.close()
        except:
            print 'Error converting %s ' % htmlfilename
            try: htmlfile.close()
            except: pass
            try: csvfile.close()
            except: pass
    print 'All done. '
Does anyone have any advice on how to get the program to take more arguments, process them in the same way as above, and then append the data to the end of the CSV file?
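To make the goal concrete, here is a rough sketch of the behaviour I am after, not a finished solution. In the code above, glob.glob(args[0]) only looks at the first argument, and a new output file is opened for every input file. The sketch instead expands every argument and opens one output file before the loop. StubParser is a hypothetical stand-in that only mimics the feed()/getCSV() interface of html2csv and does no real HTML parsing; expand_patterns, convert_all and combined.csv are likewise names invented for illustration. It uses text mode rather than the 'rb'/'w+b' modes above, which is an assumption made to keep the stub simple.

```python
import glob


class StubParser(object):
    """Hypothetical stand-in for html2csv: same method names, no HTML parsing."""

    def __init__(self):
        self._pending = ''

    def feed(self, data):
        # The real parser would turn HTML table cells into CSV rows here.
        self._pending += data

    def getCSV(self, purge=False):
        # Return whatever is buffered and clear the buffer.
        out, self._pending = self._pending, ''
        return out


def expand_patterns(patterns):
    """Expand every pattern in order, skipping files already collected."""
    files = []
    for pattern in patterns:
        for name in sorted(glob.glob(pattern)):
            if name not in files:
                files.append(name)
    return files


def convert_all(patterns, outputfilename, make_parser=StubParser):
    """Parse every matching file and write all rows into one output file."""
    html_files = expand_patterns(patterns)
    # Open the output once, outside the loop, so rows from each input
    # file follow the rows from the previous one.
    with open(outputfilename, 'w') as csvfile:
        for htmlfilename in html_files:
            parser = make_parser()  # fresh parser for each input file
            with open(htmlfilename, 'r') as htmlfile:
                data = htmlfile.read(8192)
                while data:
                    parser.feed(data)
                    csvfile.write(parser.getCSV())
                    data = htmlfile.read(8192)
            # Flush anything the parser is still holding for this file.
            csvfile.write(parser.getCSV(True))
    return html_files
```

In my program this would be called as something like convert_all(args, 'combined.csv', html2csv), with args being the list returned by getopt.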
Thanks in advance for any help. It is really appreciated!
Shaun