There seems to be a variety of ways to do this on the net, does anybody have a tested solution?
Suggestions to look at various modules and such appreciated. I don't need the full code.
There seems to be a variety of ways to do this on the net, does anybody have a tested solution?
Suggestions to look at various modules and such appreciated. I don't need the full code.
have a read through this
http://groups.google.com/group/python-excel/browse_thread/thread/eb3475b5438c3e50
the guy convert xlsx to xls and from there you can convert the xls into csv using xlrd
OK, so I used the info here:
http://groups.google.com/group/python-excel/browse_thread/thread/eb3475b5438c3e50
To modify the script to batch save .xlsx to .csv It was a tremendous help and I thought I'd share my code here.
You'll need Python for Windows (I used 2.6) and the pywin32 extensions...easily downloadable off the internet. Install them on the Windows PC that has Excel 2007 on it. Save the following file as "xls2csv.py" into the directory that contains all your .XLSX files. This script will not delete the original .XLSX files.
# You must have Python for Windows (I used 2.6) and pywin32 extensions
# installed and Excel 2007 on a Windows PC
# Put this script into the dir where all the .XLSX files are and then cd to that dir
# Usage: c:\python26\python.exe xls2csv.py
import glob
import os
import time
import win32com.client
xlsx_files = glob.glob('*.xlsx')
if len(xlsx_files) == 0:
raise RuntimeError('No XLSX files to convert.')
xlApp = win32com.client.Dispatch('Excel.Application')
for file in xlsx_files:
xlWb = xlApp.Workbooks.Open(os.path.join(os.getcwd(), file))
xlWb.SaveAs(os.path.join(os.getcwd(), file.split('.xlsx')[0] +
'.csv'), FileFormat=6)
xlApp.Quit()
time.sleep(2) # give Excel time to quit, otherwise files may be locked
# Uncomment the two lines below if you want the script to remove
# the orig .xlsx files when done
#for file in xlsx_files:
# os.unlink(file)
Enjoy all.
Spindrift
One thing that needs to be fixed here is that the .CSV files are not closing when the python script completes. It appears that some instances of Excel were staying open too.
Winzip was warning that the .CSV files I were adding were possibly corrupt. I worked around it by logging out and back into Windows.
If any python gurus know how to fix this in the code, I'd appreciate you sharing with us.
Thanks,
Spindrift
to the linux users, who don't have access to those windows modules basically those files (xlsx) are zipped files that contain xml files where the data is contained, so essentially you could write a programme that unzips the xlsx file and parse the xml files contained in the "xl/worksheets" folder to your preferred xml module parser. i used xml.dom.minidom module and i've tested it with the etree module.
Hi, I wrote something on using Python for xlsx files a while ago - hope you can use it: http://blog.codeus.net/reading-xlsx-files-from-python/
Cheers,
Alex
xlWb.Close(SaveChanges = True)
insert this line before xlWb.Quit() to completely close excel
thanks!
CATR
I would recommend you to use this XLSX to CSV converter, it can also convert csv files to xlsx or xls format.
We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.