i have about a 1000 excel files that have data sets.it was collected from about a 20 patients. so i have a script that reads data sets for each patient from all of the excel files. im having trouble printing the values in separate excel files that each file represents each patient.
Gribouillis 1,391 Programming Explorer Team Colleague
It's difficult to help you since we don't know how your excel files are formatted nor the format of the output files that you want. Assuming you have a function all_files() which lists your excel files and a function find_patient(filename) which extracts the patient identification from a file, you can sort your files by patient
from itertools import groupby
def by_patient():
L = sorted(all_files(), key = find_patient)
return list((patient_id, list(group)) for patient_id, group in groupby(L, find_patient))
Then you can traverse the files with a loop
for patient_id, filenames in by_patient():
# here create an output file
for filename in filenames:
# add the content of filename to the output file
Edit: also note that "Help with patient records in excel files" is a much better forum thread title than "Python, Help me please!!" :)
Edited by Gribouillis because: n/a
g_amanu 0 Newbie Poster
Thank you....my excel files have more than a 1000 numbers under the columns that are labeled x, y, z for each excel file. i was trying to print the x, y, and z values for all of the excel files in to a single excel file(Trying_excel.xls)that would have all of the numbers under the x,y,z column for all of the excel files. This is what i have so far...but it doesnt seem to work...
import xlrd
import os
path = "c:\\Hello\\RawTrackingData"
dirList=os.listdir(path)
f = open('C:\\Hello\\Other_files\\Trying_excel.xls', 'w')
f.write('Xvalue, Yvalues, Zvalue')
f.write("\n")
Col_values=[]
Col_values1=[]
Col_values2=[]
for file in dirList:
fullpath = os.path.join(path,file)
if os.path.isfile(fullpath) == 1:
wb = xlrd.open_workbook(fullpath)
wb.sheet_names()
sh = wb.sheet_by_name(u'RawTrackingData')
for j in range(21,sh.nrows):
Col_values.append(sh.cell(j,0).value)
Col_values1.append(sh.cell(j,1).value)
Col_values2.append(sh.cell(j,2).value)
a = Col_values
b = Col_values1
c = Col_values2
f.write(str(a))
f.write(", ")
f.write(str(b))
f.write(", ")
f.write(str(c))
f.write(", ")
f.write("\n")
Gribouillis 1,391 Programming Explorer Team Colleague
Thank you....my excel files have more than a 1000 numbers under the columns that are labeled x, y, z for each excel file. i was trying to print the x, y, and z values for all of the excel files in to a single excel file(Trying_excel.xls)that would have all of the numbers under the x,y,z column for all of the excel files. This is what i have so far...but it doesnt seem to work...
import xlrd import os path = "c:\\Hello\\RawTrackingData" dirList=os.listdir(path) f = open('C:\\Hello\\Other_files\\Trying_excel.xls', 'w') f.write('Xvalue, Yvalues, Zvalue') f.write("\n") Col_values=[] Col_values1=[] Col_values2=[] for file in dirList: fullpath = os.path.join(path,file) if os.path.isfile(fullpath) == 1: wb = xlrd.open_workbook(fullpath) wb.sheet_names() sh = wb.sheet_by_name(u'RawTrackingData') for j in range(21,sh.nrows): Col_values.append(sh.cell(j,0).value) Col_values1.append(sh.cell(j,1).value) Col_values2.append(sh.cell(j,2).value) a = Col_values b = Col_values1 c = Col_values2 f.write(str(a)) f.write(", ") f.write(str(b)) f.write(", ") f.write(str(c)) f.write(", ") f.write("\n")
Here is a program which works for me, using xlrd and xlwt
import os
from os.path import join as pjoin
import xlrd, xlwt
SOURCE_DIR = "/home/eric/Documents/Daniweb/t377865/ezrfuhefuh"
OUTPUT_DIR = "/home/eric/Documents/Daniweb/t377865/output"
def excel_filenames(dire):
return sorted(pjoin(dire, name)
for name in os.listdir(dire) if name.endswith(".xls"))
def data_sheets(dire):
for filename in excel_filenames(dire):
wb = xlrd.open_workbook(filename)
yield wb.sheet_by_name(u"RawTrackingData")
def value_triples(dire):
for sh in data_sheets(dire):
for j in xrange(21, sh.nrows):
yield tuple(sh.cell_value(j, i) for i in range(3))
def create_output(srcdir, dstdir, workbook_name, sheet_name):
wb = xlwt.Workbook()
try:
sh = wb.add_sheet(sheet_name)
for col, val in enumerate("Xvalue Yvalue Zvalue".split()):
sh.write(0, col, val)
for row, triple in enumerate(value_triples(srcdir), 1):
for col, val in enumerate(triple):
sh.write(row, col, val)
finally:
wb.save(pjoin(dstdir, workbook_name))
if __name__ == "__main__":
create_output(SOURCE_DIR, OUTPUT_DIR, "Trying_excel.xls", u"TrackingData")
Notice that the rows and columns in xlrd and xlwt are numbered from 0, so the row 21 is named 22 in openoffice. You will need to add the part which sorts the files patient by patient, etc. I attach my whole test data
This attachment is potentially unsafe to open. It may be an executable that is capable of making changes to your file system, or it may require specific software to open. Use caution and only open this attachment if you are comfortable working with zip files.
peter_budo 2,532 Code tags enforcer Team Colleague Featured Poster
Thread closed as there is double of it here. Please follow discussion there.
@g_amanu Do not multi post same question
Be a part of the DaniWeb community
We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.