Dear Sir,
I have written a script to extract the first line of each file, i.e. the line that starts with "Source Name" and ends with "Comment [ArrayExpress Data Retrieval URI]". I have managed that, but I could not extract the distinct (unique) attributes, i.e. those that are not repeated in every file. I would like to parse only the attributes in that first line, not the table values. Could you please correct this script? I would be grateful for your support and cooperation. I have attached a zip file containing all the sdrf.txt files, along with the output of the script I ran. One of the files can be found at this URL:
ftp://ftp.ebi.ac.uk/pub/databases/mi...FMX-1.sdrf.txt
Regards,
Haobijam
#!/usr/bin/python
import glob

outfile = open('output_att.txt', 'w')
for filename in glob.glob('*.sdrf.txt'):
    infile = open(filename)
    for line in infile:
        # Only the header line (it starts with 'Source Name') is of interest.
        if not line.startswith('Source Name'):
            continue
        # Strip the line ending, then split the header on tabs to get its attributes.
        attributes = line.rstrip('\r\n').split('\t')
        for attribute in attributes:
            print(attribute)
            outfile.write("%s\n" % attribute)
        break  # stop after the header so the table values are never read
    infile.close()
outfile.close()
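
One possible way to get at the distinct attributes is to count, for each attribute, how many files have it in their header line. This is only a minimal sketch: it assumes the header is tab-separated and is the first line starting with "Source Name", and the helper names `header_attributes` and `attribute_counts` are made up for illustration.

```python
import glob
from collections import Counter


def header_attributes(path):
    """Return the tab-separated attributes of the first 'Source Name' line."""
    with open(path) as infile:
        for line in infile:
            if line.startswith('Source Name'):
                return line.rstrip('\r\n').split('\t')
    return []  # no header line found in this file


def attribute_counts(paths):
    """Count, for each attribute, the number of files whose header contains it."""
    counts = Counter()
    for path in paths:
        # set() so a column repeated within one header counts once per file
        counts.update(set(header_attributes(path)))
    return counts


if __name__ == '__main__':
    paths = glob.glob('*.sdrf.txt')
    counts = attribute_counts(paths)
    for attribute in sorted(counts):
        # Flag attributes that are missing from at least one file (assumed reading
        # of "not repeated in every file").
        note = '' if counts[attribute] == len(paths) else '\t(not in every file)'
        print('%s%s' % (attribute, note))
```

The keys of `counts` are every distinct attribute seen across all files, and any attribute whose count is lower than the number of files is one that does not occur in every header.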