Split string and write to new file

Question

turnerca902 0 Newbie Poster

14 Years Ago

Hi everyone,

I have a fairly simple problem, but having not used Python in a while, I just can't seem to get things working.
Basically, I have a text file with a number of comma separated fields (attached).
What I want to do is split the string, and extract the "File" item from each line. I then need to write this to a new file. (I also want to skip the first line.)

So my desired output file would just have:
2008308_017_079.tif
2008308_017_080.tif
2008308_017_081.tif
etc...

If anyone out there could help me with this, I'd be very grateful!

python

cb2.txt (3.43 KB)

"Id","File","Easting","Northing","Alt","Omega","Phi","Kappa","Photo","Roll","Line","Roll_line","Orient","Camera"
1800,2008308_017_079.tif,530658.110,5005704.180,2031.100000,0.351440,-0.053710,0.086470,79,2008308,17,308_17,rightX,Jen73900229d
1801,2008308_017_080.tif,531793.060,5005709.230,2033.170000,0.385000,-0.044790,-0.057690,80,2008308,17,308_17,rightX,Jen73900229d
1802,2008308_017_081.tif,532930.810,5005709.150,2032.250000,0.350180,-0.044950,0.271100,81,2008308,17,308_17,rightX,Jen73900229d
1803,2008308_017_082.tif,534066.230,5005706.620,2037.630000,0.345480,-0.036860,0.234700,82,2008308,17,308_17,rightX,Jen73900229d
1804,2008308_017_083.tif,535212.280,5005706.990,2037.470000,0.336650,-0.045540,0.306690,83,2008308,17,308_17,rightX,Jen73900229d
1805,2008308_017_084.tif,536359.740,5005707.850,2033.760000,0.333610,-0.050390,0.086950,84,2008308,17,308_17,rightX,Jen73900229d
1806,2008308_017_085.tif,537494.570,5005708.610,2035.620000,0.343970,-0.052050,0.303690,85,2008308,17,308_17,rightX,Jen73900229d
1807,2008308_017_086.tif,538627.990,5005709.840,2035.100000,0.328450,-0.054550,-0.091990,86,2008308,17,308_17,rightX,Jen73900229d
1808,2008308_017_087.tif,539779.710,5005708.090,2030.540000,0.326280,-0.057570,0.227650,87,2008308,17,308_17,rightX,Jen73900229d
1809,2008308_017_088.tif,540906.110,5005711.370,2032.730000,0.347700,-0.029520,0.389650,88,2008308,17,308_17,rightX,Jen73900229d
2268,2008310_016_008.tif,540912.710,5003700.770,2010.400000,-0.323050,0.056930,179.710620,8,2008310,16,310_16,left+X,Jen73900229d
2269,2008310_016_007.tif,539788.120,5003693.790,2014.890000,-0.345960,0.084340,179.153550,7,2008310,16,310_16,left+X,Jen73900229d
2270,2008310_016_006.tif,538654.060,5003698.770,2027.840000,-0.331110,0.057120,179.118960,6,2008310,16,310_16,left+X,Jen73900229d
2271,2008310_016_005.tif,537504.470,5003715.740,2026.880000,-0.326870,0.043910,178.785490,5,2008310,16,310_16,left+X,Jen73900229d
2272,2008310_016_004.tif,536349.200,5003739.500,2010.940000,-0.329510,0.060200,179.274040,4,2008310,16,310_16,left+X,Jen73900229d
2273,2008310_016_003.tif,535232.560,5003746.840,2009.070000,-0.329740,0.053120,179.544540,3,2008310,16,310_16,left+X,Jen73900229d
2274,2008310_016_002.tif,534088.210,5003743.760,2024.100000,-0.326980,0.045690,179.670860,2,2008310,16,310_16,left+X,Jen73900229d
2275,2008310_016_001.tif,532945.090,5003737.280,2027.930000,-0.359200,0.060830,179.319580,1,2008310,16,310_16,left+X,Jen73900229d
2276,2008310_015_088.tif,536328.710,5001730.620,2019.370000,0.340480,-0.039560,-0.596140,88,2008310,15,310_15,rightX,Jen73900229d
2277,2008310_015_089.tif,537474.370,5001721.580,2007.930000,0.348600,-0.061310,-0.316810,89,2008310,15,310_15,rightX,Jen73900229d
2278,2008310_015_090.tif,538611.770,5001705.930,2008.260000,0.343580,-0.043240,0.696690,90,2008310,15,310_15,rightX,Jen73900229d
2279,2008310_015_091.tif,539738.100,5001707.080,2016.300000,0.351750,-0.027060,0.357080,91,2008310,15,310_15,rightX,Jen73900229d
2280,2008310_015_092.tif,540882.920,5001717.380,2024.100000,0.339980,-0.035750,0.330010,92,2008310,15,310_15,rightX,Jen73900229d
3112,2008313_014_240.tif,538621.930,4999720.280,1997.920000,4.276300,2.002480,0.107910,240,2008313,14,313_14,rightX,Jen73900229d
3113,2008313_014_241.tif,539762.130,4999724.300,1989.260000,0.458230,0.112320,-0.054790,241,2008313,14,313_14,rightX,Jen73900229d
3114,2008313_014_242.tif,540894.990,4999726.760,1994.060000,0.463020,0.106710,-0.033460,242,2008313,14,313_14,rightX,Jen73900229d

TrustyTony 888 ex-Moderator

14 Years Ago

This is quite simple with the code snippet I posted today; the only addition is stripping the quotes.

# text based data input with data accessible
# with named fields or indexing
from __future__ import print_function ## Python 3 style printing
from collections import namedtuple
import string

filein = open("cb2.txt")
quotes = '\'\"'
datadict = {}

headerline = filein.readline().lower() ## lowercase field names Python style
## first non-letter and non-number is taken to be the separator
separator = headerline.strip(string.ascii_lowercase + string.digits + quotes)[0]
print("Separator is '%s'" % separator)

headerline = [field.strip(string.whitespace + quotes) for field in headerline.split(separator)]
Dataline = namedtuple('Dataline',headerline)
print ('Fields are:',Dataline._fields,'\n')

for data in filein:
    data = [f.strip(string.whitespace + quotes) for f in data.split(separator)]
    d = Dataline(*data)
    datadict[d.id] = d ## do hash of id values for fast lookup (key field)

for id in  datadict.keys():
    print(datadict[id].file)

input('Ready') ## let the output be seen when run directly
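
A Python 3 sketch of the same named-field idea, using the standard csv module in place of the hand-rolled separator detection above. The sample rows are abbreviated from cb2.txt, and the helper name file_column is made up for this illustration.

```python
import csv
import io

# Two abbreviated rows in the same layout as cb2.txt (illustrative sample only).
SAMPLE = '''"Id","File","Easting"
1800,2008308_017_079.tif,530658.110
1801,2008308_017_080.tif,531793.060
'''

def file_column(lines, field="File"):
    """Return the values of one named column, in file order.

    `lines` is any iterable of text lines, such as an open file.
    """
    # DictReader consumes the header row and strips the quotes around its names.
    reader = csv.DictReader(lines)
    return [row[field] for row in reader]

names = file_column(io.StringIO(SAMPLE))
print("\n".join(names))
```

With the real file this would presumably be called as file_column(open("cb2.txt", newline="")). Returning a list keeps the rows in file order, which a plain dict did not guarantee in Python 2.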

snippsat 661 Master Poster

14 Years Ago

One solution uses a regular expression. It's not hard to write a regex for this; it only takes a couple of minutes.

import re

text = '''\
"Id","File","Easting","Northing","Alt","Omega","Phi","Kappa","Photo","Roll","Line","Roll_line","Orient","Camera"
1800,2008308_017_079.tif,530658.110,5005704.180,2031.100000,0.351440,-0.053710,0.086470,79,2008308,17,308_17,rightX,Jen73900229d
1801,2008308_017_080.tif,531793.060,5005709.230,2033.170000,0.385000,-0.044790,-0.057690,80,2008308,17,308_17,rightX,Jen73900229d
1802,2008308_017_081.tif,532930.810,5005709.150,2032.250000,0.350180,-0.044950,0.271100,81,2008308,17,308_17,rightX,Jen73900229d
1803,2008308_017_082.tif,534066.230,5005706.620,2037.630000,0.345480,-0.036860,0.234700,82,2008308,17,308_17,rightX,Jen73900229d
1804,2008308_017_083.tif,535212.280,5005706.990,2037.470000,0.336650,-0.045540,0.306690,83,2008308,17,308_17,rightX,Jen73900229d
'''

test_match = re.findall(r'\d{7}\_\d{3}\_\d{3}\.\btif\b',text)
print test_match #Give us a list

#Looping over item in list
for item in test_match:
    print item

'''-->Out
['2008308_017_079.tif', '2008308_017_080.tif', '2008308_017_081.tif', '2008308_017_082.tif', '2008308_017_083.tif']
2008308_017_079.tif
2008308_017_080.tif
2008308_017_081.tif
2008308_017_082.tif
2008308_017_083.tif
'''

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 1 · 2010-07-01T03:51:54+00:00

For a simple, inflexible solution, this is all you need:

filein = open("cb2.txt")
filein.readline() # drop first line
for line in filein:
    print line.split(',')[1]

turnerca902 0 Newbie Poster · Answer 2 · 2010-07-05T18:14:16+00:00

Thanks tonyjv and Snippsat,

Your suggestions helped me get back on track.

filein = open("cb2.txt")
filein.readline()

for line in filein:
    namedata = []
    namedata = line.split(",")[1]
    print namedata + "\n"
    fileout = open("copyimg.txt" , "a")
    fileout.write(namedata + "\n")
    fileout.close()

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 3 · 2010-07-05T18:27:54+00:00

It would be better, though, to move line 8 out of the loop up to line 3, with one less level of indentation. Then mode 'w' can be used instead of 'a'. The file should of course be closed after the loop, not inside it (again one indent less).
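
A rough Python 3 sketch of that restructuring: the output is opened once in 'w' mode before the loop, and a with block closes both files after it. The function name copy_file_column is only illustrative; the default file names are the ones used in this thread.

```python
def copy_file_column(src="cb2.txt", dst="copyimg.txt"):
    """Write the second comma-separated field of each data line to dst.

    Returns the number of names written.
    """
    count = 0
    # Both files are opened once, before the loop; the with block closes
    # them after the loop finishes, even if an error occurs.
    with open(src) as filein, open(dst, "w") as fileout:
        next(filein, None)            # skip the header line
        for line in filein:
            if not line.strip():      # ignore blank lines
                continue
            fileout.write(line.split(",")[1] + "\n")
            count += 1
    return count
```

Because 'w' truncates the output on each run, repeated runs do not pile up duplicate names the way 'a' would.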

turnerca902 0 Newbie Poster · Answer 4 · 2010-07-05T18:29:41+00:00

Thanks again tonyjv!
I'll make those changes.

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 5 · 2010-07-05T21:49:54+00:00

Also, print adds the newline automatically. If you prefer, you can use it to write to the file as well, like this:

filein = open("cb2.txt")
filein.readline()
fileout = open("copyimg.txt" , "w")

for line in filein:
    namedata = []
    namedata = line.split(",")[1]
    print namedata
    print >>fileout,namedata

fileout.close()
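
For anyone reading this on Python 3, where the print statement is gone, the usual equivalent of print >>fileout is the file= argument of print(). A small sketch, in which write_names is a made-up helper and io.StringIO stands in for the real output file:

```python
import io

def write_names(lines, fileout):
    """Print the second field of each line to fileout, one per line."""
    for line in lines:
        namedata = line.split(",")[1]
        print(namedata, file=fileout)   # print() supplies the newline

buffer = io.StringIO()
write_names(["1800,2008308_017_079.tif,530658.110\n"], buffer)
print(buffer.getvalue(), end="")
```

Passing an open file object such as open("copyimg.txt", "w") in place of buffer should write the same text to disk.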

turnerca902 0 Newbie Poster · Answer 6 · 2010-07-05T21:57:52+00:00

Oh, terrific!
I didn't know that was an option with "print". I know I'll use that method again in the future.
Many thanks for the great help and advice :)