reading a particular part of a text file

Question

jancho1911 0 Newbie Poster

16 Years Ago

Hi !

I am new to DaniWeb. I searched everywhere for an answer but I can't find one. This is for my final project and I need it to get my bachelor's degree.

I have a text file that contains lines like these:

2001.7.1.407 изутрината во тетовски
2003.5.3.20083 кзк ја штити таканаречен
2001.8.7.1830 винарските визби во македонија и оваа година

I need to extract the number after the second dot (e.g. 1, 3, 7 in the lines above) and write the text that follows the last number to a new file chosen by that number. The numbers are categories from 1 to 7, so each text has to go into its own text file, named after its category.

I hope you understand what I mean... I can't explain it better, sorry :(

Here is some of the code I wrote (it may be completely wrong):

DB = open('DB.txt', 'r')
kat1 = open('kat1.txt', 'a')

poz = 0
while True:
    start = DB.find ('2', poz)
    if start == -1: break
    najdi = DB.find ('.', start)
    end = DB.find ('.', najdi)

    br = DB[start:najdi:end]
    poz = br [br.find('.') + 1]

THANK YOU IN ADVANCE!!!

python


woooee 814 Nearly a Posting Maven

16 Years Ago

For future reference, you do not want to assume that the characters you search for are found. This is better IMHO, although I too would prefer to use .split(".")

if start == -1:
    break
najdi = DB.find ('.', start)
if najdi > -1:
    end = DB.find ('.', najdi)
    if end > -1:
        br = DB[start:najdi:end]
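
The guarded searches above can be sketched as a runnable Python 3 function. This is only a sketch under assumptions: `line` is one line of text already read from the file (a file object has no `find` method), `category_by_find` is a hypothetical name, and each search starts one character past the previous dot, because `find('.', najdi)` would just return `najdi` again.

```python
def category_by_find(line):
    """Return the text between the second and third dot, or None."""
    first = line.find(".")
    if first == -1:
        return None
    # Start one past the previous dot so the same dot is not found twice.
    second = line.find(".", first + 1)
    if second == -1:
        return None
    third = line.find(".", second + 1)
    if third == -1:
        return None
    # A plain two-part slice; a third slice value would be a step, not an end.
    return line[second + 1:third]


print(category_by_find("2001.7.1.407 изутрината во тетовски"))  # 1
print(category_by_find("no dots here"))  # None
```

Every `find` result is checked before it is used, which is the point made above; `.split(".")` gets the same answer with less bookkeeping.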

woooee 814 Nearly a Posting Maven

16 Years Ago

zachabesh meant something like this, using 2 splits per record

test_data = [
    "2001.7.1.407 изутрината во тетовски\n",
    "2003.5.3.20083 кзк ја штити таканаречен\n",
    "2001.6.1.407 test rec #1\n",
    "2003.10.3.20083 test rec #2\n",
    "2001.11.7.1830 винарските визби во македонија и оваа година\n"]

for rec in test_data:
    rec = rec.strip()
    # Split on the first three dots only, so dots inside the text survive.
    dots_split = rec.split(".", 3)
    print("\ndots_split =", dots_split)
    if len(dots_split) > 3:
        print("year=%s,  month=%s,  category=%s" %
              (dots_split[0], dots_split[1], dots_split[2]))
        # The last piece is "<id> <text>"; split it on whitespace.
        space_split = dots_split[3].split()
        if len(space_split) > 1:
            print("     id=%s,  name=%s" % (space_split[0], " ".join(space_split[1:])))
        else:
            print("space split error", dots_split[3])
    else:
        print("data error", rec)

zachabesh 5 Junior Poster · Answer 1 · 2008-09-10T22:13:38+00:00

Heh heh, this is essentially what I do every day at work, and I feel like you're making it much more complicated than it really is.

Here's what I would suggest: split your text file into a linelist (maybe using file.readlines())

Then write a function that takes a line and returns the parts you want, and run it on every line in your linelist. So:

do('2001.7.1.407 изутрината во тетовски') might return a tuple:

(numberspart, letterspart)

The category, which is the number after the second dot, would be numberspart.split('.')[2]

Open a file using that value as the filename and write the letterspart!
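
A minimal Python 3 sketch of the approach described above. The function names `split_record` and `sort_into_files`, the `kat<category>.txt` naming (borrowed from the `kat1.txt` in the question), and the UTF-8 encoding are assumptions for illustration, not something the thread confirms.

```python
import os


def split_record(line):
    """Split '2001.7.1.407 some text' into (category, text).

    Returns None when the line does not have the expected shape.
    """
    parts = line.strip().split(None, 1)   # numbers part, then the text
    if len(parts) != 2:
        return None
    numberspart, letterspart = parts
    fields = numberspart.split(".")       # year, month, category, id
    if len(fields) != 4:
        return None
    return fields[2], letterspart


def sort_into_files(lines, out_dir="."):
    """Append each record's text to kat<category>.txt in out_dir."""
    for line in lines:
        record = split_record(line)
        if record is None:
            continue                      # skip malformed lines
        category, text = record
        path = os.path.join(out_dir, "kat%s.txt" % category)
        with open(path, "a", encoding="utf-8") as out:
            out.write(text + "\n")
```

It could be driven by something like `sort_into_files(open("DB.txt", encoding="utf-8").readlines())`. Because the fields come from splitting on dots, a one-digit or two-digit month makes no difference to which field holds the category.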

jancho1911 0 Newbie Poster · Answer 2 · 2008-09-10T23:30:36+00:00

I knew it was simpler than I thought...but it seemed so difficult to me...anyway

thank you
J

jancho1911 0 Newbie Poster · Answer 3 · 2008-09-11T15:14:06+00:00

There is a problem. You see, the lines look like the one I showed you, '2001.7.1.407 and text' (2001 is the year, 7 is the month, 1 is the category and 407 is the id of the news item), but months can have either 1 or 2 digits, so I probably have to do it by counting the dots, or I don't know what!!!
Please help me!!!
J

jancho1911 0 Newbie Poster · Answer 4 · 2008-09-12T15:47:10+00:00

Thank you very much...this helped a lot!!! huh

One more question: Is it possible to separate text from numbers???

thanks again

woooee 814 Nearly a Posting Maven · Answer 5 · 2008-09-12T23:51:08+00:00

You can use string_var.isdigit() and string_var.isalpha(), or use a try/except.

try:
    float_var = float(string_var)
    print(float_var, "is a number")
except ValueError:
    # float() raises ValueError when the string is not a valid number
    print(string_var, "is a string")
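
To separate text from numbers within a record, one option is to test each whitespace-separated token with `isdigit()`. A minimal Python 3 sketch, where `separate` is a hypothetical helper name. Note that `isdigit()` is only true for strings made up entirely of digits, so '407' counts as a number here but '3.5' and '-2' do not; the `float()` check above is the better fit for those.

```python
def separate(text):
    """Return (numbers, words): all-digit tokens and everything else."""
    numbers, words = [], []
    for token in text.split():
        if token.isdigit():
            numbers.append(token)   # kept as strings, not converted
        else:
            words.append(token)
    return numbers, words


print(separate("407 test rec 2"))  # (['407', '2'], ['test', 'rec'])
```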

zachabesh 5 Junior Poster · Answer 6 · 2008-09-13T00:23:37+00:00

There's another thread with great example code for deciding what type a variable is...let me grab that link.

EDIT: Great stuff in this one: http://www.daniweb.com/forums/thread145297.html
