Problem copying domain names from one text file to another using regex

Question

srinu_1 0 Newbie Poster

10 Years Ago

text1.txt:
line1 hdfbghasbfas
line2 jdsbvbsf
line3 <match name="item1" rhs="domain.com"></match>
line4 <match name="item2" rhs="domainn.com"></match>
line5 <match name="item2" rhs="1010data.com"></match>

I need to extract domain.com, domainn.com, and 1010data.com and write them to "result.txt".

import re
f1 = open("C:/Users/Netskope/Desktop/m/test1.txt", "r")
f2 = open("C:/Users/Netskope/Desktop/m/result.txt", "w")
d1 = f1.readlines()
for line in d1:
    match = re.findall('<match name="item1" rhs="(\w.+")', line)
    if match in line:
    print match,
   f2.write(match)
#f1.close()   
}

TypeError: expected a character buffer object

python


Gribouillis 1,391 Programming Explorer

10 Years Ago

Hm, a few visible issues:

Indentation of lines 8 and 9 is incorrect.
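match is not a string: re.findall returns a list of the captured strings, and f2.write rejects a list with the TypeError shown. Line 9 should write each item of the list instead. Alternatively, use re.search, which returns a match object, and write match.group(1) (or group(0) if you want the whole match).
Always use raw strings in regex, such as re.findall(r'...', ...). Raw strings preserve backslashes, which is what the regex parser needs.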
The curly brace at line 11 is a syntax error.
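
Putting those points together, a corrected version might look like the sketch below. It assumes Python 3, and the names `RHS_PATTERN`, `extract_domains`, and `out_path` are illustrative, not from the original post. The sample lines and a temporary directory stand in for the files on the poster's desktop. Since `re.findall` with one capture group returns a list of strings, each item is written separately.

```python
import os
import re
import tempfile

# Raw string, and [^"]+ stops at the closing quote so the quote is not captured.
# The name attribute is left open so item1 and item2 both match.
RHS_PATTERN = re.compile(r'<match name="[^"]*" rhs="([^"]+)"')


def extract_domains(lines):
    """Return every rhs value found in an iterable of text lines."""
    domains = []
    for line in lines:
        # findall returns a (possibly empty) list of captured strings.
        domains.extend(RHS_PATTERN.findall(line))
    return domains


sample_lines = [
    'line1 hdfbghasbfas',
    'line2 jdsbvbsf',
    'line3 <match name="item1" rhs="domain.com"></match>',
    'line4 <match name="item2" rhs="domainn.com"></match>',
    'line5 <match name="item2" rhs="1010data.com"></match>',
]

domains = extract_domains(sample_lines)

# Assumption: a temporary directory replaces the poster's desktop folder.
out_path = os.path.join(tempfile.mkdtemp(), "result.txt")
with open(out_path, "w") as f_out:
    for domain in domains:
        f_out.write(domain + "\n")

print(domains)  # ['domain.com', 'domainn.com', '1010data.com']
```

With a real input file, `extract_domains(open(path))` would work the same way, because a file object iterates line by line.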

Edited 10 Years Ago by Gribouillis


snippsat 661 Master Poster · Answer 1 · 2013-12-28T14:15:10+00:00

It is better to use re.finditer when iterating over the text and saving the matches to a file.
It looks like this:

import re

data = '''\
line2 jdsbvbsf
line3 <match name="item1" rhs="domain.com"></match>
line4 <match name="item2" rhs="domainn.com"></match>
line5 <match name="item2" rhs="1010data.com"></match>'''

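# Write each captured rhs value on its own line.
with open('result.txt', 'w') as f_out:
    # '.' does not match a newline, so each match stays within one line.
    for match in re.finditer(r'rhs="(.*)"', data):
        f_out.write('{}\n'.format(match.group(1)))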

'''Output-->
domain.com
domainn.com
1010data.com
'''
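
One caveat with the pattern above: `.*` is greedy, so it gives the right answer here only because no sample line has another double quote after the `rhs` value. The sketch below shows the difference on a made-up line; the `note` attribute is hypothetical and does not appear in the original data.

```python
import re

# Hypothetical line with an extra quoted attribute after rhs.
line = '<match name="item1" rhs="domain.com" note="x"></match>'

# Greedy: .* runs to the last double quote on the line.
greedy = re.search(r'rhs="(.*)"', line).group(1)

# Negated class: [^"]* stops at the first closing quote.
strict = re.search(r'rhs="([^"]*)"', line).group(1)

print(greedy)  # domain.com" note="x
print(strict)  # domain.com
```

Using `[^"]*` (or the non-greedy `.*?`) keeps the capture limited to the attribute value if the input format ever grows.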

srinu_1 0 Newbie Poster · Answer 2 · 2013-12-29T10:25:08+00:00

Hi,

Please help me write the code for this:

text1.txt:
line1 <data>
line2 <items>
line3 <match name="item1" rhs="domain.com"></match>
line4 <match name="item2" rhs="domainn.com"></match>
line5 <match name="item2" rhs="1010data.com"></match>
line6 </items>
line7 </data>

text2.txt:
line1 djshjsdf
line2 sdfngjfg

Check whether domain.com, domainn.com, and 1010data.com appear in text2.txt; if they do not, write them to a third text file.

srinu_1 0 Newbie Poster · Answer 3 · 2013-12-30T04:35:25+00:00

Hi,

Please help me with this:

text1.txt:
line1 <data>
line2 <items>
line3 <match name="item1" rhs="domain.com"></match>
line4 <match name="item2" rhs="domainn.com"></match>
line5 <match name="item2" rhs="1010data.com"></match>
line6 </items>
line7 </data>

text2.txt:
line1 djshjsdf
line2 sdfngjfg

Check whether domain.com, domainn.com, and 1010data.com appear in text2.txt; if they do not, write them to a third text file. Here is my attempt:

import re
with open('C:\\Users\\Netskope\\Desktop\\m\\test1.txt', 'r') as f_in:
    with open('C:\\Users\\Netskope\\Desktop\\m\\test_compare.txt', 'r') as f_compare:
        with open('C:\\Users\\Netskope\\Desktop\\m\\result.txt', 'w') as f_out:
            d1 = f_in.read()
            d2 = f_compare.read()
            for match in re.finditer(r'rhs="(.*)"', d1):
                for match1 in re.finditer(r'rhs="(.*)"', d2):
                    if match != match1:
                        f_out.write('{}\n'.format(match.group(1)))

When I run it, it produces neither an error nor any output.

What changes are required?
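
A likely reason for the empty output: the comparison file contains no `rhs="..."` text, so the inner `re.finditer` yields nothing and the `write` call is never reached. The test `match != match1` also compares match objects rather than the captured text. One possible approach, sketched under assumptions, is to collect the domains from the first file and then check each one against the second file's text. The function name `missing_domains` and the in-memory sample strings are illustrative; with real files the two strings would come from `f_in.read()` and `f_compare.read()`.

```python
import re


def missing_domains(source_text, compare_text):
    """Return rhs values from source_text that do not occur in compare_text."""
    domains = re.findall(r'rhs="([^"]*)"', source_text)
    # Assumption: "in text2" means a plain substring test on the whole file.
    return [d for d in domains if d not in compare_text]


text1 = '''\
<data>
<items>
<match name="item1" rhs="domain.com"></match>
<match name="item2" rhs="domainn.com"></match>
<match name="item2" rhs="1010data.com"></match>
</items>
</data>'''

text2 = '''\
djshjsdf
sdfngjfg'''

result = missing_domains(text1, text2)
print(result)  # ['domain.com', 'domainn.com', '1010data.com']
```

Writing `result` to the third file is then one `f_out.write(d + '\n')` per item inside a `with open(..., 'w')` block.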