Python

Question

heheja 0 Newbie Poster

12 Years Ago

Hey,

Does anyone know how to get only the language ID from the Google Chrome site ("view-source:https://www.google.com/chrome?hl=en-GB") using this regex: "<option value=""([a-zA-Z])"">[&#0-9; a-zA-Z()]</option>"?


Gribouillis 1,391 Programming Explorer

12 Years Ago

You can't double the double quotes like this inside a double-quoted string:

>>> r"<option value=""([a-zA-Z])"">[&#0-9; a-zA-Z()]</option>" # bad
'<option value=([a-zA-Z])>[&#0-9; a-zA-Z()]</option>'
>>> r'<option value="([a-zA-Z])">[&#0-9; a-zA-Z()]</option>' # good
'<option value="([a-zA-Z])">[&#0-9; a-zA-Z()]</option>'

Use Kodos to debug regexes.

Edit: in Python, adjacent string literals are concatenated, so r"foo""bar""baz" is the same as r"foo" + "bar" + "baz".
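A minimal sketch of that point, assuming Python 3 (the thread itself uses Python 2). The names `bad`, `good`, and `tag` are illustrative only; the test string is made up and is not taken from the real page.

```python
import re

# Adjacent string literals are concatenated, so the doubled quotes vanish
# from the pattern instead of becoming literal quote characters.
bad = r"<option value=""([a-zA-Z])"">"
good = r'<option value="([a-zA-Z])">'

print(bad)   # <option value=([a-zA-Z])>
print(good)  # <option value="([a-zA-Z])">

# Hypothetical one-letter value, chosen because [a-zA-Z] has no quantifier.
tag = '<option value="d">'

print(re.match(bad, tag))            # None: the pattern contains no quotes
print(re.match(good, tag).group(1))  # d
```

Wrapping the pattern in single quotes is the simplest fix here, because the double quotes then need no escaping at all.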

Edited 12 Years Ago by Gribouillis


heheja 0 Newbie Poster · Answer 1 · 2012-07-25T07:55:25+00:00

I have tried the following, but it did not work:

def getsourcecode():
  url ="https://www.google.com/chrome?hl=da"
  req = urllib2.Request(url, None)
  source_code = urllib2.urlopen(req).read()
  #return (source_code)

  for line in getsourcecode: 
  matchObj = re.match(r"<option value=""([a-zA-Z]*)"">[&#0-9; a-zA-Z()]*</option>", line, re.M|re.I)

  if matchObj:
    print "matchObj.group(1) : ", matchObj.group(1)

  else:
    print "No match!!"

heheja 0 Newbie Poster · Answer 2 · 2012-07-25T08:31:10+00:00

My mistake. I just tried it and it didn't work, although I have tested the regex and it works.

def getsourcecode():
  url ="https://www.google.com/chrome?hl=da"
  req = urllib2.Request(url, None)
  source_code = urllib2.urlopen(req).read()
  #return (source_code)

  for line in getsourcecode: 
  matchObj = re.match(r"<option value="([a-zA-Z])">[&#0-9; a-zA-Z()]</option>", line)

  if matchObj:
    print "matchObj.group(1) : ", matchObj.group(1)

  else:
    print "No match!!"
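
For reference, here is a minimal sketch of one way the attempts above could be made to work. It assumes Python 3 and runs against a hard-coded sample instead of fetching the page. `SAMPLE_HTML`, `OPTION_RE`, and `extract_lang_ids` are hypothetical names, and the real page's markup may differ. It changes three things: the pattern is single-quoted so the double quotes stay literal, the character classes get quantifiers (plus a hyphen so that IDs such as en-GB match), and `re.findall` replaces `re.match`, which only matches at the start of a string.

```python
import re

# Hypothetical sample markup; the real page's HTML may differ.
SAMPLE_HTML = """
<select>
  <option value="da">Dansk</option>
  <option value="en-GB">English (United Kingdom)</option>
  <option value="de">Deutsch</option>
</select>
"""

# Single-quoted raw string, so the double quotes are literal characters.
# The + and * quantifiers let each class match more than one character,
# and the trailing hyphen in the first class allows IDs like "en-GB".
OPTION_RE = re.compile(r'<option value="([a-zA-Z-]+)">[&#0-9; a-zA-Z()]*</option>')


def extract_lang_ids(html):
    """Return the captured value attribute of every matching option tag."""
    # findall scans the whole string; re.match would only try position 0.
    return OPTION_RE.findall(html)


print(extract_lang_ids(SAMPLE_HTML))  # ['da', 'en-GB', 'de']
```

Note also that the posted function never returns `source_code` and is never called (`for line in getsourcecode:` iterates over the function object itself), so it would fail before the regex was ever tried.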