I am trying to use BeautifulSoup:

    soup = BeautifulSoup(page)

    td_tags = soup.findAll('td')
    i = 0
    for td in td_tags:
        i = i + 1
        print "td: ", td
        # re.search('[0-9]*\.[0-9]*', td)
        price = re.compile('[0-9]*\.[0-9]*').search(td)

I am getting an error:

    price= re.compile('[0-9]*\.[0-9]*').search(td)
    TypeError: expected string or buffer

td is an object, but I need to pass a string to re.compile().search(). Does anyone know how I can convert the object to a string?

Thank you for reposting via [code] tags, and using an appropriate title.

td is an instance (a BeautifulSoup tag object), not a string.

You probably want price = re.compile('[0-9]*\.[0-9]*').search(td.contents[0]), and/or to iterate over td.contents (via for tdc in td.contents:, say).
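
A minimal sketch of that advice, written for Python 3 and kept runnable without BeautifulSoup installed. FakeTd is a hypothetical stand-in for a td tag whose .contents is a list of child nodes, and first_price is a made-up helper name. It searches only the children that are actually strings, on the assumption that a child can also be a nested tag.

```python
import re


class FakeTd(object):
    """Hypothetical stand-in for a BeautifulSoup td tag (assumption for this sketch)."""

    def __init__(self, contents):
        self.contents = contents


def first_price(td):
    """Return the first decimal-looking substring in td's string children, or None."""
    for tdc in td.contents:
        if not isinstance(tdc, str):
            continue  # skip nested tags and other non-string children
        match = re.search(r'[0-9]*\.[0-9]*', tdc)
        if match:
            return match.group(0)
    return None


td = FakeTd(["Price: 1234.890 USD"])
print(first_price(td))
```

With a real tag, the same loop would run over td.contents unchanged; only the stand-in class goes away.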

You know, of course, that you can compile the pattern once, as in mypat = re.compile('[0-9]*\.[0-9]*'), and then later just call mypat.search(td.contents[0]). That is usually a little more efficient if the pattern is used more than once, since the compile step is not repeated inside the loop.
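
As a rough illustration of compiling once and reusing the pattern, the sketch below (Python 3) runs one compiled pattern over several strings. The cell_texts list is made-up sample data standing in for text already pulled out of the td tags.

```python
import re

# Compile once, outside the loop, and reuse the compiled object.
mypat = re.compile(r'[0-9]*\.[0-9]*')

# Made-up sample data (assumption): text already extracted from table cells.
cell_texts = ["12.50", "n/a", "Total: 1234.890"]

prices = []
for text in cell_texts:
    match = mypat.search(text)
    # search() returns None when nothing matches, so guard before group().
    prices.append(match.group(0) if match else None)

print(prices)
```

Python's re module also caches recently compiled patterns internally, so the saving is often small; the clearer gain is that the pattern gets a name and is defined in one place.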

mypat = re.compile('[0-9]*\.[0-9]*')

This pattern only works if the number is floating point. I need to construct it so it works for both decimal and whole numbers.

I have a scenario where I could get either decimal numbers (1234.890) or whole numbers (10000).

How do you get the proper regex?

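There are lots of ways, depending on how paranoid you need to be. One way is to make the decimal point and the digits after it optional, as in mypat = re.compile('[0-9]*(\.[0-9]*|$)'), with or without the $ depending on your needs. Regexp matching is greedy and will try to grab as much as it can, which does what you want here. Be aware that because every piece uses *, this pattern can also match an empty string or a lone '.'; if that matters, require at least one digit, as in '[0-9]+(\.[0-9]+)?'.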
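
To see how the optional-decimal idea behaves on the two inputs from the question, here is a small Python 3 comparison. The name loose holds the pattern from the reply; strict and find_number are assumptions added for this sketch, with strict requiring at least one digit.

```python
import re

loose = re.compile(r'[0-9]*(\.[0-9]*|$)')   # pattern from the reply
strict = re.compile(r'[0-9]+(\.[0-9]+)?')   # assumption: at least one digit required


def find_number(text):
    """Return the first whole or decimal number in text, or None (hypothetical helper)."""
    match = strict.search(text)
    return match.group(0) if match else None


for text in ["1234.890", "10000", "cost 10000 now"]:
    # loose always finds something, because the empty string before $ matches.
    print(repr(text), repr(loose.search(text).group(0)), repr(find_number(text)))
```

Both patterns handle a cell that holds nothing but the number. They differ when other text follows the number: the loose pattern then falls through to an empty match at the end of the string, while the stricter one still returns the digits.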
