Using the HTMLParser in Python

Question

delucasvb 0 Newbie Poster

15 Years Ago

Hi,

This is my first post here, since I am new to Python. I've been experimenting with it a bit and I think I have a grasp of the basics now.

I've run into a problem with the HTMLParser: I want to use it to collect the URLs contained in <a></a> tags, which I have done successfully, but now I also want to extract every single word that is displayed in the browser from an HTML file. So not the markup itself (<br /> and other tags), but just the text that any visitor can see.

Can I use the HTMLParser for this?

Many thanks in advance!

html-css python


d5e5 109 Master Poster · Answer 1 · 2010-05-08T02:08:39+00:00

You may want to look at http://svn.w4py.org/ZPTKit/trunk/ZPTKit/htmlrender.py. I haven't tested it, but it does look like it uses HTMLParser to do what you want.
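
For what it's worth, here is a minimal sketch of the general idea, not taken from the linked file. It assumes Python 3, where the module is html.parser (in Python 2 it was called HTMLParser). The class name VisibleTextParser, the helper visible_words, and the set of skipped tags are made up for illustration. It overrides handle_data to collect text and ignores anything inside elements a browser does not render as page text.

```python
from html.parser import HTMLParser  # Python 2: from HTMLParser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping elements that are not shown as page text."""

    # Assumption: these are the elements whose contents should be ignored.
    # Extend the set if your pages need it.
    SKIPPED = {"script", "style", "head", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []      # pieces of visible text, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text found outside skipped elements.
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_words(html):
    """Return the visible words of an HTML string as a list (hypothetical helper)."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks).split()


if __name__ == "__main__":
    page = ('<html><head><title>T</title></head>'
            '<body><p>Hello<br />big <a href="x">world</a></p></body></html>')
    print(visible_words(page))
```

Note that this treats every tag as a word boundary, so a word split by inline markup (for example wor<b>ld</b>) would come out as two words, and it knows nothing about CSS that hides elements.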