I'm looking for an open source web crawler that can be easily modified. Is anyone aware of a good one? I've just enrolled in a data mining course and would like to develop an algorithm that categorizes web pages. I need to be able to control which pages are indexed and how those pages are categorized, both based on the page content. Thanks.
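To make the requirement concrete, here is a rough sketch of the kind of hooks I have in mind. It is not taken from any existing crawler: the names `crawl`, `fetch`, `should_index`, and `categorize` are all hypothetical, and it uses only the Python standard library. The crawler decides nothing on its own; one callback says whether a page is indexed and another assigns its category, both looking at the page text.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects text nodes and outgoing links from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Record the href of every anchor tag that has one.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Keeps every text node, including script/style content.
        self.text_parts.append(data)


def crawl(start_url, fetch, should_index, categorize, max_pages=50):
    """Breadth-first crawl; returns {url: category} for indexed pages.

    fetch(url) -> HTML string or None   (hypothetical, supplied by caller)
    should_index(url, text) -> bool     (hypothetical content-based filter)
    categorize(text) -> category label  (hypothetical classifier)
    """
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        text = " ".join(parser.text_parts)
        # Both decisions are delegated to the caller's callbacks.
        if should_index(url, text):
            index[url] = categorize(text)
        # Resolve relative links and queue pages not seen before.
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index
```

Passing `fetch` in as a parameter is only so the sketch can be tried against a dictionary of canned pages; a real crawler would also need robots.txt handling, rate limiting, and error handling, which is why I would rather modify an existing one.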
SkyRenderX 0 Newbie Poster