I want to build a search feature for a website. I have a rough plan, but I'm not sure how to go about indexing the site. My idea is to run a script over the pages that extracts everything between the <p> and </p> tags and stores each paragraph separately in a database.
I don't really know which functions would help me do this. If anyone could point me in the right direction on what I should be reading to make this happen, it would be very helpful.
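
To make the plan concrete, here is a minimal sketch of what I have in mind, assuming Python and only its standard-library `html.parser` and `sqlite3` modules. The names `ParagraphExtractor`, `extract_paragraphs`, `index_page`, `search`, and the `paragraphs` table are made up for illustration, and the `LIKE` lookup is just a placeholder for a real search, so treat this as a rough outline rather than the right way to do it.

```python
import sqlite3
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the text inside each <p>...</p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None means we are currently outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # a <p> left unclosed is ended by the next <p>
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        # Text inside nested inline tags (<b>, <a>, ...) is kept as well.
        if self._buffer is not None:
            self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()  # keep a trailing paragraph that was never closed

    def _flush(self):
        if self._buffer is not None:
            # Collapse runs of whitespace so each paragraph is one clean line.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None


def extract_paragraphs(html):
    """Return the paragraph texts of one HTML page, in document order."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


def index_page(conn, url, html):
    """Store each paragraph of a page as its own row; returns the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS paragraphs ("
        "id INTEGER PRIMARY KEY, url TEXT NOT NULL, "
        "position INTEGER NOT NULL, body TEXT NOT NULL)"
    )
    # Re-indexing a page replaces its old rows instead of duplicating them.
    conn.execute("DELETE FROM paragraphs WHERE url = ?", (url,))
    rows = [(url, i, text) for i, text in enumerate(extract_paragraphs(html))]
    conn.executemany(
        "INSERT INTO paragraphs (url, position, body) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


def search(conn, term):
    """Naive substring search; a placeholder, not a real ranking search."""
    cur = conn.execute(
        "SELECT url, body FROM paragraphs WHERE body LIKE ? "
        "ORDER BY url, position",
        (f"%{term}%",),
    )
    return cur.fetchall()
```

I would call `index_page(conn, url, html)` once per page, with `conn = sqlite3.connect("site_index.db")` (the file name is arbitrary), and then run `search(conn, "some word")` against the stored rows. I went with a parser instead of a regular expression because nested or unclosed tags seem easy to get wrong with pattern matching, but I don't know if that is the usual approach.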