In PHP, I've been using simple_html_dom to extract URLs from web pages. It works most of the time, but not always.
For example, it fails on ArsTechnica.com, apparently because of how that site marks up its links.
One thing I do know is that Firefox reliably finds every link on a page; that's why all the links are clickable when you load a page in the browser.
So I was wondering: is it possible to download the open-source Firefox (or Chrome) browser engine, pass it some parameters, and have it return a list of all the URLs on a page?
I could then feed that output into PHP by whatever means, e.g. shell_exec().
Is this possible? How do I do it?
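To make it concrete, here is the kind of pipeline I'm imagining, as a rough sketch only. It assumes Chromium is installed and uses its headless mode's `--dump-dom` flag to print the rendered HTML; the `grep`/`sed` step that pulls out `href` values is my own crude approximation, not a real HTML parser, and the URL is just an example:

```shell
# Sketch: render the page with a real browser engine, then scrape href values.
# Assumes a `chromium` binary with headless support is on PATH.
chromium --headless --dump-dom 'https://arstechnica.com/' \
  | grep -oE 'href="[^"]+"' \
  | sed -E 's/^href="|"$//g' \
  | sort -u
```

From PHP I could then capture that list with something like `shell_exec('chromium --headless --dump-dom ... | ...')` and split it on newlines.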