Hi, I'm writing a web crawler for my personal site, and I'm looking at using a regex to extract the URLs. However, the pages contain both absolute and relative URLs, and I want to match only URLs on my own site (mysite.com).
So it should match:
/index.php
image1.jpg
page1.html
Http://mysite.com/
Http://mysite.com/page1.html
Http://Wiki.mysite.com/
Wiki.mysite.com/
but it should not match:
Bob
Www.google.com
Mailto:Admin@mysite.com
Can anyone give me assistance? I'd post what I have so far, but it is this:
Nothing.
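
For reference, here is a rough sketch of the kind of pattern that would satisfy the examples above, written in Python since no language was specified. It is only a starting point, not a definitive solution. The names `build_site_url_pattern` and `is_on_site` are made up for illustration, and the list of file extensions is an assumption: it is the only thing that separates a relative link like `page1.html` from a bare word like `Bob` or a foreign host like `Www.google.com`, so it would need to be adjusted to the file types the site actually uses.

```python
import re


def build_site_url_pattern(domain="mysite.com"):
    """Compile a case-insensitive pattern for URLs that belong to `domain`.

    Hypothetical helper for illustration; the extension list is an assumption.
    """
    host = re.escape(domain)
    return re.compile(
        rf"""
        ^(?:
            (?:(?:https?:)?//)?          # optional http(s) scheme, or a bare //
            (?:[a-z0-9-]+\.)*{host}      # the domain, with any subdomains
            (?:[/:?\#].*)?               # optional port, path, query, or fragment
          |
            /(?!/).*                     # root-relative path such as /index.php
          |
            [^/:?\#\s]+                  # bare file name (no scheme, no slash)
            \.(?:php|html?|jpe?g|png|gif|css|js)   # assumed known extensions
            (?:[?\#].*)?                 # optional query or fragment
        )$
        """,
        re.IGNORECASE | re.VERBOSE,
    )


SITE_URL = build_site_url_pattern("mysite.com")


def is_on_site(url):
    """Return True if `url` looks like a link to the site itself."""
    return SITE_URL.match(url.strip()) is not None


if __name__ == "__main__":
    samples = [
        "/index.php", "image1.jpg", "page1.html",
        "Http://mysite.com/", "Http://mysite.com/page1.html",
        "Http://Wiki.mysite.com/", "Wiki.mysite.com/",
        "Bob", "Www.google.com", "Mailto:Admin@mysite.com",
    ]
    for s in samples:
        print(s, is_on_site(s))
```

Note that this sketch only classifies a URL that has already been pulled out of the page (for example from an `href` attribute). It also does not handle relative links that contain directories, such as `images/pic.jpg`; covering those safely would need an extra rule so that `www.google.com/index.html` is not mistaken for a relative path.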