Hiya,
I need to know something.
At first I thought sitemap XML files listed all the .html, .htm, .shtm and .shtml files, i.e. all the pages of the website.
But now I see that sitemap XML files can also list other XML files. Check this one out to see what I mean:
https://www.rocktherankings.com/sitemap_index.xml
So that means I have to program my web crawler to go one level deep to find the site's links (the html files).
The question is: does this happen more than one level deep?
I mean, does it go like this:
I go to a sitemap XML file.
I see further XML files listed. I click over to one of them, thus going one level deep.
I see more XML files listed. I click over to one of them, thus going two levels deep.
How many levels deep can a site go like this before listing its html files?
I need to know this so I can program how many levels my crawler should check before giving up. I do not want it to end up in an endless loop or get caught in a trap.
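
To make it concrete, here is a rough Python sketch of the kind of guard I have in mind. It is only a sketch under assumptions: MAX_DEPTH = 3 is a number I picked arbitrarily (it is exactly the value I am asking about), and collect_page_urls, local_name and the fetch callable are names I made up for illustration. It follows nested sitemap files, remembers the ones it has already visited so it cannot loop, and gives up once it is nested deeper than the limit.

```python
import xml.etree.ElementTree as ET

MAX_DEPTH = 3  # assumed cut-off, picked arbitrarily; not taken from any spec


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def collect_page_urls(sitemap_url, fetch, depth=0, seen=None):
    """Return the page URLs reachable from sitemap_url.

    fetch is a hypothetical callable: it takes a URL and returns the XML
    document found there (as str or bytes).
    """
    if seen is None:
        seen = set()
    # Stop on a repeat visit (loop) or when nested too deep (trap).
    if sitemap_url in seen or depth > MAX_DEPTH:
        return []
    seen.add(sitemap_url)

    root = ET.fromstring(fetch(sitemap_url))

    # Each <sitemap> or <url> entry carries its address in a direct <loc> child.
    locs = []
    for entry in root:
        for child in entry:
            if local_name(child.tag) == "loc" and child.text:
                locs.append(child.text.strip())

    if local_name(root.tag) == "sitemapindex":
        # The entries are further sitemap files: go one level deeper.
        pages = []
        for loc in locs:
            pages.extend(collect_page_urls(loc, fetch, depth + 1, seen))
        return pages

    # Otherwise treat it as an ordinary sitemap whose entries are pages.
    return locs
```

For a real crawl, fetch could be something like `lambda url: urllib.request.urlopen(url).read()`. The sketch does not handle gzipped sitemaps or network errors.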