Hello,
I'm trying to scrape a website using XPath and am running into a little trouble. This is the first time I've used XPath, so I'm still finding my feet :/
The relevant source code of the website I'm trying to scrape is:
<span id="ctl00_ContentPlaceHolder1_lblEvents">
<div class="contentmain">
<div class="textArea1">
<strong>14 June 2010 - 14 September 2010</strong>
<br />
<a href="events.aspx?evID=6591" class="events">Davies Display</a>
<br />
<strong>Pontypridd 2010</strong>
<br />
</div>
.. There are more of these 'textArea1' divs, and each has the same structure as the one above.
</div>
... Again, there are more of these 'contentmain' divs, each containing other textArea1 divs.
</span>
So far, I have written the following code, which gets all the 'contentmain' divs.
// Get the whole page source in order to filter out events
$RCT_Source = new DOMDocument;
$RCT_Source->loadHTMLFile('http://domain.co.uk/events.aspx');
$XPath = new DOMXPath($RCT_Source);
$Event_List = $XPath->query("//span[@id='ctl00_ContentPlaceHolder1_lblEvents']/div[@class='contentmain']");
foreach ($Event_List as $Event) {
}
But here's where I'm stuck.
What I need to do now is, for each $Event, fetch all the 'textArea1' divs and grab all of the data inside each div (the text within the <strong> tags, the <a> tag, etc.).
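To make the goal concrete, here is a rough sketch of the kind of thing I'm picturing. It relies on passing a context node as the second argument to DOMXPath::query() so that the inner queries are relative to each div. It runs against a copy of the sample markup above rather than the live page, and the $Sample_HTML, $Events and $Text_Area names, along with the array keys, are placeholders I made up for illustration. I'm not sure this is the right approach, so treat it as a guess.

```php
<?php
// Sample markup copied from the snippet above. The real script would call
// loadHTMLFile('http://domain.co.uk/events.aspx') instead of loadHTML().
$Sample_HTML = <<<'HTML'
<span id="ctl00_ContentPlaceHolder1_lblEvents">
<div class="contentmain">
<div class="textArea1">
<strong>14 June 2010 - 14 September 2010</strong>
<br />
<a href="events.aspx?evID=6591" class="events">Davies Display</a>
<br />
<strong>Pontypridd 2010</strong>
<br />
</div>
</div>
</span>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$RCT_Source = new DOMDocument;
$RCT_Source->loadHTML($Sample_HTML);
$XPath = new DOMXPath($RCT_Source);

$Events = array(); // placeholder result array, one entry per 'textArea1' div

$Event_List = $XPath->query("//span[@id='ctl00_ContentPlaceHolder1_lblEvents']/div[@class='contentmain']");
foreach ($Event_List as $Event) {
    // No leading slash, so the path is evaluated relative to $Event.
    foreach ($XPath->query("div[@class='textArea1']", $Event) as $Text_Area) {
        // Gather the text of every <strong> directly inside this div.
        $Strong_Texts = array();
        foreach ($XPath->query('strong', $Text_Area) as $Strong) {
            $Strong_Texts[] = trim($Strong->textContent);
        }

        // First <a class="events"> inside this div, or null if there is none.
        $Link = $XPath->query("a[@class='events']", $Text_Area)->item(0);

        $Events[] = array(
            'strong' => $Strong_Texts,
            'title'  => $Link ? trim($Link->textContent) : null,
            'href'   => $Link ? $Link->getAttribute('href') : null,
        );
    }
}

print_r($Events);
```

If that is roughly right, swapping loadHTML($Sample_HTML) back to loadHTMLFile() should be the only change needed for the live page, assuming its markup matches the snippet above.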
Please reply if you'd like more info.
Any help whatsoever would be much appreciated.
Thanks.