I'm trying to understand XPath, but I've come across an issue I cannot seem to find an answer for. In this case it seems that XPath is not returning what it should.
I've got a sample HTML file, test.html:
<html>
<div>
<p>1</p>
<p>2</p>
<p>3</p>
<p>4</p>
<p>5</p>
</div>
</html>
And my PHP file, test.php:
<?php
echo "<pre>";
$url = "test.html";
$oldSetting = libxml_use_internal_errors(true);
libxml_clear_errors();
$html = new DOMDocument();
$html->loadHtmlFile($url);
$xpath = new DOMXPath($html);
$titles = $xpath->query("//p");
foreach ($titles as $title){
echo $title->nodeValue."<br />";
}
libxml_clear_errors();
libxml_use_internal_errors($oldSetting);
echo "</pre>";
?>
I can set the XPath query to //p and get the content of all the p tags on screen. That's good.
Set to /html//p I get the same. That's good.
Set to //p[1] I get the first p tag. That's good.
Set to //p[5] I get the 5th p tag. That's good.
That's all groovy.
But if I do /html/div/p I get nothing. I've messed with a ton of similar queries with no luck.
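To make this easier to reproduce, here is a minimal sketch that runs every query mentioned above in one go and prints how many nodes each one matches. It assumes the sample markup is inlined as a string rather than read from test.html, and the names $markup, $queries and $counts are just made up for illustration. It also dumps the document with saveHTML() at the end, so the tree the parser actually built can be compared against the original file.

```php
<?php
// Same markup as test.html, inlined so the script is self-contained (an assumption for this sketch).
$markup = "<html><div><p>1</p><p>2</p><p>3</p><p>4</p><p>5</p></div></html>";

$oldSetting = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

// Every query tried in the question.
$queries = ["//p", "/html//p", "//p[1]", "//p[5]", "/html/div/p", "/html/div//p"];

$counts = [];
foreach ($queries as $query) {
    $nodes = $xpath->query($query);
    // query() returns a DOMNodeList; length is the number of matched nodes.
    $counts[$query] = $nodes->length;
    echo $query . " => " . $counts[$query] . " node(s)\n";
}

// Serialize the parsed document to see its structure after loading.
echo $doc->saveHTML();

libxml_clear_errors();
libxml_use_internal_errors($oldSetting);
```

Comparing the saveHTML() output with test.html line by line should show whether the absolute paths are being evaluated against the same structure I typed in.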
I'm trying to read the URL of an image from a website. Using Firefox's Firebug plugin I can copy the XPath, and I get something like:
/html/body/div[2]/div/div[2]/div/div/div/div[2]/div/div/div[2]/p/img
But in PHP I get no result unless I remove all the "[2]", take out some of the divs and place a // before img.
So what's going on here? Every example I've read says this is correct, but in the very simple example above, a plain /html/div/p or /html/div//p does not work.
Thanks for your help!