Hello guys!
At school we were given a homework assignment to read some information from a web page and display it. While searching the net for ways to approach this, I came across HTML Agility Pack and decided to use it. But I have some problems with parsing the content.
Here is the part of the HTML that I have to get values from (marked in red).
<h3>marec - maj 2009</h3>
<div class="graf_table">
<table summary="layout table">
<tr>
<th>DATUM</th>
<td class="datum">10.03.2009</td>
<td class="datum">24.03.2009</td>
<td class="datum">07.04.2009</td>
<td class="datum">21.04.2009</td>
<td class="datum">05.05.2009</td>
<td class="datum">06.05.2009</td>
</tr>
<tr>
<th>Maloprodajna cena [EUR/L]</th>
<td>0,96000</td>
<td>0,97000</td>
<td>0,99600</td>
<td>1,00800</td>
<td>1,00800</td>
<td>1,01000</td>
</tr>
<tr>
<th>Maloprodajna cena [SIT/L]</th>
<td>230,054</td>
<td>232,451</td>
<td>238,681</td>
<td>241,557</td>
<td>241,557</td>
<td>242,036</td>
</tr>
<tr>
<th>Prodajna cena brez dajatev</th>
<td>0,33795</td>
<td>0,34628</td>
<td>0,36795</td>
<td>0,37795</td>
<td>0,37795</td>
<td>0,37962</td>
</tr>
<tr>
<th>Trošarina</th>
<td>0,46205</td>
<td>0,46205</td>
<td>0,46205</td>
<td>0,46205</td>
<td>0,46205</td>
<td>0,46205</td>
</tr>
<tr>
<th>DDV</th>
<td>0,16000</td>
<td>0,16167</td>
<td>0,16600</td>
<td>0,16800</td>
<td>0,16800</td>
<td>0,16833</td>
</tr>
</table>
</div>
So far I have managed to write the code below, which gives me all the values from the table.
So my question is: what should I add or change in the query so that it returns only the values from the cells in the rows whose table header is DATUM or Maloprodajna cena [EUR/L]?
using System.Linq;
using System.Text;
using HtmlAgilityPack;

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
// Set the options before Load so they apply while the file is parsed.
doc.OptionCheckSyntax = true;
doc.OptionFixNestedTags = true;
doc.OptionAutoCloseOnEnd = true;
doc.OptionOutputAsXml = true;
doc.OptionDefaultStreamEncoding = Encoding.Default;
doc.Load(@"C:\Users\User\Desktop\petrol.celota.htm");
// ".//table" keeps the search inside the current div; a bare "//table" starts from the document root.
var query = from html in doc.DocumentNode.SelectNodes("//div[@class='graf_table']").Cast<HtmlNode>()
            from table in html.SelectNodes(".//table").Cast<HtmlNode>()
            from row in table.SelectNodes("tr").Cast<HtmlNode>()
            from cell in row.SelectNodes("th|td").Cast<HtmlNode>()
            select new { Table = table.Id, CellText = cell.InnerHtml };
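One direction that might work is to let XPath pick the row by its header text and then take only the td cells of that row. The following is only a minimal sketch under some assumptions: PriceTable, SampleHtml and RowValues are made-up names for illustration, the sample is a shortened copy of the table above loaded from a string instead of the file, and the header text is assumed to match exactly and to contain no apostrophe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class PriceTable
{
    // Shortened copy of the table from the question, used only for this sketch.
    public const string SampleHtml = @"
<div class=""graf_table"">
<table summary=""layout table"">
<tr><th>DATUM</th><td class=""datum"">10.03.2009</td><td class=""datum"">24.03.2009</td></tr>
<tr><th>Maloprodajna cena [EUR/L]</th><td>0,96000</td><td>0,97000</td></tr>
<tr><th>DDV</th><td>0,16000</td><td>0,16167</td></tr>
</table>
</div>";

    // Hypothetical helper: texts of the <td> cells in the row whose <th> equals header.
    // The header is pasted into the XPath as-is, so it must not contain an apostrophe.
    public static List<string> RowValues(HtmlDocument doc, string header)
    {
        string xpath = "//div[@class='graf_table']//tr[th[normalize-space(.)='"
                       + header + "']]/td";
        HtmlNodeCollection cells = doc.DocumentNode.SelectNodes(xpath);
        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (cells == null) return new List<string>();
        return cells.Select(c => c.InnerText.Trim()).ToList();
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);

        List<string> dates = RowValues(doc, "DATUM");
        List<string> prices = RowValues(doc, "Maloprodajna cena [EUR/L]");

        // Pair each date with the price in the same column.
        foreach (var pair in dates.Zip(prices, (d, p) => new { Date = d, Price = p }))
            Console.WriteLine(pair.Date + " " + pair.Price);
    }
}
```

If the real page contains several graf_table blocks (one per period), this XPath would collect the cells from all of them, so the search would probably have to start from a single div node with a relative path (".//tr[...]/td") instead.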