Hi,
Given the sample XML below:
<hdr>
<sect role="para" label="1234">
<legref>asdf</legref>
<cite role="com">abc</cite>
<cite role="rg">RTY 08</cite>
<cite role="rg">SDF 05</cite>
<othertag>some textdata here
<cite role="rg">SXF 05</cite>
</othertag>
</sect>
<sect role="para" label="2345">
<cite role="com">xyz</cite>
<cite role="rg">WER 10</cite>
<cite role="rg">TRS 10</cite>
<cite role="rg">WER 10</cite>
<legref>qwert</legref>
</sect>
</hdr>
I need to extract the label attribute value of each <sect role="para" label="9999"> plus all of its descendant <cite role="rg"> elements.
The <cite role="rg"> elements are always wrapped in a <sect role="para" label="9999"> (i.e. a sect element with a role="para" attribute and a label attribute; "9999" stands for the paragraph number).
A <cite role="rg"> can have any number of siblings and can appear at lower levels of the XML tree, but always inside the wrapping <sect ...> element.
Can somebody please help me construct an XPath expression that gives a result something like the one below:
<sect label="1234">
<cite role="rg">RTY 08</cite>
<cite role="rg">SDF 05</cite>
<cite role="rg">SXF 05</cite>
<sect label="2345">
<cite role="rg">WER 10</cite>
<cite role="rg">TRS 10</cite>
<cite role="rg">WER 10</cite>
Somebody suggested the following CSS expression, but 'SXF 05' in the example was missed because it appears one level lower than the other cite elements (its parent is <othertag>, not <sect>, so x.parent['label'] does not reach the label):
p doc.css('cite[role = "rg"]').map { |x| [x.text, x.parent['label']] }
Thanks in advance,
emmanuel