Help needed please. Desperate. I have a 3.something GB file that contains records that look like this:
<tag>
<number>1</number>
<info>blah blah</info>
<more>Lorem Ipsum</more>
<anothertag>The quick brown fox...</anothertag>
<id>32444</id>
<yetmore>blah blah</yetmore>
</tag>
<tag>
<number>2</number>
<info>qwerty</info>
<more>yada yada qwerty</more>
<anothertag>yada yada qwerty</anothertag>
<id>32344</id>
<yetmore>yada yada qwerty</yetmore>
</tag>
<tag>
<number>3</number>
<info>yada yada qwerty</info>
<more>yada yada qwerty</more>
<anothertag>yada yada</anothertag>
<whatever>yada</whatever>
<id>32444</id>
<yetmore>yada yada</yetmore>
</tag>
I need to find the records whose <id> value also appears in another record (in the sample above, records 1 and 3 both have 32444). I was thinking of reading all the <id> values into a list, finding the duplicates somehow, and then iterating over the file a second time to remove those records, but basically I'm not sure what to do. I'll start something and post back later, but any help would be appreciated.
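
For what it's worth, here is a rough sketch of the two-pass idea described above, in Python. It is only a starting point and rests on assumptions that may not hold for the real file: every `<tag>`, `</tag>` and `<id>...</id>` sits on its own line as in the sample, the file is UTF-8, and the set of all id values fits in memory. The names `iter_records`, `duplicate_ids` and `split_duplicates` are made up for illustration. It reads line by line rather than using an XML parser, because the sample has no single root element.

```python
import re
from collections import Counter

# Assumes the <id> element is on a single line, as in the sample.
ID_RE = re.compile(r"<id>(.*?)</id>")

def iter_records(lines):
    """Yield (record_text, id_or_None) for each <tag>...</tag> block."""
    buf, rec_id = [], None
    for line in lines:
        stripped = line.strip()
        if stripped == "<tag>":
            buf, rec_id = [line], None   # start a new record
            continue
        if not buf:
            continue                     # ignore anything outside a record
        buf.append(line)
        match = ID_RE.search(line)
        if match:
            rec_id = match.group(1)
        if stripped == "</tag>":
            yield "".join(buf), rec_id
            buf = []

def duplicate_ids(lines):
    """Pass 1: return the set of id values that occur in more than one record."""
    counts = Counter(
        rec_id for _, rec_id in iter_records(lines) if rec_id is not None
    )
    return {rec_id for rec_id, n in counts.items() if n > 1}

def split_duplicates(path, unique_path, dup_path):
    """Pass 2: write records with a repeated id to dup_path, the rest to unique_path."""
    with open(path, encoding="utf-8") as src:
        dups = duplicate_ids(src)
    with open(path, encoding="utf-8") as src, \
         open(unique_path, "w", encoding="utf-8") as unique_out, \
         open(dup_path, "w", encoding="utf-8") as dup_out:
        for text, rec_id in iter_records(src):
            (dup_out if rec_id in dups else unique_out).write(text)
    return dups
```

Calling `split_duplicates("big.xml", "unique.xml", "dups.xml")` (file names are placeholders) would read the input twice and never hold more than one record plus the id counts in memory. On the three sample records, `duplicate_ids` would report `{"32444"}`. If the goal is to keep the first occurrence of each id and drop only the later ones, pass 2 would need a "seen" set instead of the precomputed duplicate set.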