Hello everyone,
I am having some trouble parsing an XML document with a Perl script.
I have a file like the attached one (I have taken only part of the original file,
as it is too big to post here and is hard to analyze manually).
What I have to do is count the number of different authors who published a paper
in a single year.
To count this, I can use either the issue printdate or the cpyrtdate.
The trouble is that I want to replace the hard-coded date in the first line below
with a variable, as in the second line, and I am not able to pass the value in:
my $nodeset = $xp->findnodes('//issue[@printdate="1917-07-00"]/..//author');
my $nodeset = $xp->findnodes('//issue[@printdate="$x"]/..//author');
where $x comes from a list containing all the years, such as 1917, 1913, etc.
This is the code I am using, but it only handles one hard-coded date:
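use strict;
use warnings;
use XML::XPath;

my $file = 'Aj.xml';
my $xp   = XML::XPath->new(filename => $file);

# Authors under the parent of every issue printed on this hard-coded date
my $nodeset = $xp->findnodes('//issue[@printdate="1917-07-00"]/..//author');

if (my @nodelist = $nodeset->get_nodelist) {
    # Collect the text of each author node and sort the names
    my @authors = sort map { $_->string_value } @nodelist;
    local $" = "\n";    # print one author per line
    print "I found these authors:\n@authors\n";
}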
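One likely reason the variable version fails: Perl does not interpolate variables inside single-quoted strings, so '$x' reaches XPath as the literal text $x. A minimal sketch of building the expression with a double-quoted (qq) string is below. It assumes $x holds only the year (e.g. 1917) while printdate holds a full date (e.g. 1917-07-00), which is why it uses the XPath starts-with() function instead of an exact match. The helper name xpath_for_year is hypothetical, not part of XML::XPath.

```perl
use strict;
use warnings;

# Hypothetical helper: build the XPath expression for one year.
sub xpath_for_year {
    my ($year) = @_;
    # qq{} interpolates $year; the @ is escaped so Perl does not
    # try to interpolate an array named @printdate.
    return qq{//issue[starts-with(\@printdate, "$year")]/..//author};
}

# Show the expression that would be passed to $xp->findnodes(...)
print xpath_for_year($_), "\n" for qw(1913 1917);
```

The resulting string could then be passed to $xp->findnodes() in place of the hard-coded expression, once per year in the list.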
I have analyzed the file manually, and these are the issues and authors to be considered:
issue printdate="1913-01-00"
Author names:
DavidW.Cornelius.
FrederickSlate
issue printdate="1913-02-00"
Author names:
DavidW.Cornelius.
LachlanGilchrist
issue printdate="1917-08-00"
Author names:
H.W.Nichols.
issue printdate="1917-07-00"
Author names:
JohnZeleny
So the output I want should look like this:
Year  No. of different authors publishing in that year
1913  3
1917  2
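The counting step itself can be sketched with a hash of hashes keyed by year and author, so that an author who appears in several issues of the same year is counted only once. The sample pairs below are copied from the manual analysis above; in the real script they would come from the XML file. The helper name count_authors_by_year is hypothetical.

```perl
use strict;
use warnings;

# Sample (printdate, author) pairs taken from the manual analysis
my @pairs = (
    [ '1913-01-00', 'DavidW.Cornelius.' ],
    [ '1913-01-00', 'FrederickSlate' ],
    [ '1913-02-00', 'DavidW.Cornelius.' ],
    [ '1913-02-00', 'LachlanGilchrist' ],
    [ '1917-08-00', 'H.W.Nichols.' ],
    [ '1917-07-00', 'JohnZeleny' ],
);

# Hypothetical helper: returns a (year => distinct author count) list.
sub count_authors_by_year {
    my @rows = @_;
    my %seen;
    for my $row (@rows) {
        my ($printdate, $author) = @$row;
        my $year = substr $printdate, 0, 4;   # "1913-01-00" -> "1913"
        $seen{$year}{$author} = 1;            # duplicates collapse here
    }
    return map { $_ => scalar keys %{ $seen{$_} } } keys %seen;
}

my %count = count_authors_by_year(@pairs);
printf "%s %d\n", $_, $count{$_} for sort keys %count;
```

With the sample data this prints 1913 3 and 1917 2, matching the table above.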
I am rather stuck on this. Can somebody please help?
Thanks
Aj