Hey,
I'm writing an XML parser in C++. Currently it works, but too much of what needs to be done is left up to the end user. I'm trying to figure out a way to have a clean, more encapsulated interface for the parser, but I can't seem to think of one that I like.
This is the declaration of the parser:
class XMLParser
{
public:
    XMLParser();
    ~XMLParser();
    int OpenFile(std::string filename);
    void CloseFile();
    int ReadTag(std::string* name,TagType* tag_type,bool* attributes);
    int ReadAtribute(std::string* name,std::string* value,TagType* tag_type,bool* moreAttributes);
    int ReadData(std::string* data);
private:
    std::string m_filename;
    std::ifstream m_fin;
};
Definition of TagType:
enum TagType
{
    Unknown,
    Open,
    Close,
    StandAlone
};
The parser works with three main methods. The first is ReadTag(). It returns 1 on success and fills name with the name of the tag, tag_type with the type of the tag (Unknown, i.e. 0, if the type is not yet known), and attributes with whether there are any attributes to be read (if attributes is true, tag_type is always Unknown).
ReadAttribute() should only be called directly after a call to ReadTag() that sets attributes to true (or after a call to ReadAttribute() that reports more attributes). It gives you the name of the attribute, its value, the type of the tag (again, if known), and whether there are more attributes (again, if true, tag_type is always Unknown).
ReadData() should be called only once the attribute flag (from either ReadTag() or ReadAttribute()) is false. It fills the string with data from the file until it encounters the beginning of another tag, at which point you should call ReadTag() and start the cycle over. Here is an example of how these methods may be used:
#include <cstdlib>//for system()
#include <iostream>
#include <string>
#include "ReznebXML.h"
using namespace std;

int main()
{
    cout << "XML test" << "\n\n";
    XMLParser parse;//stack object, so nothing is leaked on the early returns
    parse.OpenFile("test.xml");
    string str1;
    string str2;
    bool attributes;
    TagType type;
    string root;
    if(!parse.ReadTag(&str1,&type,&attributes))//parse root element...
        return 0;
    root = str1;//...and store it
    cout << "Root: " << root << "\n";
    while(attributes)//while there are still attributes to be read...
    {
        if(!parse.ReadAtribute(&str1,&str2,&type,&attributes))
            return 0;
        cout << "\t" << str1 << ": " << str2 << "\n";//...output them
    }
    if(type != Open)
        return 0;
    if(!parse.ReadData(&str1))
        return 0;
    cout << "Data: " << str1 << "\n";
    str1 = "";
    //Read elements
    while(true)
    {
        if(!parse.ReadTag(&str1,&type,&attributes))
            return 0;
        if(str1 == root)
        {
            //the only valid tag with the root's name is its closing tag
            if(!attributes && type == Close)
                break;
            else
                return 0;
        }
        else
        {
            cout << "Sub element: " << str1 << "\n";
            while(attributes)
            {
                if(!parse.ReadAtribute(&str1,&str2,&type,&attributes))
                    return 0;
                cout << "\t" << str1 << ": " << str2 << "\n";
            }
            if(!parse.ReadData(&str1))
                return 0;
            cout << "Data : " << str1 << "\n";
        }
    }
    system("PAUSE");
    return 0;
}
As you can see, a lot is left up to the user (that is, the programmer who uses the parser). Can anyone help me think of a better interface, probably one that wraps these three methods?
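To make the request concrete, here is a rough sketch of one possible wrapper, not a finished design. It folds the ReadTag / attribute / ReadData cycle into a single call that fills in one struct per tag. Element, ReadElement, FakeParser and MakeDemoParser are made-up names for illustration only. FakeParser is a scripted stand-in so the sketch runs without a file. The sketch also assumes that ReadData() may be called after any tag and that a failed ReadData() just means "no data". The attribute method is spelled ReadAtribute, as in the declaration above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum TagType { Unknown, Open, Close, StandAlone };  // same enum as above

// Hypothetical result type: everything the three calls report for one tag.
struct Element {
    std::string name;
    TagType type;
    std::vector<std::pair<std::string, std::string> > attributes;
    std::string data;  // text between this tag and the next one
};

// Hypothetical wrapper: runs the tag -> attributes -> data cycle once.
// Parser is any type with the three methods declared above.
template <class Parser>
bool ReadElement(Parser& p, Element* out) {
    assert(out != nullptr);
    *out = Element();  // resets name, attributes, data; type becomes Unknown
    bool more = false;
    if (!p.ReadTag(&out->name, &out->type, &more)) return false;
    while (more) {
        std::string key, value;
        // The last attribute call reports the real tag type.
        if (!p.ReadAtribute(&key, &value, &out->type, &more)) return false;
        out->attributes.push_back(std::make_pair(key, value));
    }
    // Assumption: a failed ReadData() is treated as "no data", not an error.
    if (!p.ReadData(&out->data)) out->data.clear();
    return true;
}

// Scripted stand-in for XMLParser (an assumption, for demonstration only).
struct FakeParser {
    std::vector<Element> script;
    std::size_t tag = 0, attr = 0;

    int ReadTag(std::string* name, TagType* type, bool* attributes) {
        if (tag >= script.size()) return 0;
        const Element& e = script[tag];
        attr = 0;
        *name = e.name;
        *attributes = !e.attributes.empty();
        *type = *attributes ? Unknown : e.type;  // type unknown until attributes are read
        return 1;
    }
    int ReadAtribute(std::string* name, std::string* value, TagType* type, bool* more) {
        const Element& e = script[tag];
        if (attr >= e.attributes.size()) return 0;
        *name = e.attributes[attr].first;
        *value = e.attributes[attr].second;
        ++attr;
        *more = attr < e.attributes.size();
        *type = *more ? Unknown : e.type;
        return 1;
    }
    int ReadData(std::string* data) {
        *data = script[tag].data;
        ++tag;  // move on to the next scripted tag
        return 1;
    }
};

// Script equivalent to: <root id="1">hello<item/></root>
FakeParser MakeDemoParser() {
    FakeParser p;
    p.script = {
        {"root", Open, {{"id", "1"}}, "hello"},
        {"item", StandAlone, {}, ""},
        {"root", Close, {}, ""},
    };
    return p;
}
```

With something along these lines, the calling code could shrink to a loop such as `Element e; while (ReadElement(parser, &e)) { ... }`, and the caller would no longer have to track the attributes flag. Matching opening and closing tags would still be the caller's job in this sketch.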
Thanks.