I am working on a project that will read in a huge file and insert the data into a SQL database. I can read the document, but the XML in it is not well-formed. I get it as an Excel spreadsheet, but it's actually about 6 webpages stuck into one document, and it opens as a webpage in IE. Getting the document formatted differently is not an option.

Here's the problem. The XML I'm getting looks like this

<TD class=xl101>001101550</TD>

and C#'s XmlReader is expecting

<TD class="xl101">001101550</TD>

with the quotes around the style name, and I keep getting an Unexpected Token error. Does anyone know how to tell the reader to ignore the invalid style?

Hi,


I assume that when you say "getting the document formatted differently is not an option" you mean that you don't want to change the original document. So here is my idea:

1. Get the XML document as a 'string' with StreamReader.ReadToEnd.

2. Create a method that inserts the quotes and returns the result as a 'string'.

3. Use the XmlReader against the 'string' returned by the method (2), wrapped in a StringReader so that it is treated as content rather than as a file name, like this: XmlReader xmlR = new XmlTextReader(new StringReader(the method here));
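The three steps above could be sketched roughly as follows. This is only an illustration, not tested against the real file: the names AttributeQuoter, QuoteAttributes and ReadFirstCell are made up for the example, and it assumes that missing quotes are the only problem. A file saved from Excel as a webpage may have other issues (unclosed tags, entities such as &nbsp;) that would still stop the reader.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml;

// Hypothetical helper: the class and method names are assumptions for this sketch.
public static class AttributeQuoter
{
    // Matches a whole tag so that text between tags is never touched.
    private static readonly Regex Tag = new Regex(@"<[^<>]+>");

    // Matches name=value inside a tag. Already-quoted values are matched
    // by the first two alternatives; group 2 is set only for unquoted values.
    private static readonly Regex Attr = new Regex(
        @"(\s[\w:.-]+)\s*=\s*(?:""[^""]*""|'[^']*'|([^\s""'<>]+))");

    // Step 2: return the markup with quotes added around unquoted attribute values.
    public static string QuoteAttributes(string markup)
    {
        return Tag.Replace(markup, tag =>
            Attr.Replace(tag.Value, a =>
                a.Groups[2].Success
                    ? a.Groups[1].Value + "=\"" + a.Groups[2].Value + "\""
                    : a.Value));
    }

    // Step 3: parse the repaired string and return the text of the first TD element.
    public static string ReadFirstCell(string markup)
    {
        string repaired = QuoteAttributes(markup);
        using (XmlReader reader = XmlReader.Create(new StringReader(repaired)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "TD")
                {
                    return reader.ReadElementContentAsString();
                }
            }
        }
        return null;
    }
}
```

For step 1, the input string could come from StreamReader.ReadToEnd or File.ReadAllText. The pattern is deliberately simple: an unquoted value that runs directly into a self-closing slash, or a ">" inside a quoted value, would need extra handling.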

Hope this helps,
Camilo

Actually, what I meant by "getting the file formatted differently" was that what I'm getting is what I have to work with. I can't request that the file be sent differently. Editing the original file so that everything is formatted correctly is what I've been working on.

Really what I wanted to know is whether there was a way to just ignore the formatting altogether. It's not necessary for what I'm doing. The data is never going to be put back into anything that would need the formatting or styles or anything. I just have to pull the data out of the file and store it in a SQL DB.

As it looks right now though I think I'm just going to have to reformat the file before I do anything with it, or just skip using the XML reader stuff and do that a different way.
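One possible shape for the "skip the XML reader" route is to pull the cell text out with a regular expression and ignore the attributes completely. The sketch below is only an assumption about how that might look: CellExtractor and ExtractCells are made-up names, and it assumes the data sits in flat TD cells with no tables nested inside them.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper: the class and method names are assumptions for this sketch.
public static class CellExtractor
{
    // Captures everything between <td ...> and </td>, whatever the attributes look like.
    private static readonly Regex Cell = new Regex(
        @"<td\b[^>]*>(.*?)</td\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Used to strip any tags nested inside a cell (font, span, and so on).
    private static readonly Regex InnerTag = new Regex(@"<[^>]+>");

    // Returns the decoded, trimmed text of every cell in document order.
    public static List<string> ExtractCells(string markup)
    {
        var cells = new List<string>();
        foreach (Match m in Cell.Matches(markup))
        {
            string text = InnerTag.Replace(m.Groups[1].Value, "");
            cells.Add(WebUtility.HtmlDecode(text).Trim());
        }
        return cells;
    }
}
```

The returned list could then be walked in groups of however many columns each row has and written to the database. Because nothing is parsed as XML, the missing quotes never matter, but the approach relies on the cells being simple.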

Bondo, I'm sorry, but the built-in XML readers in .NET will not read XML that is not well-formed. Since you want to load this data into SQL Server, it's all the more important for it to be well-formed.

You can create your own class that tolerates the missing quotes around the attribute values, or you can simply add them using find and replace.

Blast! That's what I was afraid of. I started doing that anyways, but I was trying to keep hope alive for an easier way! Oh well.
