So I downloaded the Adventureworks 2012 SQL database to browse hoping to find something interesting and just nearly fell asleep looking at example data when I came across something new or that I at least never noticed before...a field with a data type of "xml(CONTENT Person.AdditionalContactInfoSchemaCollection)" and when viewing the example data I see things like:

<AdditionalContactInfo xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactInfo" xmlns:crm="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactRecord" xmlns:act="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactTypes">
  <act:telephoneNumber>
    <act:number>425-555-1112</act:number>
    <act:SpecialInstructions>Call only after 5:00 p.m.</act:SpecialInstructions>
  </act:telephoneNumber>Note that the customer has a secondary home address.<act:homePostalAddress><act:Street>123 Oak</act:Street><act:City>Seattle</act:City><act:StateProvince>WA</act:StateProvince><act:PostalCode>98001</act:PostalCode><act:CountryRegion>USA</act:CountryRegion><act:SpecialInstructions>If correspondence to the primary address fails, try this one.</act:SpecialInstructions></act:homePostalAddress>Customer provided additional email address.<act:eMail><act:eMailAddress>customer1@sample.com</act:eMailAddress><act:SpecialInstructions>For urgent issues, do not send e-mail. Instead use this emergency contact phone<act:telephoneNumber><act:number>425-555-1111</act:number></act:telephoneNumber>.</act:SpecialInstructions></act:eMail><crm:ContactRecord date="2001-06-02Z">This customer is interested in purchasing high-end bicycles for his family. The customer contacted Michael in sales.</crm:ContactRecord></AdditionalContactInfo>

And it got me thinking...is Microsoft advocating using xml tags in database fields instead of setting up 1-n relationships?

The example above could be split into a 1-n relationship on all of the xml tags, phone numbers, special instructions, address information, and other non-specific comments.

I hear the nay-sayers and xml lovers and abusers now saying that splitting such information up among many different tables creates a tangled web of relationships and dependancies which can be a nightmare to follow; and I will concede to that to a point, but coming from a database with probably over 50 relationships already, why opt for a custom data type that:

  1. bloats the field size to accomodate xml markup,
  2. requires additional parsing by the application receiving the data,
  3. clogs network wires with extra data that when the application is finished parsing may not even be necessary,
  4. makes the data nearly impossible (unless I am mistaken) to filter in select statements and,
  5. requires the dba and future dbas to understand the data type that was defined.

Am I missing something here or is this clearly a terrible use of xml markup (as if there even is a good use for xml markup...)? I would think tables named TelephoneNumbers, SpecialInstructions, and Addresses with a 1-n relationship to the Person table would be MUCH easier to understand and follow than an obscure, poorly named (AdditionalContactInfo) field is.

Does anyone know why a dba would do something like this?

I'm really curious to see if anyone would do this in reality and if so, why?

According to Microsoft:

Following are some of the reasons to use native XML features in SQL Server instead of managing your XML data in the file system:

You want to share, query, and modify your XML data in an efficient and transacted way. Fine-grained data access is important to your application. For example, you may want to extract some of the sections within an XML document, or you may want to insert a new section without replacing your whole document.

You have relational data and XML data and you want interoperability between both relational and XML data within your application.

You need language support for query and data modification for cross-domain applications.

You want the server to guarantee that the data is well formed and also optionally validate your data according to XML schemas.

You want indexing of XML data for efficient query processing and good scalability, and the use of a first-rate query optimizer.

You want SOAP, ADO.NET, and OLE DB access to XML data.

You want to use administrative functionality of the database server for managing your XML data. For example, this would be backup, recovery, and replication.

If none of these conditions is satisfied, it may be better to store your data as a non-XML, large object type, such as [n]varchar(max) or varbinary(max).

Interesting.

But I'm still not convinced, maybe it's because I am on the side of "XML is the child of evil programmers and data engineers bent on world domination and must be destroyed."

I just find it rather amusing that anyone would have non-relational (XML) data in a relational (SQL) database...obviously in this example the data in the XML field IS related to the specific row, and not only that, since some XML tags appear more than once, is a perfect candidate for relational data and 1-n tables...am I wrong in thinking that?

You are not wrong in thinking that. The only advantage I see with the XML is that you can change it rapidly without requiring new database tables, procedures, etc. and it won't break old (properly written) code that will just ignore the new XML.

I suppose so. It still irks me to see xml in a SQL table...in my head, it just doesn't belong! But since I don't use xml, I guess it's harder for me to see through that.

Thanks for trying to explain that...while it still makes no sense to me, I guess I'm willing to accept that it does to someone else. :)

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.