Moving DataSets across Web Service Boundaries

A situation last week had me looking deep into the bowels of how DataSets are marshaled across web service methods.  The situation is as follows.

For programming ease, the typed DataSet that was automatically generated was modified to include a custom object as one of the column types.  If you've ever taken a close look at the code that gets generated when a DataSet is created from an XSD diagram, you'll realize that this is not a difficult thing to accomplish.  And, for the most part, there is little reason to modify the generated code. But that being said, there are times when doing so can make the life of the DataSet user a little easier.  And this isn't really the point of the blog, only the motivation for digging into DataSet marshaling.

The modified DataSet was being returned to the client by a web service method.  When the hacked column on the returned object was accessed on the client side, a StrongTypingException was thrown.  Another look at the generated DataSet code showed that the cause of this exception was a casting error in the column's property.  Further examination showed that the name of the class associated with the column was being returned.  As a string.  Naturally, when the literal “This.Namespace.Class” is cast to an object of type This.Namespace.Class, a casting exception is the result.

This behavior raised the question “why wasn't the object being XML serialized with the rest of the DataSet?”  After all, though I haven't mentioned it yet, This.Namespace.Class is completely serializable.  In particular, why was the name of the class being returned?  That didn't make any sense.

At this point, I broke out .NET Reflector and took a look at the details of the DataSet.  First of all, if a DataSet is returned across a web service boundary, it is not XML serialized in the same manner as other objects.  Instead of converting the contents of the DataSet to XML, a diffgram of the DataSet is generated.  More accurately, the WriteXml method is called with an XmlWriteMode of DiffGram.  Within the DataSet class, there is a method called GenerateDiffgram.  This method walks through each of the DataTables in the DataSet and, in turn, through the DataRows of each table.  When it gets to an individual column, it loads up a DataStorage object.  More accurately, since DataStorage is an abstract class, it instantiates a DataStorage-derived class, using the column type to pick the appropriate class.  A veritable DataStorage factory.  When the column type is one of the non-intrinsic types, the ObjectStorage class is used.  On the ObjectStorage class, the ObjectToXml method is called.  This is where the unexpected happens.  At least, it was unexpected for me.  The ObjectToXml method does not XML serialize the object!
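The storage selection described above can be pictured with a small model.  This is only a sketch in Python, not .NET code: the class names other than DataStorage and ObjectStorage, the lookup table, and the create_storage function are all hypothetical stand-ins for whatever the framework does internally.

```python
from abc import ABC, abstractmethod


class DataStorage(ABC):
    """Model of the abstract storage class; one subclass per column type."""

    @abstractmethod
    def object_to_xml(self, value):
        ...


class Int32Storage(DataStorage):      # illustrative name for an intrinsic type
    def object_to_xml(self, value):
        return str(value)


class StringStorage(DataStorage):     # illustrative name for an intrinsic type
    def object_to_xml(self, value):
        return value


class ObjectStorage(DataStorage):     # fallback for every non-intrinsic type
    def object_to_xml(self, value):
        return str(value)


# Hypothetical lookup table: intrinsic column types get a dedicated storage.
_STORAGE_BY_TYPE = {int: Int32Storage, str: StringStorage}


def create_storage(column_type):
    """Pick a storage class from the column type, defaulting to ObjectStorage."""
    return _STORAGE_BY_TYPE.get(column_type, ObjectStorage)()


class Money:                          # a custom (non-intrinsic) column type
    pass
```

In this model, a column of a custom type such as Money always lands in ObjectStorage, which is the case that matters for the rest of this post.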

In the case of the ObjectStorage class, what actually happens is that a check is performed to see if the object is a byte array.  If it is, a base64 encoding of the array is returned.  Otherwise, the ToString() method on the object is called.  When it comes to re-hydration on the client side, a similar process occurs.  The difference is that the XmlToObject method in the ObjectStorage class instantiates the desired object by passing the string representation of the object (as emitted by the ObjectToXml method) into the constructor.
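To make that round trip concrete, here is a rough model of the two steps.  It is written in Python purely as an illustration of the behavior described above; the function names mirror ObjectToXml and XmlToObject, but the bodies are assumptions, and the Money class is a hypothetical custom column type.

```python
import base64


def object_to_xml(value):
    """Model of ObjectToXml: base64 for byte arrays, ToString() otherwise."""
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(bytes(value)).decode("ascii")
    return str(value)


def xml_to_object(text, column_type):
    """Model of XmlToObject: hand the string to the column type's constructor."""
    if column_type is bytes:
        return base64.b64decode(text)
    return column_type(text)


class Money:
    """Hypothetical custom column type with no special string support."""

    def __init__(self, amount):
        self.amount = amount

    def __str__(self):
        # Stands in for .NET's default ToString(), which returns the type name.
        return "This.Namespace.Money"


# A byte array survives the trip, because it gets the base64 treatment.
assert xml_to_object(object_to_xml(b"\x01\x02"), bytes) == b"\x01\x02"

# A custom object does not: only the type name goes over the wire.
wire_text = object_to_xml(Money(5))
```

In the model, wire_text holds only the type name, so the amount of 5 never leaves the server.  That matches the symptom on the client: a string where an object was expected.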

So what is the point of all this?  First, it explains why the name of the class was appearing in the generated XML.  Unless overridden, the output of ToString() for an arbitrary class is the name of the class.  It also explains why no object was being created on the client side, as the class I was working with didn't have the appropriate constructor.  My solution, which I freely admit is a bit of a hack, is to give the diffgram generation process what it's looking for.  I overrode ToString() to return an XML document containing the values of the class, a poor man's version of the WriteXml method that is part of the IXmlSerializable interface.  I also created a constructor that took a string as a parameter and repopulated the properties of the class (the ReadXml portion of the process).  Problem solved.  But still, I have to wonder why ToString was used and not the IXmlSerializable interface methods in the first place.  Here's hoping someone more knowledgeable than me will provide some insight.
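For anyone who wants to see the shape of that workaround, here is a minimal sketch.  It is a Python model rather than the actual .NET class: the Money type, its amount and currency fields, and the element names are all made up for illustration.  The string form plays the role of the overridden ToString(), and the string-accepting constructor plays the role of the extra constructor.

```python
import xml.etree.ElementTree as ET


class Money:
    """Hypothetical column type that round-trips through its string form."""

    def __init__(self, text=None, *, amount=0, currency="USD"):
        self.amount = amount
        self.currency = currency
        if text is not None:
            # The "ReadXml" half: rebuild the fields from the XML string.
            root = ET.fromstring(text)
            self.amount = int(root.findtext("Amount"))
            self.currency = root.findtext("Currency")

    def __str__(self):
        # The "WriteXml" half: emit the field values as a small XML document.
        root = ET.Element("Money")
        ET.SubElement(root, "Amount").text = str(self.amount)
        ET.SubElement(root, "Currency").text = self.currency
        return ET.tostring(root, encoding="unicode")


# Server side: the storage layer only ever calls the string conversion.
wire_text = str(Money(amount=5, currency="CAD"))

# Client side: the storage layer passes that string to the constructor.
restored = Money(wire_text)
```

The trade-off is that the string form of the class is no longer human-friendly, which is part of why this feels like a hack: anything else that relies on the string conversion for display now gets XML instead.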