DataSet Serialization: Smaller & Faster

DataSets serialize naturally to XML quite well and you have lots of control over that. Typed DataSets have the XSD with some properties that control that (and of course you can do it programmatically too). But one of the common problems with remoting DataSets is that the default binary serialization is actually just the XML Serialization ASCII. Crude. Some people have even used this fact to extrapolate that the DataSet is internally XML - which isn't true.

This is improved in Whidbey. But until then, what's a Serializer to do?

Lots of people have done some great work to do customized binary serialization of datasets. To mention a few:

  • Microsoft Knowledge Base Article 829740 by Ravinder Vuppula where he demonstrates his DataSetSurrogate class which wraps up a DataSet and converts the internal items into array lists which can then be Binary Serialized by the framework by default. It also contains a ConvertToDataSet so that you can reverse the process of a De-serialized surrogate back into a dataset.
  • Dino Esposito has demonstrates a GhostSerializer in his MSDN Article for a DataTable which does a similar ArrayList conversion thing.
  • Richard Lowe's Fast Binary Serialization
  • Dominic Cooney's Pickle
  • Angelo Scotto's CompactFormatter goes a step further with a serializable technique that doesn't rely on the Binary Formatter so it works on the Compact Framework which is event more Compact than the BinaryFormatter
  • Peter Bromberg builds on top of the CompactFormatter to support compression using Mike Krueger's ICSharpCode SharpZiplib in-memory zip libraries