TechEd (Day -1): Pieces of ADO.NET 2.0 not going to make Whidbey

Highlights in this exchange of newsgroup posts...

  • Paging is cut
  • Server cursors cut (SqlResultSet class)
  • Async Open is cut, but not execute
  • SqlDataTable class cut (Good - that was a bad design pattern to propagate)
  • Command sets are cut, but you can still do batch DataAdapter updates.

More to come, stay tuned.

NUnit 2.2 beta released

Things I like:

  • Assert.AreEqual now supports comparing arrays of the same length, type and values.
  • You can now put a Category attribute on your fixtures AND methods, and then use that as a filter when you go to run tests. Thoughts on categories? Functional tests, performance tests.
  • On a similar note, there is an Explicit attribute that will keep a fixture or method from running unless it is explicitly selected by the user. You can now put check boxes on the tree to select multiple fixtures/methods.
  • They fixed a problem where exceptions raised on background threads weren't showing up as failures in NUnit. Seems they've done some refactoring of how things are loaded in the AppDomain. I'm hopeful that this fixes some issues I've seen where my own dynamic loading confuses Fusion...but only during the NUnit tests, not in production execution.
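A quick sketch of what the first three features might look like in a fixture (the fixture and method names here are made up, and this assumes the NUnit 2.2 beta is referenced):

```csharp
using NUnit.Framework;

[TestFixture]
[Category("Functional Tests")] // categories can be used as a run filter
public class WidgetFixture
{
    [Test]
    public void ArraysCompareByValue()
    {
        // New in 2.2: Assert.AreEqual compares arrays of the same
        // length, type and values, element by element.
        int[] expected = new int[] { 1, 2, 3 };
        int[] actual = new int[] { 1, 2, 3 };
        Assert.AreEqual(expected, actual);
    }

    [Test]
    [Category("Performance Tests")]
    [Explicit] // won't run unless explicitly selected by the user
    public void SlowScenario()
    {
        // long-running work would go here
    }
}
```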

Looks exciting.

Delegation through NUnit Testing and TodoFixtures

Usually I'm the guy all the other developers are waiting on to create some reusable framework widget or other. I usually have 10,000 things on my plate, so when somebody asks me to do something or reports a bug with some of my code, I need to find a good way to delegate.

But if you are the subject matter expert (SME), it's tough to delegate the task of making the fix or adding the feature. Flip that on its head, though: when you find yourself in this situation, by definition you are NOT the SME of the “feature request” or “bug“. Those are external to the actual implementation, which is probably what you are really the SME for. The SME for the request or bug is, of course, the finder of the bug or the requestor of the feature. So in the spirit of getting the right person for the job (and delegating a bit at the same time), get the requestor to create the NUnit test that demonstrates the bug or explains (with code - better than English can ever hope to) how the requestor expects the feature to be consumed.

Case in point:

Random Developer A: Barry, there is a bug in the foobar you created. Can you fix it? I know you are busy, want me to give it a go? Do you think it's something I could fix quickly?

Barry: That's a tricky bit, but I think I know exactly what I need to do to fix it. It will take me 10 minutes - but I have 3 days of 10-minute items in front of yours. Why don't you create an NUnit test for me that demonstrates the bug, and I'll fix it. Then it will only take me 2 minutes.

I also find NUnit tests a great way for people to give me todo items.

Random Developer B: Hey, I need a method added to FoobarHelper that will turn an apple into an orange, unless you pass it a granny smith, in which case it should turn it into a pickle.

Barry: Sounds reasonable. I can do that - but just to make sure I got all of that spec correct, would you mind creating an NUnit test that demonstrates that functionality you require? Thanks.
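To make that concrete, here's roughly what Developer B might hand back - the fixture is the spec. FoobarHelper.Transmute is a hypothetical method (it doesn't exist yet; that's the point), and the stub implementation is included only so the sketch compiles:

```csharp
using NUnit.Framework;

// Hypothetical stub with one possible implementation of the request;
// normally the requestor hands over only the tests below.
public class FoobarHelper
{
    public static string Transmute(string fruit)
    {
        if (fruit == "granny smith")
            return "pickle";
        return "orange";
    }
}

[TestFixture]
public class FoobarHelperTodoFixture
{
    [Test]
    public void AppleBecomesOrange()
    {
        Assert.AreEqual("orange", FoobarHelper.Transmute("apple"));
    }

    [Test]
    public void GrannySmithBecomesPickle()
    {
        Assert.AreEqual("pickle", FoobarHelper.Transmute("granny smith"));
    }
}
```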

I do have to admit though that this requires a certain amount of charisma. On more than one occasion this technique has been met with some friction, unusual gestures, and mumbling. :)

New Smart Client Reference Application - IssueVision

This is a new smart client reference application from Microsoft. Actually it was created by Vertigo for Microsoft - which is where Susan Warren (former Queen of ASP.NET) now works. This is not a rewrite of TaskVision, which is a common question. It was built to show off some advanced topics for Smart Client apps in conjunction with the recent DevDays events that have been going on in the U.S. but unfortunately haven't made it up to Canada, due to some overloaded efforts going into VS Live.

You can download this from Microsoft although it's not the easiest thing to find.

Some of the interesting highlights:

  • Focus on security, including some wrapped-up DPAPI classes.
  • Application Deployment and Updating

This app wasn't built with the recently released offline application block since the timing wasn't right - but nevertheless, a good fresh reference app worth looking at.

Building Maintainable Applications with Logging and Instrumentation

I'm doing this MSDN webcast in a few weeks:

10/05/2004 1:00 PM - 10/05/2004 2:00 PM (Eastern Time)

In this session we'll cover the world of logging and instrumenting your application. We'll discuss the various .NET Framework components as well as higher-level services provided by the Exception Management Application Block, the Enterprise Instrumentation Framework and the Logging Block. We'll look at the various issues with persisting information in file logs, the event log, and WMI performance counters, and compare alternative technologies such as log4net. We'll also discuss best practices for logging and instrumenting your application and provide some considerations, drawn from experiences in the field, for when and where it makes good sense to instrument your application.

Update: The slides, samples and livemeeting recording links can all be found here.

Database Access Layers and Testing

I've been doing a lot of testing lately. A lot. I'm building a database-agnostic data access layer. It has to be as performant as using typed providers, and even support the optional use of a typed provider. For example, we want to allow developers to build database-agnostic data access components (dacs). Sometimes, however, there is just too much difference between databases, and writing standard SQL requires too much of a sacrifice. So in those cases, we want to allow a developer to write a high-performance dac per database, giving them ultimate control to tweak the data access for one database or another - and use a factory at runtime to instantiate the correct one. Of course they all have to implement the same interface so that the business components can talk to an abstract dac. So normally developers can talk to their database through the agnostic DbHelper, but when they want, they can drop down to SqlHelper or OracleHelper.
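A minimal sketch of that shape - an interface, per-database dacs, and a runtime factory. All the names here (ICustomerDac, DacFactory, the provider strings) are invented for illustration; the real block would read the provider from configuration:

```csharp
using System.Data;

// Business components only ever see this interface.
public interface ICustomerDac
{
    DataSet GetCustomers();
}

// Database-agnostic version, written against DbHelper-style helpers.
public class AgnosticCustomerDac : ICustomerDac
{
    public DataSet GetCustomers() { /* standard SQL via DbHelper */ return new DataSet(); }
}

// Hand-tuned Oracle version, for when standard SQL costs too much.
public class OracleCustomerDac : ICustomerDac
{
    public DataSet GetCustomers() { /* Oracle-specific SQL via OracleHelper */ return new DataSet(); }
}

public class DacFactory
{
    // The provider name would come out of configuration in a real app.
    public static ICustomerDac CreateCustomerDac(string provider)
    {
        if (provider == "Oracle")
            return new OracleCustomerDac();
        return new AgnosticCustomerDac();
    }
}
```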

We also want to support a rich design-time environment. Creating DataAdapters and SqlCommands in C# isn't fun. Good developers botch these up too easily - not creating parameters correctly, or worse, not using parameters at all and opening themselves up to SQL injection. The typed command and data adapter classes allow for rich design-time painting of SQL and generation of parameters when used on a subclass of System.ComponentModel.Component. Of course, developers aren't stuck with design time - they are free to drop into the code when they want to.

In building this data access layer, I've been doing deep research and testing on a lot of different data access blocks - including ones I've authored in the past. I've taken ideas from examples like PetShop and ShadowFax, and of course looked at the PAG group's Data Access Block, including the latest revision. I'm very happy with where the code stands today.

One of the features missing from all of these is a streaming data access component. In some cases we have a need to start writing to the response stream of a web service before the database access is complete. So I've tackled this problem using an event model which works nicely. You can put “delegate“ or “callback“ labels on that technique too and they'll stick.
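The event model might look something like this sketch (StreamingDac and RowReadHandler are hypothetical names): the component raises an event per row as the reader advances, so the subscriber can start writing to the web service's response stream before the query completes.

```csharp
using System.Data;

public delegate void RowReadHandler(IDataRecord record);

public class StreamingDac
{
    // Raised once per row, while the DataReader is still streaming results.
    public event RowReadHandler RowRead;

    public void ExecuteStreaming(IDbConnection connection, string sql)
    {
        IDbCommand command = connection.CreateCommand();
        command.CommandText = sql;
        using (IDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                if (RowRead != null)
                    RowRead(reader); // subscriber writes to its output stream here
            }
        }
    }
}
```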

One of the interesting “tricks” was to use the Sql provider as the agnostic provider during design time. We wanted to allow developers to use design-time support, so I went down the path of creating my own agnostic implementations of IDataAdapter, IDbConnection, IDbCommand, etc. The idea was that at runtime, we'd marshal these classes into the type-specific provider objects based on the configuration. I was about 4 hours into this exercise when I realized I was pretty much rewriting SqlCommand, SqlDataAdapter, SqlConnection, etc. What would stop me from using the Sql provider objects as my agnostic objects? At runtime, if my provider is configured for Oracle, I use the OracleHelper's “CreateTyped“ commands to marshal the various objects into Oracle objects, of course talking to them through the interfaces. As a shortcut, if my provider is configured for Sql at runtime, I just use the objects as they are.
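A rough illustration of the marshaling idea, with made-up names (the post's actual “CreateTyped“ methods are internal to the block): a command authored at design time against the Sql provider is copied into an Oracle-typed command through the interfaces.

```csharp
using System.Data;
using System.Data.OracleClient; // the .NET 1.x Oracle provider

public class OracleMarshaler
{
    // Copies a command built against the Sql provider (acting as the
    // "agnostic" provider) into an Oracle-typed command at runtime.
    public static IDbCommand CreateTypedCommand(IDbCommand source)
    {
        OracleCommand target = new OracleCommand();
        target.CommandText = source.CommandText;
        target.CommandType = source.CommandType;
        foreach (IDataParameter p in source.Parameters)
        {
            target.Parameters.Add(new OracleParameter(p.ParameterName, p.Value));
        }
        return target;
    }
}
```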

The neat fallout feature from this is that you can write an entire application against SQL Server, using SqlHelper if you like, and if you are thrown a curve ball to use Oracle, the only managed code change is to switch your reference from SqlHelper to DbHelper - everything else just works. Mileage will of course vary depending on how many times you used SQL Server-isms like “Top 1“. Just as importantly, developers using this data block learn only one style of data access, and the same technique applies to all other databases.

One of the sad things is how this thing evolved. In the beginning there were bits of no less than 3 data access blocks in this class library. I spun my wheels quite a bit, going back and forth and doing a lot of prototyping under some heat from developers who needed to start writing dacs. Because I was starting from chunks of existing code, somehow NUnit tests didn't magically happen. So I've spent the past few days working toward a goal of 90% coverage of the code by NUnit tests. It's tough starting from scratch. Not only have I been finding and fixing lots of bugs, you won't be surprised that my testing has inspired the odd design change. I'm really glad I got the time to put the effort into the NUnit tests, because sooner or later those design changes would have been requested by developers using this block - and by that time the code would have been too brittle to change. Certainly some of my efforts would have been greatly reduced had I made these design changes sooner in the process. I'm not new to the fact that catching bugs earlier makes them cheaper to fix - but there's nothing like getting it pounded home. I'll be pretty hesitant to take on another exercise of editing existing code without first slapping on a bunch of NUnit tests that codify my expectations of what the code does.


Downtown Metro Toronto .NET UG Inaugural Meeting!

Finally, a downtown user group. First week of every month - and the first one is April 1st, at 200 Bloor St. East (Manulife) at Jarvis. This is also the first date on the MSDN Canada .NET User Group Tour across Canada. There is also a raffle for an XBox.

The sad news is that this meeting is going to get cut off at the first 200 people - so register soon by sending an email to

speaker: Adam Gallant
location: Manulife Financial Building 1st Floor 200 Bloor Street East Toronto

Better Web Development

In this session, we will focus on some fundamentals in web development, including a special drill-down on security and caching. We will cover an overview of .NET security, specifically the important aspects of ASP.NET security, along with best practices. We will also cover, at a high level, the caching mechanisms used by ASP.NET.

Debunking the DataSet Myth

Many people think that DataSets are stored internally as XML. What most people need to know is that DataSets are serialized as XML (even when done in binary), but that doesn't mean they are stored as XML internally. Although we have no easy way of knowing for sure, it's easy to take a look at the memory footprint of DataSets compared to XmlDocuments.

If DataSets were stored internally as XML, then in theory they should be larger than an XmlDocument, since BeginLoadData/EndLoadData implies there are internal indexes maintained along with the data.

It's not easy to get the size of an object in memory, but here is my attempt.

long bytecount = System.GC.GetTotalMemory(true);
DataSet1 ds = new DataSet1();
ds.EnforceConstraints = false;
foreach (DataTable dt in ds.Tables)
    dt.BeginLoadData(); // turn off index maintenance during the load
ds.ReadXml("orders.xml"); // hypothetical file name: orders & order details from Northwind
bytecount = System.GC.GetTotalMemory(true) - bytecount;
MessageBox.Show("Loaded - Waiting. Total K = " + (bytecount/1024).ToString());

long bytecount = System.GC.GetTotalMemory(true);
System.Xml.XmlDocument xmlDoc = new System.Xml.XmlDocument();
xmlDoc.Load("orders.xml"); // hypothetical file name: the same XML file as above
bytecount = System.GC.GetTotalMemory(true) - bytecount;
MessageBox.Show("Loaded - Waiting. Total K = " + (bytecount/1024).ToString());

I tried these examples with two different XML files - both storing orders & order details out of the Northwind database. The first example was the entire result set of both tables. The DataSet memory size was approximately 607K. The XmlDocument was 1894K, over 3 times larger. In a second test, I used only 1 record in both the orders and order details tables. The DataSet in this case took 24K and the XmlDocument took 26K, a small difference. You will notice that in my DataSet example I have turned off index maintenance by using BeginLoadData. Taking this code out resulted in a DataSet of 669K, an increase of approximately 10%. An interesting note is that if you put in both a BeginLoadData and an EndLoadData, the net size of the DataSet is only 661K. This would imply that leaving index maintenance on during loads is inefficient in memory usage.

The speed of loading from XML is a different story. Because the XmlDocument delays (I'm assuming) its parsing, the time to load the XmlDocument from an XML file is 1/3rd of the time to load the DataSet from the same XML. I would be careful about being too concerned with this: loading a DataSet from a relational source like a DataAdapter involves no XML parsing and is much faster.

If you load up Anakrino and take a look at how the DataSet stores its data, each DataTable has a collection of columns, and each column is in fact a strongly typed storage array. Each type of storage array has an appropriate private member array of the underlying value type (integer, string, etc.). The storage array also maintains a bit array that is used to keep track of which rows for that column are null. The bit array is always checked first, and if the row is flagged it returns null (or the default value) without touching the typed storage array. That's pretty tight.
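Here's a little mock-up of that storage pattern - this is not the actual BCL code, just an illustration of the idea of a typed value array paired with a null-tracking bit array:

```csharp
using System;
using System.Collections;

// Illustrative only: a column of Int32 values plus a BitArray
// marking which rows are null, checked before the typed array.
public class Int32Storage
{
    private int[] values;
    private BitArray nulls; // true = this row is null

    public Int32Storage(int capacity)
    {
        values = new int[capacity];
        nulls = new BitArray(capacity, true); // all rows start out null
    }

    public void Set(int row, int value)
    {
        values[row] = value;
        nulls[row] = false;
    }

    public object Get(int row)
    {
        // check the bit array first, like the DataSet's storage arrays do
        return nulls[row] ? (object)DBNull.Value : values[row];
    }
}
```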

The GAC Exposed

So you want to see what's in the GAC. Of course, if you go to c:\windows\assembly in Explorer, you see a customized shell extension of the global assembly cache. If you want to see the actual files underneath, in the past I've always gone to the command prompt and dir'd myself into long-file-name oblivion.

To get rid of that shell extension, just add a new DisableCacheViewer registry entry (type DWORD) underneath the key HKLM\Software\Microsoft\Fusion, set the value to 1, and presto - it's gone. C:\windows\assembly has never looked so good. Of course, don't do this on your end users' machines; this is really just a developer aid to help figure out what's going on and what's really in the GAC.
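If you'd rather not poke at regedit by hand, the same change expressed as a .reg file you can merge (set the value back to 0 to restore the shell extension) - the key and value name come straight from the step above:

```reg
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\Software\Microsoft\Fusion]
"DisableCacheViewer"=dword:00000001
```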

If that doesn't help you debug your assembly binding problems - don't forget about FUSLOGVW.exe. But that's another blog entry for another day.

Datasets vs. Custom Entities

So you want to build your own entity objects? Maybe you are even purchasing or authoring a code-gen tool to do it for you. I like to use DataSets when possible, and people ask why I like them so much. To be fair, I'll write a list of reasons to not use DataSets and create your own entities - but for now, this post is all about the pros of DataSets. I've been on a two-week sales pitch for DataSets with a client, so let me summarize.

  • They are very bindable.
    This is less of an issue for Web Forms, which don't support two-way databinding. But for Win Forms, DataSets are a no-brainer. Before you claim that custom classes are just as bindable, go try implementing IListSource, IList, IBindingList and IEditableObject. Yes, you can make your own custom class just as bindable - if you want to work at it.
  • Easy persistence.
    This is a huge one. Firstly, the DataAdapter is almost as important as the DataSet itself. You have full control over the Select, Insert, Update and Delete sql and can use procs if you like. There are flavours for each database. There is a mappings collection that can isolate you from changes in names in your database. But that's not all that is required for persistence. What about optimistic concurrency? The DataSet takes care of remembering the original values of columns so you can use that information in your where clause to look for the record in the same state as when you retrieved it. But wait, there's more. Keeping track of the Row State so you know whether you have to issue deletes, inserts, or updates against that data. These are all things that you'd likely have to do in your own custom class.
  • They are sortable.
    The DataView makes sorting DataTables very easy.
  • They are filterable.
    DataView to the rescue here as well. In addition to filtering on column value conditions - you can also filter on row states.
  • Strongly Typed DataSets defined by XSDs.
    Your own custom classes would probably be strongly typed too...but would they be code-generated out of an XSD file? I've seen some strongly typed collection generators that use an XML file, but that's not really the right type of document to define schema with.
  • Excellent XML integration.
    DataSets provide built-in XML serialization with the ReadXml and WriteXml methods. Not surprisingly, the XML conforms to the schema defined by the XSD file (if we are talking about a strongly typed DataSet). You can also stipulate whether columns should be attributes or elements and whether related tables should be nested or not. This all becomes really nice when you start integrating with 3rd party (or 1st party) tools such as BizTalk or InfoPath. And finally, you can of course return a DataSet from a Web Service, and the data is serialized as XML automatically.
  • Computed Columns
    You can add your own columns to a DataTable that are computed based on other values. This can even be a lookup on another DataTable or an aggregate of a child table.
  • Relations
    Speaking of child tables: yes, you can have complex DataSets with multiple tables in a master-detail hierarchy. This is pretty helpful in a number of ways. Both programmatically and visually through binding, you can navigate the relationship from a single record in the master table to the collection of child rows related to that parent. You can also enforce the referential integrity between the two without having to run to the database. You can also insert rows into the child table based on the context of the parent record, so that the primary key is migrated down into the foreign key columns of the child automatically.
  • Data Validation
    DataSets help with this, although it's not typically thought of as an important feature. It is, though. Simple validations can be done by the DataSet itself. Some simple checks include: data type, not null, max length, referential integrity, uniqueness. The DataSet also provides an event model for column changing and row changing (adding & deleting), so you can trap these events and prevent bad data from getting into the DataSet programmatically. Finally, with SetRowError and SetColumnError you can mark elements in the DataSet with an error condition that can be queried or shown through binding with the ErrorProvider. You can do this with your own custom entities by implementing the IDataErrorInfo interface.
  • AutoIncrementing values
    Useful for columns mapped to identity columns or otherwise sequential values.
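To pull a few of these together, here's a tiny sketch (table and column names invented) showing row-state tracking with GetChanges, sorting/filtering with a DataView, and an autoincrementing column:

```csharp
using System;
using System.Data;

public class DataSetDemo
{
    public static void Main()
    {
        DataTable orders = new DataTable("Orders");
        DataColumn id = orders.Columns.Add("OrderID", typeof(int));
        id.AutoIncrement = true;      // autoincrementing value
        id.AutoIncrementSeed = 1;
        orders.Columns.Add("Customer", typeof(string));

        orders.Rows.Add(new object[] { null, "BONAP" });
        orders.Rows.Add(new object[] { null, "ALFKI" });
        orders.AcceptChanges();       // pretend this data came from the database

        orders.Rows[0]["Customer"] = "ANATR"; // row state is now Modified

        // Only the changed rows need to flow back through a DataAdapter.
        DataTable changes = orders.GetChanges(DataRowState.Modified);
        Console.WriteLine(changes.Rows.Count);   // 1

        // Sorting and filtering without touching the database.
        DataView view = new DataView(orders, "Customer <> 'ANATR'",
                                     "Customer ASC", DataViewRowState.CurrentRows);
        Console.WriteLine(view.Count);           // 1
    }
}
```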

This is not an exhaustive list but I'm already exhausted. In a future post, I'll make a case for custom entities and not DataSets, but I can tell you right now that it will be a smaller list.