Upcoming UG Meetings

Kate Gregory is starting a new UG in the east end in Oshawa. They are meeting Apr 20. The topic is an overview presentation of .NET. gtaeast.torontoug.net to register.

The regular meeting of the Canadian Technology Triangle meets next Tuesday Apr 20th as well. This special meeting is part of the MSDN User Group tour and the topic is The .NET Compact Framework. A special note that this event is not in its usual location, but rather at the Peter Benninger Theatre.  cttdnug.org to register.

Things I don't like about NUnit

Firstly - I love NUnit - and nobody has done more to increase the quality of .NET Applications than the contributors to this project - nobody.

But I know that next generation Unit testing framework authors are listening so I might as well state the things I'd like to see:

  • I'd like to be able to run a single test from the command line. Not just a single fixture, but a specific test.
  • I'd like my tests to be parameterizable. I'd like from the Command Line to run a test and provide the specific values.
  • Wouldn't it be cool to create a batch file of scenarios? What about doing this in XML? Duh - no brainer.
  • A lot of my tests in this case would simply wrap up and call business objects - so why not be able to have a virtual test in the xml file? That is just call a class/method directly with XML.I'm not saying this is the be all / end all - really the only kinds of assertions I could do realistically would be to test for certain exceptions or no exceptions. Return values? Possibly - but really - some times our methods accept and return things other than value types - but this might be a nice thing to have regardless.

With these things in place, one could conceivably:

  • tie a class modelling tool into test scenarios. Hmm, I need a class, with a method and if I pass this data, I should get these results.
  • If an end user reports a bug in my defect tracking system, I should be able to create an NUnit test that exposes this bug. Then my support people can come back from time to time, have the defect tracking system run the test to see if it's been fixed in a given patch/release/build, etc. and update the status on the defect.

 

 

Database Access Layers and Testing

I've been doing a lot of testing lately. A lot. I'm building a database agnostic data access layer. It has to be as performant as using typed providers, and even support the optional use of a typed provider. For example, we want to allow developers to build database agnostic data access components (dac). Sometimes however, there is just too much difference and to write standard Sql requires too much of a sacrifice. So in these cases, we want to allow a developer to write a high performance dac for each of the dac's, giving them ultimate control to tweak the data access for one database or another - and use a factory at runtime to instantiate the correct one. Of course they have to implement the same interface so that the business components can talk to an abstract dac. So normally developers can talk to their database through the agnostic DbHelper, but when they want, they can drop down to SqlHelper or OracleHelper.

We also want to support a rich design time environment. Creating DataAdapters and SqlCommands in C# isn't fun. Good developers botch these up too easily - not creating parameters right - or worst not using parameters at all and opening themselves up to sql injection. The typed DbCommand and DbDataAdapter's allow for a rich design time painting of sql and generation of parameters when used on a sub-class of System.Component. Of course, developers aren't stuck with design time - they are free to drop into the code when they want to.

In building this data access layer, I've being doing deep research and testing on a lot of different data access blocks - including ones I've authored in the past. I've taken stuff from examples like PetShop and ShadowFax and of course looked at the PAG groups Data Access Block, including the latest revision. I'm very happy with where the code stands today.

One of the features missing from all of these is a streaming data access component. In some cases we have a need to start writing to the response stream of a web service before the database access is complete. So I've tackled this problem using an event model which works nicely. You can put “delegate“ or “callback“ labels on that technique too and they'll stick.

One of the interesting “tricks” was to use the SqlProvider as an agnostic provider during design time. We wanted to allow developers to use design time support and so I went down the path of creating my own agnostic implementors of IDataAdapter, IDbConnection, IDbCommand, etc. etc. The idea was that at runtime, we'd marshal these classes into the type specific provider objects based on the configuration. I was about 4 hours into this exercise when I realized I was pretty much rewriting SqlCommand, SqlDataAdapter, SqlConnection, etc. etc. What would stop me from using the Sql provider objects as my agnostic objects? At runtime, if my provider is configured for Oracle, I use the OracleHelper's “CreateTyped“ commands to marshal the various objects into Oracle objects, of course talking to them through the Interface. As a shortcut, if my provider is configured for Sql at runtime, I just use the objects as they are.

The neat fall out feature from this is that you can write an entire application against SqlServer, using SqlHelper if you like, and if you are thrown a curve ball to use Oracle, the only managed code changes are to change your reference from SqlHelper to DbHelper and everything else all works. Mileage will of course vary depending on how many times you used Sql'y things like “Top 1“. Just as importantly however, developers using this data block learn only 1 type of data access and the same technique applies to all other databases.

One of the sad things is how this thing evolved. In the beginning there was bits of no less than 3 data access blocks in this class library. I spun my wheels quite a bit going back and forth and doing a lot of prototyping under some heat from developers who need to start writing dac's. Because I was starting from chunks of existing code, somehow NUnit tests didn't magically happen. So I've spent the past few days working to a goal of 90% coverage of the code by NUnit tests. It's tough starting from scratch. Not only have I been finding & fixing lots of bugs, you won't be surprised that my testing has inspired the odd design change. I'm really glad I got the time to put the effort on the NUnit tests because sooner or later those design changes would have been desired by developers using this block - and by that time - the code would have been to brittle to change. Certainly some of my efforts would have been greatly reduced had I made these design changes sooner in the process. I'm not new to the fact that catching bugs earlier makes them cheaper to fix - but nothing like getting it pounded home. I'll be pretty hesitant to take on another exercise of editing some existing code without first slapping on a bunch of NUnit tests that codify my expectations of what the code does.

 

ASP.NET Whidbey at CTTDNUG Tonight.

I'm presenting an overview on ASP.NET 2.0 tonight at CTTDNUG.

There isn't a great abstract on the site - and in fact, I will physically be unable to do the objectspaces stuff since the new version of VSNET CTP doesn't even have it in it anymore. Don't read into that - objectspaces will still be coming out - at some point. I should be able to give some nice objectspaces PPT's if the crowd is interested - but I'm guessing that Demo's are going to be more enjoyable.

So I am going to do my best ScottGu thrie impersonation and give a good solid demo lap around ASP.NET. IDE Improvements, Master Pages, the new datasource stuff, Site Navigation, Security, Personalization, SqlCaching.

Downtown Metro Toronto .NET UG Inaugural Meeting!

Finally a downtown user group.  First week of every month - and the first one is April 1st - no fooling..at 200 Bloor St. East (Manulife) at Jarvis. This is also the first date on the MSDN Canada .NET User Group Tour across Canada. There is also a raffle for an XBox.

The sad news is that this meeting is going to get cut off at the first 200 people - so register soon by sending an email to GrahamMarko@rogers.com.

http://www.metrotorontoug.com/

speaker: Adam Gallant
location: Manulife Financial Building 1st Floor 200 Bloor Street East Toronto

Better Web Development

In this session, we will focus on some fundamentals in web development, including a special drill-down on security and caching. We will cover an overview of the .NET security, and specifically important aspects in ASP.NET security and best practices. We will also cover, at a high-level, the caching mechanisms used by ASP.NET.

Security for Developers

Why is that you can't plug a fridge into your house until it's been CSA or FCC approved, and that you have to have a licensed electrician install or at least review any modifications to the wiring in your house plugged into the grid - but any yahoo can build a piece of software and install it on their home computer connected to the internet for the world to hack into? Before I make a case that developers should be forced to do some security training or pass some certification....we have to keep in mind that most of the time the software sitting on somebody's home computer that is getting hacked into is Microsoft's. This is largely due to the size of the huge target on their back. What you think Linux is really more secure? What do you think is easier to hack into? It's easier to hack into something when you have the source code.

So, having said all that, there are changes coming for Microsoft developers in the security space:

  • Microsoft is turfing a few of the existing security training offerings. These include the Microsoft Security Clinic (2800) and Security Seminar for Developers (2805).
  • There is a new security course being developed: Developing Secure Applications (2840) and also a MS Press Training Kit both of which I'll be reviewing during their development.
  • Related to the new course and training kit, there is a new security exam for developer which unfortunately because of timing is only an MCAD/MCSD/MCDBA elective (not a required element - sigh). There are 2 versions - 1 for VB and 1 for C#. I guess they figure C++ and J# developers already write secure code. These are going into beta at the end of next month and I'll be auditing the C# version.
    71-330 Implementing Security for Applications with Visual Basic .NET
    71-340 Implementing Security for Applications with Visual C# .NET

I'm a little torn over this direction. Part of me says that security is so important, it needs to be covered in every MS Training course. To a certain extent that is already true, but I think they could go deeper. When I teach a windows, web or services course, I try to go deep on security. Sometimes you can go to far. Some pieces of security are more relevant to the type of application you are building, while other security issues are common regardless of the application architecture. Obviously we don't want to repeat a lot of content in each course - sometimes that is unavoidable. The other issue is that there is a lot to know about security and frankly I don't think every developer can master all of this. So teams need to dedicate a security architecture role on their project. For these folks - then yes I think it makes sense to have specific and deep training and certification for them. I think MS could probably do better than a single exam “elective”. How about an MCSD.NET+Security designation? MCDBA+Security as well - although you could argue that MCDBA's should be forced to have this security. Perhaps that will happen in the wake of Yukon - although I've heard no rumblings of creating Whidbey or Yukon flavours of exams or certifications at this point.

Our industry and profession needs to take a leadership role and be proactive in accepting responsibility and accountability for the important issue of security. We need to move our discipline to a higher level. I'm not convinced it has to be government that steps up to this place. Governments should only do what we can't do for ourself. Microsoft seems to be taking an increasingly proactive role on these security issues. It will be interesting to see how this pays off in 2-3 years from now.

ADO.NET rant

Why is SqlDbType in the System.Data namespace when all the other provider specific types are in their own specific provider namespace? There is definitely some ugliness going on here. I'm not sure it's entirely a mistake.

As an aside, why is it SqlDbType and not SqlType? I can understand why OleDbType is named the way it is, but OracleType and OdbcType seemed to be named appropriately. Maybe it has something to do with the fact there is a System.Data.SqlTypes namespace and that would be just too close for comfort in the naming. Ok, so why isn't there an OdbcTypes namespace or any other types namespace for that matter? And shouldn't System.Data.SqlTypes be under System.Data.SqlClient.SqlTypes?

So what's the deal with IDataParameter and it's descendent IDbDataParameter? All of the typed provider parameter implementations implement both of these. Shouldn't they be collapsed into one? Even the IDbCommand.CreateParameter method has to return a IDbDataParameter. Furthermore, the online help for IDbDataParameter says it includes stuff for mapping to dataset columns. That's odd because the only 3 members of IDbDataParameter are Precision, Scale and Size. I'm pretty sure none of those having anything to do with Datasets. In fact, it's IDataParameter that provides this mapping in the SourceColumn member. Sheesh

Talk about your inconsistencies. It's still in WinFx from what I can tell. Please someone tell me there is a reason for this madness.

 

Debunking Dataset Myth

Many people think that datasets are stored internally as XML. What most people need to know is that Datasets are serialized as XML (even when done binary) but that doesn't mean they are stored as XML internally - although we have no easy way of knowing, it's easy to take a look at the memory footprint of datasets compared to XmlDocuments.

I know that if datasets were stored as XML, then in theory, datasets should be larger since BeginLoadData/EndLoadData implies there are internal indexes maintained along with the data.

It's not easy to get the size of an object in memory, but here is my attempt.

long bytecount = System.GC.GetTotalMemory(true);
DataSet1 ds =
new DataSet1();
ds.EnforceConstraints =
false;
ds.Order_Details.BeginLoadData();
ds.Orders.BeginLoadData();
ds.ReadXml("c:\\test.xml");
bytecount = System.GC.GetTotalMemory(
true) - bytecount;
MessageBox.Show("Loaded - Waiting. Total K = " + (bytecount/1024).ToString());

long bytecount = System.GC.GetTotalMemory(true);
System.Xml.XmlDocument xmlDoc =
new System.Xml.XmlDocument();
xmlDoc.Load("c:\\test.xml");
bytecount = System.GC.GetTotalMemory(
true) - bytecount;
MessageBox.Show("Loaded - Waiting. Total K = " + (bytecount/1024).ToString());

I tried these examples with two different xml files - both storing orders & orderdetails out of the northwind database. The first example was the entire result set of both tables. The dataset memory size was approximately 607K. The XmlDocument was 1894K, over 3 times larger. On a second test, I used only 1 record in both the order and order details tables. The dataset in this case took 24K and the XmlDocument took 26K, a small difference.  You will notice that in my dataset example I have turned off index maintenance on the dataset by using BeginLoadData. Taking this code out resulted in a dataset of 669K, an increase of approximately 10%. An interesting note is that if you put in a BeginLoadData and EndLoadData, the net size of the dataset is only 661K. This would imply that leaving index maintenance on during loads is inefficient in memory usage.

The speed of loading from XML is a different story.  Because the XmlDocument delays (I'm assuming) the parsing of the XmlDocument, the time to load of the full dataset from an XML file is 1/3rd of the time to load the DataSet from XML. I would be careful in being too concerned about this. Loading a dataset from a relational source like a DataAdapter that involves no Xml parsing and is much faster.

If you load up Anakrino and take a look at how the Dataset stores it's data, each DataTable has a collection of columns, and each column is in fact a strongly type storage array. Each type of storage array has an appropriate private member array of the underlying value type (integer, string, etc.). The storage array also maintains a bit array that is used to keep track of which rows for that array are null. The bit array is always checked first before going to the typed storage array and returns either null or the default value. That's pretty tight.

The GAC Exposed

So you want to see what's in the Gac. Of course if you go to c:\windows\assembly in your explorer - you see a customized shell extension of the global assembly cache. If you want to see the actual files underneath, in the past I've always gone to the command prompt and dir myself into long file name oblivion.

To get rid of that shell extension, just add a new DisableCacheViewer registry entry (type DWORD) underneath the key HKLM\Software\Microsoft\Fusion and set the value to 1 and presto - it's gone. C:\windows\assembly has never looked so good. Of course, don't do this on your end users' machine as this is really just developer requirement to help figure out what's going on and what's really in the GAC.

If that doesn't help you debug your assembling binding problems - don't forget about FUSLOGVW.exe. But that's another blog entry for another day.

Upstaged by Ballmer

So it would seem I'm upstaged by Steve Ballmer who is coming to town the same day as the CTTDNUG presentation I was making about Whidbey.  So in the interest of the greater good - my talk has been postponed until Mar 31.

The “Ballmer Developer Briefing” is mostly about Security...if that interests you?

What do you mean “IF” - of course that should matter to you. It should matter to everybody. Writing secure code isn't just about logging in you know. It's about keeping your code safe and more importantly your end users machines and data safe and not allowing your software to act as a gaping hole into their system or data be it through spoofing or SQL Injection Attacks. Writing code these days is more of a liability than it ever has and we all have to be responsible - so do yourself and the rest of the world a favour and brush up on your knowledge of security. Either that, or hire a good lawyer.

I'm so convinced this is a an important event (and sorry that I had to cancel my presentation) that ObjectSharp is co-sponsoring a bus for members of the CTTDNUG that will travel to Toronto from Kitchener and back. And for those of you no where near Kitchener? Did I mention that parking is free?

See you there.