Data Driven Development

So we have Test Driven Development and Model Driven Development or Design by Contract (a similar perspective). But in the past, I've been a fan of Data Driven Development. This is a technique I haven't had the pleasure of using recently, because it relies on you building new applications with new databases.

What is this technique, you ask? Well, for me it is designing the data model first. In the early days of Client/Server, PowerBuilder and ERwin were my tools of choice. New applications. New databases. My design process (and that of many of my associates) was not so much to design a database as to document the data that existed in the organization - and do that in 3rd normal form. ERwin still stands as one of the best modeling tools ever because it actually made the job of coding up a database schema easier and faster than any other alternative. I could also use my model throughout the entire lifecycle since it did an excellent job at full round trip engineering/synchronization.

One of the cool features of PowerBuilder was the ability to annotate your database schema with UI hints. So you could say that a given column in your database should by default be shown as a checkbox, and that checked should be saved as “true” and unchecked as “false” - or whatever weird thing your DBA said it had to store. Whenever you designed a screen with that column, bam, you'd have it the way you'd expect - as a checkbox. The downside of PowerBuilder's datawindows, of course, was that the data store/entity/container was pretty tightly tied to your database, and they made no attempt to hide that fact. But boy, productivity was really high - although I was producing tightly coupled, loosely coupled code :( .NET lets me build better code now, but productivity is still lacking.
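To make the UI-hint idea concrete, here is a minimal sketch in Python. It is not PowerBuilder's actual mechanism; the `CheckboxHint` class, the `UI_HINTS` table and the `customer.is_active` column are all hypothetical names chosen for illustration. The point is only that the hint lives next to the schema, so every screen that shows the column gets the same control and the same stored values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckboxHint:
    """Hypothetical UI hint: show a column as a checkbox and map it to stored values."""
    checked_value: str = "true"
    unchecked_value: str = "false"

    def to_stored(self, checked: bool) -> str:
        # Translate the checkbox state into whatever the DBA wants stored.
        return self.checked_value if checked else self.unchecked_value

    def from_stored(self, stored: str) -> bool:
        # Translate the stored value back into a checkbox state.
        if stored == self.checked_value:
            return True
        if stored == self.unchecked_value:
            return False
        raise ValueError(f"unexpected stored value: {stored!r}")


# Hypothetical schema annotations, keyed by (table, column).
UI_HINTS = {("customer", "is_active"): CheckboxHint("Y", "N")}


def render_field(table: str, column: str, stored: str) -> str:
    """Render a column using its hint, falling back to a plain text box."""
    hint = UI_HINTS.get((table, column))
    if hint is None:
        return f"[textbox: {stored}]"
    mark = "x" if hint.from_stored(stored) else " "
    return f"[{mark}] {column}"
```

Calling `render_field("customer", "is_active", "Y")` yields a checked box without the screen knowing anything about "Y" and "N", which is also exactly the coupling to the database that the datawindow made no attempt to hide.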

At TechEd a couple of weeks ago, I stopped by the DeKlarit booth for a demo of their product by their lead architect, Andres Aguiar. I was happy to see a tool that builds upon the Data Driven Development process. Of course, you don't have to start with an empty database, but this tool does an excellent job of making your job easy when starting from scratch. Andres promised to send me an eval so I can play with it some more and see how it works with existing databases, so stay tuned. I could easily see this tool paying for itself in a matter of a couple of weeks.

As for ERwin, I'm still a fan, although it really hasn't changed much in the past 10 years. I remember the first copy I had fit on a single floppy. So did the 200 table model I created with it. I had been using LBMS System Designer, which stored my model in some kind of 10 MB black hole and took 10 minutes to generate a schema. When I first installed ERwin, I had it installed, had reverse engineered my LBMS model, and had forward engineered it from Oracle to SQL Server, all inside of 10 minutes. I couldn't believe the schema generation took 20 seconds compared to LBMS at 10 minutes.

SOA Challenges: Entity Aggregation

Ramkumar Kothandaraman has a good article just released on MSDN discussing SOA Challenges: Entity Aggregation. Aggregation is a much better name than “composable entities” since its definition implies that the property set of an entity grows as more child entities are merged into it. This also implies that you need a mapping layer and conflict resolution to resolve duplicate property names, or just to rename them for that matter.

This is becoming an important technique for passing XML documents up a stack of web services, each one adding its own value to the entity - or aggregating in a master/slave hierarchy. Either way, one of the subtle things about entity aggregation is that you can also think of it as a lightweight form of multiple inheritance for the properties of your domain objects. Is that useful or am I just bent?
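The mapping and conflict-resolution step described above might look something like the following Python sketch. It is not taken from the MSDN article; the `aggregate` function, its `rename` map and the prefixing rule for duplicate names are assumptions made purely for illustration.

```python
def aggregate(parent: dict, child: dict, child_name: str, rename: dict = None) -> dict:
    """Merge a child entity's properties into a parent entity.

    rename is a hypothetical mapping layer (child property -> new name).
    Any remaining duplicate name is resolved by prefixing it with the
    child's name, which is one possible conflict-resolution policy.
    """
    rename = rename or {}
    merged = dict(parent)
    for key, value in child.items():
        target = rename.get(key, key)
        if target in merged:
            # Conflict: fall back to a prefixed name.
            target = f"{child_name}_{key}"
        if target in merged:
            raise KeyError(f"cannot resolve duplicate property {key!r}")
        merged[target] = value
    return merged


order = {"id": 1, "total": 99.5}
customer = {"id": 7, "name": "Acme"}

# The duplicate "id" is prefixed; "name" is renamed by the mapping layer.
combined = aggregate(order, customer, "customer", rename={"name": "customer_name"})
```

Here `combined` carries the properties of both entities, which is where the "lightweight multiple inheritance" view comes from: the aggregate exposes the union of the property sets, with a policy deciding what happens when two parents define the same name.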

How do you feel about the VS.NET Query Designer?

The VS Data Team wants your input. Head over here. (BTW, don't you love these surveys? They're the best and tell me that MS really cares about what we think).

Layered Design and DataSets for Dummies

Scott Hanselman does a nice 30-second intro to layered design. If any of this is new to you, run quickly to read this.

Scott does a quick bash at DataSets (although he doesn't say why), and in my new role as DataSet boy I have to disagree with him and evangelize how simple datasets can make a lot of the code written by the typical programmer: CRUD stuff, for example. He even mentions “Adapter” in describing a data access layer - come on, use the DataAdapter - don't be afraid. In general, if anybody tells you to never do something, you need to question that a bit and dig into the reasons why a technology exists. Of course things may just end up being rude and the answer is indeed never - but always try and get the why.

We've been running a developer contest at the end of some of our training courses (the big 3 week immersion ones). The competition has developers build a solution on a services oriented architecture which includes smart client, web services, enterprise services/COM+ and of course data access. It's only a one-day event, but the teams are made up of 5-6 people each. Inevitably, if one team decides to use datasets/dataadapters and the other team doesn't, the team that chooses the dataset wins. The judging isn't skewed toward datasets by any means. But inevitably this is the thing that gives the dataset team more time to work on the interesting pieces of the application (like business logic: features and functions).

I overheard Harry Pierson tell a customer last week that they shouldn't use datasets in a web service because they aren't compatible with non-.NET platforms. This isn't true. A dataset is just XML when you return it out of a web service. And you probably have more control over the format that is generated via the XSD than most people realize. If you want a child table nested, no problem. You want attributes instead of elements, no problem. You want some columns as elements and others as attributes, no problem. You want/don't want embedded schema, no problem. You don't want the diffgram, no problem. Somebody in the J2EE world has actually gone to the extent of creating a similar type of base object in Java that can deserialize the dataset in most of its glory. (Link to come - can't find it right now).
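To illustrate why tabular data shaped as plain XML is consumable from any platform, here is a small Python sketch that writes rows out with some columns as attributes and others as elements. It does not use or reproduce the DataSet's own serializer; the `rows_to_xml` helper, the `Customers` root element and the sample columns are hypothetical.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(table: str, rows: list, as_attributes: tuple = ()) -> str:
    """Serialize rows to XML, choosing per column between attribute and element."""
    root = ET.Element(table + "s")  # hypothetical root name, e.g. "Customers"
    for row in rows:
        row_el = ET.SubElement(root, table)
        for column, value in row.items():
            if value is None:
                continue  # in this sketch a null column is simply omitted
            if column in as_attributes:
                row_el.set(column, str(value))
            else:
                ET.SubElement(row_el, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = rows_to_xml(
    "Customer",
    [{"id": 1, "name": "Acme", "phone": None}],
    as_attributes=("id",),
)

# Any XML-aware consumer can read it back without knowing how it was produced.
parsed = ET.fromstring(xml_text)
```

Nothing in the resulting document is specific to the producer, which is the same argument being made for a dataset returned from a web service: what crosses the wire is XML whose shape you control.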

In February I posted “Benefits of Datasets vs. Custom Entities”, which has generated some excellent feedback. It's still in my plans to write the opposite article - when Custom Entities are better than Datasets - but I'm still looking for the best template or example entity. Every one somebody has sent me to date is somewhat lacking. To be fair, I always end up comparing them to a dataset. The things typically missing from a custom entity are the ability to deal with Null values and the ability to track original values for the purposes of optimistic concurrency. The answer to the question of “When to use a Custom Entity over a Dataset?” is of course when you don't need all the stuff provided for you by a dataset. So is that typically when you don't care about Null Values or Optimistic Concurrency? Perhaps. But I know there is more to it than that.
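As a rough idea of what "tracking original values for optimistic concurrency" asks of a custom entity, here is a minimal Python sketch using an in-memory SQLite database. It is not how the DataSet or DataAdapter is implemented; the `TrackedRow` class and the `customer` table are hypothetical. The row remembers the values it was loaded with, and the update only succeeds if the database still holds those values, nulls included.

```python
import sqlite3


class TrackedRow:
    """Hypothetical entity that keeps original values alongside current ones."""

    def __init__(self, table: str, values: dict):
        self.table = table
        self.original = dict(values)
        self.current = dict(values)

    def changed(self) -> dict:
        return {c: v for c, v in self.current.items() if v != self.original[c]}

    def update(self, conn: sqlite3.Connection) -> bool:
        """Write changes only if the row still matches the original values."""
        changes = self.changed()
        if not changes:
            return True
        set_sql = ", ".join(f"{c} = ?" for c in changes)
        # SQLite's IS operator is a null-safe equality, so an original
        # value of None (NULL) is compared correctly.
        where_sql = " AND ".join(f"{c} IS ?" for c in self.original)
        sql = f"UPDATE {self.table} SET {set_sql} WHERE {where_sql}"
        params = list(changes.values()) + list(self.original.values())
        cursor = conn.execute(sql, params)
        if cursor.rowcount == 1:
            self.original = dict(self.current)
            return True
        return False  # somebody else changed the row first


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Acme', NULL)")

loaded = {"id": 1, "name": "Acme", "phone": None}
row_a = TrackedRow("customer", loaded)  # two users load the same row
row_b = TrackedRow("customer", loaded)

row_a.current["phone"] = "555-0100"
first = row_a.update(conn)   # succeeds: the row is unchanged since it was loaded

row_b.current["name"] = "Acme Ltd"
second = row_b.update(conn)  # fails: phone is no longer NULL
```

Even this small sketch has to handle both nulls and original values, which is the point of the comparison: a custom entity that skips them is simpler, but only because it does less.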

I will say there is some crummy binary serialization in the dataset (it's really XML). This is really a problem if you are doing some custom serialization or need to do some .NET remoting. But you can change the way they are serialized (and indeed it's changed in Whidbey). There are some good examples here, here, here, here, here and here.

I'm working on an article making the case for the custom entity, but in the meantime, datasets represent a good design pattern for entities that is easy and quick for the average developer to implement - and scalable too.

TechEd (Day 3): Hands On Lab Manuals downloads available to the public

No need to have a TechEd CommNet password. You can download ALL the PDFs for the plethora of topics. Some good stuff to see how the newly announced stuff (Team System, etc.) works.

Update: These links are broken; give this a try:

TechEd (Day -1): ObjectSpaces = Longhorn, 3rd time charm?

ObjectSpaces was bumped from the initial 1.0 release and then, as of the last PDC, slated for Whidbey. Now it looks like we'll have to wait until Longhorn.

It's not all bad news, however. ObjectSpaces is being re-orged into the WinFS file system. When you think about it, there is an awful lot of correlation between those technologies. I'm sure it's not terribly unrelated to the fact that the Microsoft Business Framework (MBF) that was to build on ObjectSpaces was also pushed off to Longhorn/Orcas. MBF is also to rely on orchestration engine (BizTalk light?) features going into Longhorn, so it all makes sense.

Some people will be disappointed - but this is a good rationalization of the way too many data access/storage visions within Microsoft. Both of these technologies have a common thread about objects/applications and data and breaking down the wall between them. Sure, MS could have released ObjectSpaces first, but do we really need that legacy and all the effort attached to YADAA (yet another data access API)?

Microsoft has taken a lot of criticism (including from me) about the seemingly constant churn of all things data. So this is a good sign that MS is not going to do things just because they can, but do them right. Just ask a Java developer what they think of EJBs. It's important to get it right.

TechEd (Day -1): Pieces of ADO.NET 2.0 not going to make Whidbey

Highlights in this exchange of newsgroup posts...

  • Paging is cut
  • Server cursors cut (SqlResultSet class)
  • Async Open is cut, but not execute
  • SqlDataTable class cut (Good - that was a bad design pattern to propagate)
  • Command sets are cut, but you can still do batch DataAdapter updates

More to come, stay tuned.

Arrived in San Diego @ TechEd

One uneventful pair of connecting flights, a car rental pickup and a check-in at the hotel. It's the calm before the storm, you know, the calm associated with hotel internet access actually working. You know it's a computer conference when you can see 5 wireless access points from various people's hotel rooms, and I'm on the corner of the hotel!

I kind of drove in the back way and haven't been near the convention centre yet to check out the buzz, but I will tomorrow. The MCT day starts at 7:30am at the Marriott next door. I better get to bed.

There is a bit of a buzz on email right now - stuff I'm not allowed to talk about until Monday - but it's one of my speculations. More about that on Monday. There will be a lot of announcements on Monday. Stay tuned.

VS Live Party

We threw a party on Thursday night after VS Live Toronto to help blow off some steam. VS Live in Toronto was a good time. A few people agree.

  • Jay Roxe was one of the speakers and joined us for a night on the town.
  • Datagrid Girl Marcie Robillard too. Turns out we share some PowerBuilder history from back in her days with Andersen Consulting. Marcie was also one of the speakers. I watched her presentation to see if I could pick up any public speaking tips, but I left having learned some technical things. A) You can do a DataSet.ReadXml and pass it a URL, not just a filename. B) The file/URL you point it at can be any reasonably formed XML document - not just a previously saved dataset. She loaded the RSS feed from the Code Project. Cool. In practice, an untyped DS does lots of inferring, which can be problematic, so stay tuned for a fully fleshed out tip on doing some typed DS loading of XML docs.
  • Mike Flasko has posted some pictures from VS Live. Mike is on the Imagine Cup Canadian winning team. Be sure to check out the sub folder from our party. Elisa Johnson and Jason Kemp, also from the team, were there. A very nice group of people I was glad to meet.
  • Thanks to Billy Hollis, Keith Pleas, Paul Yucknovic??, Rob Windsor, David Totzke, Chris Kinsman and of course the rest of the ObjectSharp clan for coming out on the town.


OpenSource Project for Testing Microsoft Software

Over the past few months, whenever I question how something works in the .NET Framework (or when somebody asks me), I have been creating NUnit tests to verify the behaviour of some class and/or methods in the .NET Framework. Initially it is just to observe the behaviour or verify some assumptions, but by the time I'm finished, I usually inject various Assertions into my tests to tighten them up. These now serve as a test bed for me when moving to a new version (or even old versions) of the .NET Framework. I can answer the question: Are any of my assumptions about how the 1.1 framework works broken in 1.2? 2.0? 9.0? etc.
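The same idea can be sketched outside of NUnit. The example below uses Python's standard `unittest` module against the Python standard library rather than the .NET Framework, purely as an illustration of the technique; the class name `StdlibAssumptions` and the particular assumptions tested are my own choices. Each test pins down one assumption about platform behaviour so that it can be re-run against any other version.

```python
import unittest


class StdlibAssumptions(unittest.TestCase):
    """Each test records one assumption about how the platform behaves."""

    def test_round_uses_bankers_rounding(self):
        # Assumption: ties round to the nearest even integer.
        self.assertEqual(round(2.5), 2)
        self.assertEqual(round(3.5), 4)

    def test_sorted_is_stable(self):
        # Assumption: items with equal keys keep their original order.
        pairs = [("b", 1), ("a", 1), ("c", 0)]
        self.assertEqual(
            sorted(pairs, key=lambda pair: pair[1]),
            [("c", 0), ("b", 1), ("a", 1)],
        )

    def test_dict_preserves_insertion_order(self):
        # Assumption: iteration follows insertion order (guaranteed since Python 3.7).
        d = {}
        d["z"] = 1
        d["a"] = 2
        self.assertEqual(list(d), ["z", "a"])


def run_assumptions() -> unittest.TestResult:
    """Run the assumption tests and return the result for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StdlibAssumptions)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Running `run_assumptions()` under a new interpreter version answers the same question asked above of the framework: do any of my assumptions break on upgrade?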

I'm building up a nice collection and I might publish my work. But it struck me that this could be an open source project. In fact, I think it should be an open source project, and I think it should be started by Microsoft - and not necessarily for the .NET Framework alone, but that would be an easy place to start.

Microsoft has faced increasing pressure over the security and quality of their software - to the point that they've actually made Windows source code available to key customers, governments and MVPs. That's a bit risky if you ask me. I think it is also a bit hypocritical to point the finger at Linux for being “more hackable because source code is available” but at the same time make your own source code available to the Chinese government.

But why not publish the source code to unit tests (say NUnit fixtures) in an open source format for the community to contribute to? When one of these security firms finds a hole in some MS software, they could create an NUnit test to expose it and submit it to Microsoft to fix, and then make the code for that NUnit test part of the open source project.

Publishing source code does little to give people any kind of comfort in the code. Publishing unit tests, on the other hand, is publishing assumptions and expectations about what the software is supposed to do and how it is supposed to behave. I would think this would become more important over time, especially moving towards WinFX and Longhorn.