Building Cathedrals, Bazaars, and Mystery Houses

Almost since software development became a profession, the practice has been compared to the construction industry.

In my lectures on software engineering, I often use this sphere of analogy to contrast the process of architecting a cathedral with the evolution of a bazaar, comparing waterfall-style, can't-have-too-much-UML monolithic architecture with a more organically evolving service-oriented architecture.

I've heard others use the cathedral and the bazaar analogy for a few other software engineering comparisons, but I believe the first use can be attributed to Eric S. Raymond's book "The Cathedral and the Bazaar" where he observes how Linux was built with an open source model.

One must be careful not to use Bazaar-style engineering as an excuse for abandoning strong engineering discipline and a thoughtful planning process. Otherwise you risk having your Bazaar end up like the Winchester Mystery House - another common software-construction analogy, and not one you want your software compared to.

.NET Applications on a Mac with WPF/E

As you might be aware, alongside Vista, Microsoft released .NET 3.0. This is an additional set of libraries on top of .NET 2.0, so your existing apps continue to work and the CLR runtime is otherwise unaffected.

This incremental set of .NET 3.0 libraries includes useful pieces for building distributed applications in new ways:

  • Windows Communication Foundation
  • Windows Workflow Foundation
  • Windows Presentation Foundation

It should be noted that although these are released and pre-deployed with Vista, they are backward compatible and run on XP and Windows Server 2003.

Windows Presentation Foundation (WPF) is for building very rich (i.e. 2D, 3D, animation, video and audio) client-side applications, and includes a markup language (XAML) to help out with that.

MS is already at work on a new derivative of WPF, namely WPF/E - the E is for Everywhere. Everywhere is a bit of a stretch, but it will include other browsers (Firefox, Safari, etc.) and other platforms - namely the Mac. It's not yet clear whether this will run on J2ME, BlackBerry, or the Mobile Framework.

So with WPF/E we have this new thing that can deliver rich multimedia content and run in any browser, on any platform - some would say a Flash killer. With that in mind, here are a few tidbits I find interesting.

  1. You can script with JavaScript, like Flash. There's a nice demo at http://thewpfblog.com/ where a cityscape is split in half - one side Flash, the other side WPF/E - and a blue disc is scripted to bounce from one side to the other using JavaScript glue.
  2. The source code - and what's published to the browser - is a markup language, namely XAML.
  3. One of the reasons we like HTML so much is that search engines can parse and index it. That makes it easy to find stuff. Anybody who has built a fancy front end to their website in Flash knows, however, that the content pushed to the browser is a binary format executed by the Flash runtime. It is therefore a black box that can't be indexed by search engines.

    Remember I said WPF/E uses XAML - a markup language. Well the cool part is that this is pushed out to the browser so indeed it can be indexed. No reason why a rich WPF/E document couldn't be searched by Google & Live. It should be interesting to see what happens and how engines like Google, Live, Yahoo, etc. crawl, rank, search & render results of XAML content. Hmmm, search engine wars aren't over yet.
  4. CLR integration is one of the trickier features that Mike Harsh says they are currently scoping. Being able to use the CLR with WPF is of course a key feature, so is this going to run on the Mac? I've been expecting a commercially supported CLR for the Mac for some time - since the Rotor project, and more recently since the Mix 06 demo of the MiniCLR. I've seen a lot of folks excited about the possibility of cross-platform .NET development. There are obviously technical and commercial hurdles to overcome, and I suspect the technical hurdles will be no less challenging than any other attempt at cross-platform support.

I think if we keep our perspective on WPF/E in the same context as Flash, we'll find good uses for this technology and enjoy a more consistent development platform, from the CLR in our SQL database right through to our browsers and devices.

WPF/E is expected to ship in the first half of 2007, but the December Community Technology Preview (CTP) is downloadable now. http://www.microsoft.com/wpfe

TechEd06-Day1: Precon with Ron Jacobs ARC001

I left for TechEd bright and early at 6:30am this morning and have arrived in beautiful Boston. I've checked in, and my first job is to attend Ron Jacobs' Pre-Conference on Software Architecture. This is a 6-hour session broken into sub-areas. During the day, in between sections, Ron will be interviewing real-world architects on stage about the subjects he is covering. He will be interviewing me on the topic of Requirements, and a little later on Design Patterns.

Ron asked me specifically not to prepare anything so that I can respond to his content unfiltered, honest, and open. This should be fun.

A few good links of interest:

  • Ron Jacobs blog
  • SkyScrapr.net - an interesting site that Ron's team has been working on for educating folks on Architecture. The site is new and shiny so the content is still coming.

 

Are we still talking about Stored Procedures vs. Dynamic SQL?

Rob Howard and Frans Bouma still are. And I guess I am now too. Let's summarize a few of the facts from these counterpoints:

  • Any arguments about pre-compilation or cached query plans are moot between dynamic SQL and procs. Rob has some outdated information and Frans corrects it in his post.
  • Stored procedures, if designed properly, can offer the performance benefits Rob claims by avoiding round trips and unnecessary data transfer when getting computed or aggregated data out of the database.
  • Both are susceptible to SQL injection attacks if the SQL is concatenated with parameter values (see the sketch after this list).
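To make that last point concrete, here is a minimal sketch of the difference, assuming a hypothetical Orders table and CustomerName column; the same parameterization applies whether the command text is dynamic SQL or a stored procedure call.

  using System.Data;
  using System.Data.SqlClient;

  public static class OrderQueries
  {
      // Vulnerable: user input is concatenated straight into the SQL text.
      public static SqlCommand BuildConcatenated(SqlConnection conn, string customerName)
      {
          return new SqlCommand(
              "SELECT * FROM Orders WHERE CustomerName = '" + customerName + "'", conn);
      }

      // Safer: the value travels as a parameter, never as executable SQL.
      public static SqlCommand BuildParameterized(SqlConnection conn, string customerName)
      {
          SqlCommand cmd = new SqlCommand(
              "SELECT * FROM Orders WHERE CustomerName = @customerName", conn);
          cmd.Parameters.Add("@customerName", SqlDbType.NVarChar, 100).Value = customerName;
          return cmd;
      }
  }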

Let's talk about security. Frans thinks that role-based security is the way to get fine-grained security in your database while using embedded or dynamic SQL. Frans's solution of adding users and roles in the database is a dated technique that goes back to two-tier client/server systems. Web-based or otherwise distributed applications typically use a connection pool - and unless you are going to have a connection pool for each role, you can't rely on SQL Server role-based security to be your cop. Frans goes on to talk about how views can be used to encapsulate security rules, just like a stored procedure.

Both Frans and Rob talk about the brittleness of SQL with regards to schema changes. Rob thinks your SQL centralization/encapsulation should occur inside stored procedures. Frans thinks you should do this in a data access component that is part of your application. Frans hasn't really explained what his application's component does specifically, but it sounds like he prefers to create the SQL dynamically on the fly by reflecting on the schema of entities in his application.

What both of them have avoided is any realization that talking to a SQL Server database is the same problem as talking to any external service. Whose responsibility is it to provide the encapsulation and a deep understanding of the underlying database schema? The answer to that question can't be given universally. Back in May 2005, I blogged about the notion of DatabaseAsService.

Is your database a shared service between several applications? Some folks might even go as far as to say that their database is an enterprise service. Especially in this case it makes perfect sense to encapsulate complex internal schematics inside the single shared resource - the database. This can be done with stored procedures or views, but do you really want each application to have intimate knowledge of deep schema details? That's brittle way beyond the scope of a single application.

In other cases, your database is more like a file in which your application persists its data, and it is not a shared resource. In these cases, the database is not really a service in terms of service-oriented architecture principles. In fact, I'd go as far as to argue that in these cases the db is such an intimate part of your application's design that there should be no "mapping" of schema inside/outside of the database, and that they could/should be the same. Go ahead and make the full set of tables/schema public to your application logic.

 

Class design considerations for extension methods and anonymous types

One of my readers was watching the DNRTV episode I did on LINQ recently and had this question: 

At some point, when you're explaining object initializers and anonymous types, you say something regarding extension methods, like how they could be used with the anonymous types. I'm not sure how that'd work: if the anonymous type gets named dynamically as something like "<Projection>F__4", and if the extension is declared during compile time as something like "method (this type)", how can we extend the dynamic class?

He is correct in pointing out the difficulty. These projection or anonymous types have dynamically declared names and when you declare an extension method, you must specify the type after "this" in the argument clause.

One thing that is not obvious is that the type specified after “this” in an extension method doesn't have to be the exact type, but can be an ancestor or interface implemented by the type you are ultimately wishing to extend.

public static float GetArea(this IPolygon shape){...}

As an example, the above extension method could be used as an extension method over anything that implements IPolygon.
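To illustrate, here's a minimal sketch of that idea; IPolygon's members, the Triangle class, and the GetPerimeter method are all hypothetical stand-ins.

  public interface IPolygon
  {
      float[] SideLengths { get; }
  }

  public class Triangle : IPolygon
  {
      public float[] SideLengths { get; set; }
  }

  public static class PolygonExtensions
  {
      // The extension is declared against the interface, not a concrete type,
      // so any implementer of IPolygon picks it up.
      public static float GetPerimeter(this IPolygon shape)
      {
          float total = 0;
          foreach (float side in shape.SideLengths)
              total += side;
          return total;
      }
  }

  // Usage:
  //   Triangle t = new Triangle { SideLengths = new float[] { 3, 4, 5 } };
  //   float perimeter = t.GetPerimeter();   // 12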

The downside is that anonymous types (and LINQ projections) inherit directly from object, and I don't expect that they will implement any special interfaces (the goal is to keep them simple for C# 3.0). What are you left to do? Create extension methods on "object". That is certainly a theoretical option I suppose, but it seems a little bit extreme.
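Just to show what that extreme option looks like, here's a rough sketch; the method name is mine, purely for illustration.

  public static class ObjectExtensions
  {
      // Because every type derives from object, this shows up on anonymous types too -
      // but it can only rely on the members that object itself exposes.
      public static string DescribeType(this object instance)
      {
          return instance == null ? "null" : instance.GetType().Name;
      }
  }

  // Usage against an anonymous type:
  //   var point = new { X = 1, Y = 2 };
  //   string typeName = point.DescribeType();   // the compiler-generated type name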

To be honest though, try to think of real cases when you'd want to extend an anonymous type. If the type is truly anonymous, you know absolutely nothing about it, and what assumptions can you really make about it in an extension method? Truly, in some cases, you are going to choose to implement a named type instead of an anonymous type, and it appears that we'll see refactoring support in the tools to promote an anonymous type to a real type. This is a very likely scenario.

The reasonable cases I can think of, where anonymous types are the preference (i.e. I have no burning need for a named type) but I still need to (and can) extend them, are when they are used in the context of a collection or another generic type.

For a good example of that, let's take a look at some of the extension methods provided by Linq itself.

  public static class Sequence {
    public static IEnumerable<T> Where<T>(
        this IEnumerable<T> source,
        Func<T, bool> predicate) {
      foreach (T item in source)
        if (predicate(item))
          yield return item;
    }
  }

Consider that this is an extension method for anything IEnumerable<T>. In this case, we're using the extension method against an IEnumerable collection of type <T> - a generic. That generic type could be an anonymous type. The important information is that we now know something more about the anonymous type: it's used inside an enumerable collection, and hence we can provide the value of iterating through it in the foreach, evaluating some criteria, and yielding the items that pass the criteria into another collection.
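As a rough usage sketch (Person, GetPeople, and the Select projection operator are assumptions here, not part of the snippet above), this is how Where ends up binding against a sequence of anonymous types:

  // Person and GetPeople() are hypothetical stand-ins for your own data.
  IEnumerable<Person> people = GetPeople();

  // Select projects each Person into an anonymous type; Where<T> from the Sequence
  // class above then binds with T inferred as that anonymous type, so we never
  // have to name the type ourselves.
  var adults = people
      .Select(p => new { p.Name, p.Age })
      .Where(x => x.Age >= 18);

  foreach (var a in adults)
      Console.WriteLine(a.Name);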

Introducing Windows Workflow Foundation (and you thought you wouldn't have to learn BizTalk)

If you're not using BizTalk today in your applications, probably the #1 reason is that you can't afford it. “It's overkill” is probably a big reason as well, but I consider that a variation. Seriously, how many applications do you write that don't have some component of workflow? Maybe you don't, but if it was baked into the framework, and you didn't have to install (and pay for) a workflow engine, maybe you'd take advantage of it - no?

Windows Workflow Foundation, a new component of WinFx, was announced today. Lots of great resources here, including overviews, labs, and even an MSDN Virtual Lab so you can play with this stuff without having to install it. Also keep tabs on the blogs of Scott Woodgate, Paul Andrew, and of course our own Matt Meleski.

You can download the beta 1 of the extension for Visual Studio 2005. Don't get too carried away yet - it's not going to be released with 2005, it will be released in the second half of 2006 (likely along with the rest of the WinFx bits).

Visual Studio Team Suite 2005 - Release Candidate Now Available

The VS Team Suite Release Candidate is now available on MSDN Subscriber Downloads. I'm downloading now - I'm assuming you'll also need the SQL Server 2005 September CTP, which was also released today. Happy downloading.

 

When is a database oriented as a service?

Do you consider your database a service? It's worthwhile to review the tenets of a service-oriented architecture. The first two tenets - explicit boundaries and autonomy - are probably the most relevant to my question.

If you do all of your data access through stored procedures, then you might say your database boundary is explicit.

If your database doesn't depend on other services or applications to exist properly, then you could say that your database is autonomous. That's a little tricky. Although we may use stored procedures to access functionality in our database, we may have well-known practices, such as having to call the ap_decrease_inventory proc after we call the ap_ship_order proc, to make sure that our database values are all in check. I wouldn't call our database autonomous if it has to rely on these external rules being enforced.
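Here's a minimal sketch of the kind of external rule I mean; the parameter name and the connection handling are assumptions, and the proc names come from the example above.

  using System.Data;
  using System.Data.SqlClient;

  public static class OrderShipping
  {
      // The database can't enforce this ordering itself; every client application
      // has to know the convention and follow it faithfully.
      public static void ShipOrder(SqlConnection conn, int orderId)
      {
          using (SqlCommand ship = new SqlCommand("ap_ship_order", conn))
          {
              ship.CommandType = CommandType.StoredProcedure;
              ship.Parameters.Add("@orderId", SqlDbType.Int).Value = orderId;
              ship.ExecuteNonQuery();
          }

          // Convention only: forget this second call and the inventory counts quietly drift.
          using (SqlCommand decrease = new SqlCommand("ap_decrease_inventory", conn))
          {
              decrease.CommandType = CommandType.StoredProcedure;
              decrease.Parameters.Add("@orderId", SqlDbType.Int).Value = orderId;
              decrease.ExecuteNonQuery();
          }
      }
  }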

I'm going to avoid the discussion of the last two tenets because I think they are a bit too pure for my question. I'm really just trying to differentiate between two types of databases that I see out there. For my purposes, I refer to these as Databases as Services and Databases as File Systems.

Databases as Services are typically well encapsulated and contain business rules. These databases might be supporting several client applications. You probably take great care with these databases, designing them carefully, perhaps with modeling tools, and encapsulating the persistence function with stored procedures, functions, triggers, etc. You may or may not have a well-defined data access layer in your client applications. You might consider all the stored procs to be your data access layer, so you might call your procs directly from the UI and/or business layers of your application, but that really depends on how well your client application is written. From your database's perspective, you don't really care so much, since it's a well-protected service that operates autonomously.

Databases as File Systems are much less strategic. They serve one purpose only - to save stuff from your application. You probably (hopefully) have a well-defined data access layer in your application. That may even be an Object Relational Mapping (ORM) tool. You probably designed the database to support the persistence of the objects in your application, and, to generalize, you probably only have one application using this database. The most important thing, though, is that all of your business rules should be in your application(s). This type of database doesn't mean you don't have db-side logic such as stored procedures or triggers. You may decide for optimization reasons that some code needs to live closer to the tables, and that's okay - so long as you realize it's harder to reuse some of that logic in higher layers of your application, and you are comfortable having your logic live on multiple platforms.

Stored procedures are increasingly being used to add encapsulation to our databases; performance is no longer the rationale for them. And increasingly, we are seeing advanced services in our databases - 4GL code such as Java and .NET managed code is making its way in, along with user-defined types and objects, and with the next version of SQL Server, a full-fledged message queuing mechanism in Service Broker. You can even host web services directly in SQL Server 2005.
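As a small illustration of managed code living in the database, here's a minimal SQLCLR sketch; the function itself is hypothetical.

  using System.Data.SqlTypes;
  using Microsoft.SqlServer.Server;

  public class DatabaseFunctions
  {
      // Once the assembly is registered with CREATE ASSEMBLY and the function with
      // CREATE FUNCTION ... EXTERNAL NAME, T-SQL can call this like any scalar function.
      [SqlFunction]
      public static SqlString FormatCustomerName(SqlString first, SqlString last)
      {
          if (first.IsNull || last.IsNull)
              return SqlString.Null;
          return new SqlString(last.Value + ", " + first.Value);
      }
  }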

Is your database a service? Which camp do you fall into? Unfortunately, I think many people live somewhere in between, and that isn't by design. Most of the architectural decisions here should be motivated by where you decide to draw your boundary for strategic reasons, not for what is handy at the moment. I'd like to see people more consciously make this decision and remain committed to it. What are your thoughts?

Why the VSTS Logical Datacenter Designer (er, Deployment Designer) Sucks

I've had this question in many of the VSTS bootcamps I'm teaching across Canada: "From my Application Diagram, how do I create a deployment diagram that shows my web application and database being deployed on the same box?"

So I posed the question to my friend and fellow RD Joel Semeniuk. The answer is:

with the LDD you CAN NOT represent a web site and a database server on the same logical server.

The Logical Datacenter Designer is used to create diagrams of interconnected logical servers that represent the logical structure of a datacenter. The key here is the term "logical server."

Full post here: http://weblogs.asp.net/jsemeniuk/archive/2005/04/07/397541.aspx

My understanding (hope) was different: that in the datacenter diagram, a logical server was a "type" of server, not a physical instance of a named machine. But if this is the way the LDD is going to work, then it's useless, and I guess what we really need is a Physical Datacenter Designer. To be honest, I don't think we need an LDD, just a DD that works correctly. Otherwise, how the hell can you create a deployment diagram out of something that doesn't represent real machines - or at least a type of machine?

If the LDD is going to continue to work this way, then the deployment diagram (and even the LDD) starts to look just like your Application Diagram. Furthermore, if a logical server is intended to be (possibly) aggregated with another logical server into a physical server, then why would you ever be allowed to put them in different zones? There is some serious impedance going on here. I seriously hope this gets fixed/repositioned before RTM. It would be sad to come this close to getting it right on a great suite of modelling tools.

More Class Designer Productivity Potential: Batch Editing.

Daniel Moth says that he's not excited about the properties box in the class designer and would prefer to use the code editor to make those kinds of changes. It may not be obvious, but one of the things you can do with that properties pane that you can't do in the code editor is make multiple changes across several classes or several members at the same time.

Select all of the items that you want to make a mass change to, and any common properties are shown in the properties dialog. I find this useful for decorating properties of my own components with custom attributes. Perhaps I want to change a bunch of methods to static.

Daniel mentions another limitation. There is no full signature support on the model surface in the class designer. This makes it impossible to see the differences between your overloads. In fact, overloads are all grouped together and a count is shown.

Another mass editing scenario would be to change the XML comments on a bunch of methods - for example, several overloads. You can't see the individual overloaded methods - just one of them with a "+1 overloads" next to it. Furthermore, when you change the comment for a method that is overloaded (and shown as "+2 overloads"), one would hope the comment would be applied to all of the overloads; however, it is only applied to the first one. Hopefully this is a bug that will be fixed. I've logged it with MS in the Product Feedback Center.