QuickSort in Functional C#

Here's a very simple QuickSort algorithm from Wikipedia, written in a functional style in C#. Notice how closely it resembles the pseudo-code version of the algorithm. The pivot used is simply sequence.First(); you can replace it with an element at a random position if you wish.

    public static IEnumerable<T> QuickSort<T>(this IEnumerable<T> sequence) where T : IComparable<T>
    {
        return !sequence.Any()
                   ? sequence
                   : (from x in sequence.Skip(1)
                      where x.CompareTo(sequence.First()) < 0
                      select x).QuickSort()
                     .Concat(new[] { sequence.First() })
                     .Concat((from x in sequence.Skip(1)
                              where x.CompareTo(sequence.First()) >= 0
                              select x).QuickSort());
    }

I tried making this more fun by using the fixed point generator which I mentioned previously, and converted that code into a Func<IEnumerable<int>, IEnumerable<int>> instead. Here's the code.

    public static Func<T, T> Fix<T>(Func<Func<T, T>, Func<T, T>> F)
    {
        return x => F(Fix(F))(x);
    }

    static void Main()
    {
        var quicksort = Fix<IEnumerable<int>>(
            qsort =>
            sequence =>
            !sequence.Any()
                ? sequence
                : qsort(from x in sequence.Skip(1)
                        where x.CompareTo(sequence.First()) < 0
                        select x)
                      .Concat(new[] { sequence.First() })
                      .Concat(qsort(from x in sequence.Skip(1)
                                    where x.CompareTo(sequence.First()) >= 0
                                    select x)));
    }
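Here's a quick way to exercise the generated delegate - a self-contained sketch combining the Fix method with the lambda above (using method syntax instead of query syntax for brevity):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    // The fixed point generator from the post.
    static Func<T, T> Fix<T>(Func<Func<T, T>, Func<T, T>> F)
    {
        return x => F(Fix(F))(x);
    }

    static void Main()
    {
        var quicksort = Fix<IEnumerable<int>>(
            qsort => sequence =>
                !sequence.Any()
                    ? sequence
                    : qsort(sequence.Skip(1).Where(x => x.CompareTo(sequence.First()) < 0))
                          .Concat(new[] { sequence.First() })
                          .Concat(qsort(sequence.Skip(1).Where(x => x.CompareTo(sequence.First()) >= 0))));

        // Prints: 1, 3, 4, 5, 8
        Console.WriteLine(string.Join(", ",
            quicksort(new[] { 5, 3, 8, 1, 4 }).Select(n => n.ToString()).ToArray()));
    }
}
```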

This is a perfect example of how you can use the fixed point generator to create your recursive functions on the fly. If you want to make it reusable, you can easily convert it to an extension method or a normal method. Personally, I feel that using the fixed point generator is more intuitive and slightly more readable as opposed to the extension method way of doing things.

One point to note is that the Concat extension method only accepts an IEnumerable<T>, not a single item. Wrapping the pivot as "new[] { sequence.First() }", as above, solves this cleanly without having to write your own extension method (which you could do too).
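If you did want your own extension method for that, a minimal sketch might look like this (ConcatItem is a hypothetical name, not part of the framework):

```csharp
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Lazily appends a single item to a sequence, so the pivot could be
    // written as left.QuickSort().ConcatItem(pivot).Concat(right.QuickSort()).
    public static IEnumerable<T> ConcatItem<T>(this IEnumerable<T> sequence, T item)
    {
        foreach (var x in sequence)
            yield return x;
        yield return item;
    }
}
```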

Crazy stuff eh?

The Entity Framework vs. The Data Access Layer (Part 0: Introduction)

So the million dollar question is: Does the Entity Framework replace the need for a Data Access Layer? If not, what should my Data Access Layer look like if I want to take advantage of the Entity Framework? In this multi-part series, I hope to explore my thoughts on this question. I don't think there is a single correct answer. Architecture is about trade-offs, and the choices you make will be based on your needs and context.

In this first post, I provide some background on the notion of a Data Access Layer as a frame of reference and, specifically, identify the key goals and objectives of a Data Access Layer.

While Martin Fowler didn't invent the pattern of layering in enterprise applications, his Patterns of Enterprise Application Architecture is a must read on the topic. Our goals for a layered design (which may often need to be traded off against each other) should include:

  • Changes to one part or layer of the system should have minimal impact on other layers of the system. This reduces the effort involved in unit testing, debugging, and fixing bugs, and in general makes the architecture more flexible.
  • Separation of concerns between user interface, business logic, and persistence (typically in a database) also increases flexibility, maintainability and reusability.
  • Individual components should be cohesive and unrelated components should be loosely coupled. This should allow layers to be developed and maintained independently of each other using well-defined interfaces.

Now to be clear, I'm talking about a layer, not a tier. A tier is a node in a distributed system, which may include one or more layers. When I refer to a layer, I'm referring only to the logical separation of code that serves a single concern such as data access. It may or may not be deployed into a separate tier from the other layers of a system. We could then begin to fly off on tangential discussions of distributed systems and service oriented architecture, but I will do my best to keep this discussion focused on the notion of a layer. There are several layered application architectures, but almost all of them in some way include the notion of a Data Access Layer (DAL). The design of the DAL will be influenced by whether the application architecture distributes the DAL into a separate tier.

In addition to the goals of any layer mentioned above, there are some design elements specific to a Data Access Layer that are common to many layered architectures:

  • A DAL in our software provides simplified access to data that is stored in some persisted fashion, typically a relational database. The DAL is utilized by other components of our software so those other areas of our software do not have to be overly concerned with the complexities of that data store.
  • In object or component oriented systems, the DAL typically will populate objects, converting rows and their columns/fields into objects and their properties/attributes. This allows the rest of the software to work with data in an abstraction that is most suitable to it.
  • A common purpose of the DAL is to provide a translation between the structure or schema of the store and the desired abstraction in our software. As is often the case, the schema of a relational database is optimized for performance and data integrity (i.e. 3rd normal form), but this structure does not always lend itself well to the conceptual view of the real world or the way a developer may want to work with the data in an application. A DAL should serve as a central place for mapping between these domains, so as to increase the maintainability of the software and provide isolation between changes in the storage schema and/or the domain of the application software. This may include the marshalling or coercing of differing data types between the store and the application software.
  • Another frequent purpose of the DAL is to provide independence between the application logic and the storage system itself, such that if required, the storage engine could be switched with an alternative with minimal impact to the application layer. This is a common scenario for commercial software products that must work with different vendors' database engines (i.e. MS SQL Server, IBM DB/2, Oracle, etc.). With this requirement, sometimes alternate DALs are created for each store that can be swapped out easily. This is closely related to the notion of Persistence Ignorance, where the application's objects carry no knowledge of how (or whether) they are persisted.
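As a concrete (and deliberately simplified) sketch of these goals, the DAL boundary might be expressed as an interface like the following. The names here are illustrative only, not from any particular framework:

```csharp
using System.Collections.Generic;

// Domain object: a plain CLR class with no knowledge of how it is stored.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// The DAL boundary. The rest of the application programs against this
// interface; a SQL Server, Oracle, or in-memory implementation can be
// swapped in without touching the application layer.
public interface ICustomerRepository
{
    Customer GetById(int id);
    IList<Customer> GetAll();
    void Save(Customer customer);
    void Delete(Customer customer);
}
```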

Getting a little more concrete, there are a host of other issues that also need to be considered in the implementation of a DAL:

  • How will database connections be handled? How will their lifetimes be managed? A DAL will have to consider the security model. Will individual users connect to the database using their own credentials? This may be fine in a client-server architecture where the number of users is small. It may even be desirable in those situations where there is business logic and security enforced in the database itself through the use of stored procedures, triggers, etc. It may, however, run counter to the scalability requirements of a public-facing web application with thousands of users. In these cases, a connection pool may be the desired approach.
  • How will database transactions be handled? Will there be explicit database transactions managed by the data access layer, or will automatic or implicit transaction management systems such as COM+ automatic transactions or the Distributed Transaction Coordinator be used?
  • How will concurrent access to data be managed? Most modern application architectures rely on optimistic concurrency to improve scalability. Will it be the DAL's job to manage the original state of a row in this case? Can we take advantage of SQL Server's rowversion (timestamp) column, or do we need to track every single column?
  • Will we be using dynamic SQL or stored procedures to communicate with our database?
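To make the optimistic concurrency point concrete, here is a toy in-memory sketch of the version check a DAL performs. This is illustrative only; a real DAL would typically compare SQL Server's rowversion column in the UPDATE statement's WHERE clause rather than use an in-memory dictionary:

```csharp
using System;
using System.Collections.Generic;

// A row with a version stamp, standing in for a table with a rowversion column.
public class Row
{
    public string Value;
    public int Version;
}

// Toy store showing the optimistic concurrency check a DAL would perform.
public class VersionedStore
{
    private readonly Dictionary<int, Row> rows = new Dictionary<int, Row>();

    public void Insert(int id, string value)
    {
        rows[id] = new Row { Value = value, Version = 0 };
    }

    // Returns a snapshot, including the version the caller read.
    public Row Get(int id)
    {
        var r = rows[id];
        return new Row { Value = r.Value, Version = r.Version };
    }

    // The update succeeds only if no one else changed the row since it was read.
    public void Update(int id, string newValue, int expectedVersion)
    {
        var row = rows[id];
        if (row.Version != expectedVersion)
            throw new InvalidOperationException("Concurrency conflict: the row was modified by another user.");
        row.Value = newValue;
        row.Version++;
    }
}
```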

As you can see, there is much to consider in generic terms alone, well before we start looking at specific business scenarios and the wacky database schemas that are out in the wild. All of these things can and should influence the design of your data access layer and the technology you use to implement it. In terms of .NET, the Entity Framework is just one data access technology. MS has been so kind as to bless us with many others, such as LINQ to SQL, DataReaders, DataAdapters & DataSets, and SQLXML. In addition, there are over 30 third-party Object Relational Mapping tools available to choose from.

Ok, so if you're not familiar with the design goals of the Entity Framework (EF), you can read all about it here or watch a video interview on Channel 9 with Pablo Castro, Britt Johnson, and Michael Pizzo. A year after that interview, they did a follow-up interview here.

In the next post, I'll explore the idea of the Entity Framework replacing my data access layer and evaluate how this choice rates against the various objectives above. I'll then continue to explore alternative implementations for a DAL using the Entity Framework.

DevTeach is coming to Toronto in just a few more weeks!

I copied this blog post from Chris Dufour's blog. Chris copied it from Rob Windsor's blog. Rob copied it from Julie Lerman's blog. Julie copied it from Guy Barrette's blog. I'm sure someone will copy it from me and add their own little flair.

If you live in Toronto and don't attend DevTeach, Guy Barrette is going to make you code in Clipper for the remainder of your career (BTW - that's Clipper development with no Multi Edit and no Norton Guides. Ah, there, now you're shaking).  Seriously, DevTeach has a great lineup of speakers from Toronto and across the globe. Where else can you hear, see, touch, feel, talk to, describe your problems (IT/Dev related or not) and have a beer with these guys/gals?

And that's only half of them!!!

Need more reasons?

Keynote by Scott Hanselman
Scott is one of the most prolific, renowned and respected bloggers (
http://www.hanselman.com) and podcasters (http://www.hanselminutes.com) in the .NET world. Scott is a hands-on thinker, a renowned speaker and writer. He has written a few books, most recently with Bill Evjen and Devin Rader on Professional ASP.NET. In July 2007, he joined Microsoft as a Senior Program Manager in the Developer Division. In his new role he'll continue to explore and explain a broad portfolio of technologies, both inside and outside Microsoft. He aims to spread the good word about developing software, most often on the Microsoft stack. Before this he was the Chief Architect at Corillian Corporation, now a part of Checkfree, for 6+ years and before that he was a Principal Consultant at STEP Technology for nearly 7 years.

Silverlight 2.0 Workshop
For the first time an independent conference is having a workshop on Building Business Applications with Silverlight 2.0.  Join Rod Paddock and Jim Duffy as they give you a head start down the road to developing business-oriented Rich Internet Applications (RIA) with Microsoft Silverlight 2.0. Microsoft Silverlight 2.0 is a cross-browser, cross-platform, and cross-device plug-in positioned to revolutionize the way next generation Rich Internet Applications are developed. Microsoft’s commitment to providing an extensive platform for developers and designers to collaborate on creating the next generation of RIAs is very clear and its name is Silverlight 2.0. In this intensive, full-day workshop, Rod and Jim will share their insight and experience building business applications with Silverlight 2.0 including a review of some of the Internet’s more visible Silverlight web applications. This workshop is happening on Friday May 16 at the Hilton Toronto.

Bonus session: .NET Rocks hosts a panel May 14th at 18:00
This year the bonus session (Wednesday May 14 at 18:00) will be a panel of speakers debating the future of .NET. Where is .NET going? How will new development influence .NET and be influenced by .NET? Join Carl Franklin and Richard Campbell from .NET Rocks as they moderate a discussion on the future directions of .NET. The panelists include individuals who have strong visions of the future of software development and the role that .NET can play in that future. Attend this session and bring your questions to get some insight into the potential future of .NET! This bonus session is free for everyone. The panelists are Ted Neward, Oren Eini, and Scott Bellware.

Party with Palermo, DevTeach Toronto Edition
Jeffrey Palermo (MVP) is hosting his acclaimed "Party with Palermo" on Monday, May 12th in Toronto. This is the official social event kicking off DevTeach Toronto. The event is not just for the attendees of the conference; it's a free event for everyone. It's a unique chance for the attendees, speakers, and locals to meet and talk over a free beer. The event will be held at the Menage club, and you need to RSVP to attend. Get all the details at this link:

DevTeach Toronto is going to be a lot of fun and a great chance to learn from the best minds in the industry.  Register now - you don't want to miss out.

Developing Streaming Pipeline Components - Part 3

After covering the "why" of streaming design and some comparisons with in-memory designs in Part 1, and then some technical differences in regards to moving processing logic from the pipeline components to the readers and streams they use in Part 2, this post will finally begin to dig into how streaming processing is achieved in BizTalk.


By now from the last post (and the intro!) you will be aware that in streaming components the processing of the message is performed in the stream given to BizTalk.  I also mentioned that "Custom streams wrap other streams.  The inner stream becomes the source of data for the outer stream".  However, in order to perform some sort of processing of the data within the stream, we need to actually pay attention to (or "watch") the data that is moving through it so we know when we want to modify the data, or promote the data, or do whatever it is we want to do based on the data.

Introducing the XmlReader

This is where the XmlReader comes into play.  The XmlReader is a cursor-based object that provides a forward-only means of accessing XML data streams.  Because the cursor moves one node at a time through the XML data, it is the perfect class for basing streaming designs upon - this essentially implies that only that node's details are loaded into memory at any given time.  The other benefit of the XmlReader is that it is very expressive about the node currently under the reader - you advance the reader using its methods, and its properties reflect the data it is positioned over.  Because of this expressiveness, the XmlReader is also excellent for watching the data passing through it and making clear decisions about when you are reading a piece of data that is particularly interesting for the task you are trying to achieve.  When this occurs, you can implement logic to perform your task - whether it be to modify the data, promote a value to the message context, or whatever it happens to be.  (I will go into the details of making streaming modifications to data, and other processing that you may need to do in a pipeline component, in a future post.)

Due to the advantages the XmlReader offers us, it is pretty much always the class that is used in BizTalk pipelines to read incoming data that is in XML format.  However, the XmlReader only allows consuming XML data.  If our pipeline component hooks up some custom XmlReader to the inbound stream, the only thing we can do with that data is consume it.  What happens to the next pipeline component in the pipeline?  What data is going to be provided to it if the XmlReader our component used just consumed the entire stream?  Well, with what I have covered so far, it is basically going to get a reference to a stream that is already positioned at "end of stream".  Therefore, even if we managed to promote an XML node we were looking for in the data to the message context by implementing some custom logic with the XmlReader, we have essentially lost the message body!  We need to somehow be able to read the inbound stream with an XmlReader, but also maintain the data we are reading so the next component in the pipeline is also passed the XML message.

Introducing the XmlWriter, and coupling it with XmlReader

This issue is overcome by "coupling" or "chaining" the XmlReader with the XmlWriter.  The XmlWriter is for producing XML data.  (I will not go into great detail about the API of the XmlReader and XmlWriter as I assume most developers have some experience with them, and they are well documented on MSDN.)  Like the XmlReader, it is also very expressive about the data that you wish to produce with it.  It has a number of methods for writing very specific parts of XML data.  Therefore, due to the expressiveness of both the XmlReader and the XmlWriter, the node you are currently positioned over with the XmlReader has a known method which can be called on the XmlWriter to "reproduce" the XML that was consumed by the reader.  E.g. if the XmlReader is currently on an element (the reader.NodeType property equals XmlNodeType.Element), you can call XmlWriter.WriteStartElement, passing the element's details from the reader - reader.Prefix, reader.LocalName, reader.NamespaceURI.

The following code snippet can be used to completely consume an XmlReader, but it will re-generate each node from the reader via the XmlWriter as it is read.  You can try this with any XML file - just make sure your XmlReader variable is called reader and your XmlWriter variable is called writer.  Try it and see.  This concept is absolutely key to how BizTalk processes messages in a streaming manner.  Also note that because the data is re-created as it is read, it is essentially moving from the inbound stream to the outbound stream with very minimal memory consumption.

    while (reader.Read())
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                // Write out the element declaration
                writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                // Copy the attributes of the reader
                writer.WriteAttributes(reader, true);
                // If the element is empty (e.g. "<Test/>") then write the end immediately
                if (reader.IsEmptyElement)
                    writer.WriteEndElement();
                break;

            case XmlNodeType.Text:
                writer.WriteString(reader.Value);
                break;

            case XmlNodeType.CDATA:
                writer.WriteCData(reader.Value);
                break;

            case XmlNodeType.EntityReference:
                writer.WriteEntityRef(reader.Name);
                break;

            case XmlNodeType.ProcessingInstruction:
            case XmlNodeType.XmlDeclaration:
                writer.WriteProcessingInstruction(reader.Name, reader.Value);
                break;

            case XmlNodeType.Comment:
                writer.WriteComment(reader.Value);
                break;

            case XmlNodeType.DocumentType:
                // Get the public and system IDs to pass to the WriteDocType method
                writer.WriteDocType(reader.Name, reader.GetAttribute("PUBLIC"), reader.GetAttribute("SYSTEM"), reader.Value);
                break;

            case XmlNodeType.Whitespace:
            case XmlNodeType.SignificantWhitespace:
                writer.WriteWhitespace(reader.Value);
                break;

            case XmlNodeType.EndElement:
                // Force a closing element (e.g. "</Test>") by calling WriteFullEndElement rather than WriteEndElement
                writer.WriteFullEndElement();
                break;
        }

        writer.Flush();
    }
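To try the copy loop outside of BizTalk, you can wire a reader and writer up like this - a standalone sketch where, for simplicity, the source is a string rather than an inbound message stream, and the loop is trimmed to the node types present in the sample document:

```csharp
using System;
using System.IO;
using System.Xml;

class CopyLoopDemo
{
    static void Main()
    {
        const string inputXml = "<Root><Test/><Item>value</Item></Root>";

        var output = new StringWriter();
        using (var reader = XmlReader.Create(new StringReader(inputXml)))
        using (var writer = XmlWriter.Create(output, new XmlWriterSettings { OmitXmlDeclaration = true }))
        {
            // A trimmed version of the copy loop above.
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                        writer.WriteAttributes(reader, true);
                        if (reader.IsEmptyElement)
                            writer.WriteEndElement();
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        writer.WriteFullEndElement();
                        break;
                }
            }
        }

        // The output is a node-for-node reproduction of the input.
        Console.WriteLine(output.ToString());
    }
}
```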


This explanation and code snippet detail how we can consume data and "keep" the data that we consumed.  But this still does not tie in with what BizTalk message parts require for their data - Streams.  Nor has it explained how we can get this "chunking" of XML to happen within BizTalk at the right time; i.e. how do we rig up the XmlReader and XmlWriter so we are only moving data from the inbound stream to the outbound stream when the data is required in the outbound stream?  As discussed in Part 2, data is only required in our outbound stream when something else reads it.  (For our component, the stream that is returned by our component is regarded as our outbound stream and the next component's inbound stream.)  If we push the entire inbound stream (or any data for that matter) into the outbound stream before anything needs to read it, we are most likely pushing that data into memory (assuming the XmlWriter is writing to a MemoryStream).  We could use the VirtualStream which ships with BizTalk, but this also writes to memory and then to disk once a size threshold has been passed.  The real issue is that we don't need to write data to the outbound stream until it really is needed (even if that stream does not consume memory) - otherwise we are not being efficient, in that we "double handle" the data by going via some form of storage (memory, disk, etc.).  What would be better is if we could simply hand the next stream its data directly when it wants it.

This leads into the next post topic - the design of the BizTalk stream stack.  This will cover how the issues in the last paragraph are overcome.  I think it should be the last topic that needs to be covered before looking into real world examples of how to do something useful with the data!


NOTE: VirtualStream ships both as source code in the SDK and as part of the runtime.  Most developers are not aware that you can simply use the out-of-the-box version - there is no need to compile the SDK one into your own library!

Singularity Source Code Released to CodePlex

From OSNews:

Microsoft has released source code from the Singularity research project onto Codeplex under an academic, non-commercial license. "The Singularity Research Development Kit is based on the Microsoft Research Singularity project. It includes source code, build tools, test suites, design notes, and other background materials. The Singularity RDK is for academic non-commercial use only and is governed by this license."

How cool is that? I'm going to download it and play with it! For those who don't already know, here's a description of what Singularity is.

Singularity is a research project focused on the construction of dependable systems through innovation in the areas of systems, languages, and tools. We are building a research operating system prototype (called Singularity), extending programming languages, and developing new techniques and tools for specifying and verifying program behavior.

Advances in languages, compilers, and tools open the possibility of significantly improving software. For example, Singularity uses type-safe languages and an abstract instruction set to enable what we call Software Isolated Processes (SIPs). SIPs provide the strong isolation guarantees of OS processes (isolated object space, separate GCs, separate runtimes) without the overhead of hardware-enforced protection domains. In the current Singularity prototype SIPs are extremely cheap; they run in ring 0 in the kernel’s address space.

Singularity uses these advances to build more reliable systems and applications. For example, because SIPs are so cheap to create and enforce, Singularity runs each program, device driver, or system extension in its own SIP. SIPs are not allowed to share memory or modify their own code. As a result, we can make strong reliability guarantees about the code running in a SIP. We can verify much broader properties about a SIP at compile or install time than can be done for code running in traditional OS processes. Broader application of static verification is critical to predicting system behaviour and providing users with strong guarantees about reliability.

Presentation: Overview of ASP.NET 3.5 Extensions

The presentation I did for the Toronto Code Camp, "Overview of ASP.NET 3.5 Extensions", is up on my site.

Download it here.

VS 2008 at The Movies, Feb 7, 2008 Toronto Paramount

Our designer is having a field day with this "at the Movies" theme for our upcoming review of Visual Studio 2008, being held Feb 7th from 8:30am-12:00pm at the Paramount in Toronto. Grab a copy of this movie poster before it gets "whacked" by the lawyers.

Hope to see you there. Check out all the details after this link.

Visual Studio, SQL Server, and Windows Server 2008 Launch Events in Toronto


On February 27 in Toronto, MS Canada is hosting the official launch of the above mentioned products. The event will be all day long and in addition to a keynote from COO Kevin Turner, there will be some great breakout tracks running in parallel for IT Professionals, Developers, IT Managers, and Architects.

The event will be held at the Direct Energy Centre downtown. Of course ObjectSharp will have a booth there with some great offers for both our Training and Professional Services along with some awesome prize raffles so please stop by.

Also make sure to stop by the experts area, where several MVPs and speakers will be able to answer your individual questions, including many of the MVPs from ObjectSharp.

You can also register for this event here along with all of the other cities and their events happening across Canada.

And don't forget, we're also doing a 1/2 day briefing for developers & architects on VS 2008 at the Paramount in Toronto on February 7th. You can view the details here.

Presentation: Acropolis and Composite Applications

The presentation I did for the Metro Toronto .NET User Group, "A Look into the Future of Creating Composite Applications", is up on my site.

Download it here.

Recursive Lambda with Fixed Point Generator

Reading Mads Torgersen's old blog post on Recursive Lambda Expressions, I decided it was time for me to seriously play with his code and try out the fixed point generator to create recursive lambda expressions (helped along by the fact that I was sick and bored).

Taking lessons from that blog post, I created a reusable fixed point generator which we can use for anything that needs recursion. Here's the code.

    public static Func<T, T> Fix<T>(Func<Func<T, T>, Func<T, T>> F)
    {
        return x => F(Fix<T>(F))(x);
    }

    public static Func<T, T, T> Fix<T>(Func<Func<T, T, T>, Func<T, T, T>> F)
    {
        return (x, n) => F(Fix<T>(F))(x, n);
    }

    public static Func<T1, T2, T1> Fix<T1, T2>(Func<Func<T1, T2, T1>, Func<T1, T2, T1>> F)
    {
        return (x, n) => F(Fix<T1, T2>(F))(x, n);
    }

    public static Func<T, T, T, T> Fix<T>(Func<Func<T, T, T, T>, Func<T, T, T, T>> F)
    {
        return (x, n, k) => F(Fix<T>(F))(x, n, k);
    }

The static Fix methods are the actual fixed point generator as described in Mads' blog. This simplifies the code and makes it reusable. The third overload, Fix<T1, T2>, shows how two different input types can be used; it's the one I will be using for the sine and cosine lambda expressions I've written.

    Func<double, double> factorial = Fix<double>(fac => x => x == 0 ? 1 : x * fac(x - 1));
    Func<double, int, double> sine = Fix<double, int>(sin => (x, n) => n == 0 ? x :
        sin(x, n - 1) + Math.Pow(-1, n) * Math.Pow(x, 2 * n + 1) / factorial(2 * n + 1));
    Func<double, int, double> cosine = Fix<double, int>(cos => (x, n) => n == 0 ? 1 :
        cos(x, n - 1) + Math.Pow(-1, n) * Math.Pow(x, 2 * n) / factorial(2 * n));

The factorial lambda expression is the same as in Mads' blog. I've created both the sine and cosine lambda expressions by learning from how the fixed point generator works. I have to keep factorial working on doubles instead of ints because the factorials involved (e.g. 21! when n = 10) would overflow an int and ruin the precision.

I used the Taylor series to generate the sine and cosine functions. x is the angle in radians, and n is the recursion depth (i.e. the deeper the recursion, the greater the precision).

The basic recursion is as follows:

    sine(x, 0) = x
    sine(x, n) = sine(x, n - 1) + (-1)^n * x^(2n + 1) / (2n + 1)!

    cosine(x, 0) = 1
    cosine(x, n) = cosine(x, n - 1) + (-1)^n * x^(2n) / (2n)!

So that's that! Plug in your lambda expression into the Fix static method and you're done, you've got recursion.

To further test whether it actually works, here's some simple test code:

    for (int i = 0; i < 20; i++)
    {
        Console.WriteLine(factorial(i));
    }

    Console.WriteLine(sine(Math.PI / 4, 10));
    Console.WriteLine(Math.Sin(Math.PI / 4));

    Console.WriteLine(cosine(Math.PI / 3, 10));
    Console.WriteLine(Math.Cos(Math.PI / 3));
    Console.ReadKey();

Here I check my sine and cosine output against the actual Math.Sin and Math.Cos methods from the framework. It works well with n = 10 or higher, achieving the same displayed precision as the framework.

That's that. I will play around with more of this, and create more methods out of this.

I tried generalizing the method itself to make it look better and easier to read, but the limits of generics in C# stopped me from going any further. Here's what I tried to do, but it failed, of course.

    // I need an operator- constraint to make this work, but generics do not support this
    public static Func<T, T> Factorial<T>() where T : struct, IEquatable<T>, operator-
    {
        return Fix<T>(fac => x => x.Equals(0) ? 1 : x * fac(x - 1));
    }

If that were successful, the code would have been reduced to simply this.

    Factorial(5)

Doesn't that look better? If only, but there must be some other way to do this.
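One workaround that does compile is to pass the arithmetic in as delegates instead of relying on an operator constraint. This is only a sketch of the idea - the caller has to supply the identity values and the operations that the constraint would otherwise have guaranteed:

```csharp
using System;

public static class Generalized
{
    public static Func<T, T> Fix<T>(Func<Func<T, T>, Func<T, T>> F)
    {
        return x => F(Fix(F))(x);
    }

    // Generic factorial: the caller supplies zero/one and the subtract/multiply
    // operations, which is what an "operator" constraint would have given us.
    public static Func<T, T> Factorial<T>(T zero, T one,
                                          Func<T, T, T> subtract,
                                          Func<T, T, T> multiply)
        where T : IEquatable<T>
    {
        return Fix<T>(fac => x =>
            x.Equals(zero) ? one : multiply(x, fac(subtract(x, one))));
    }
}
```

The definition site is more verbose, but the call site is close to the Factorial(5) we wanted: var factorial = Generalized.Factorial<double>(0, 1, (a, b) => a - b, (a, b) => a * b); and then factorial(5) gives 120.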