Nothing like some new terms

Get ready to hear about the 'Fabric controller'. In a cloud computer environment, the fabric controller is the "O/S". It is responsible for managing the services that you have configured.

Modeling your services

The first step is to model your service. This means to define the attributes related to the deployment and execution of your service. This includes the channels and endpoints for your service (do a quick look to WCF for the definition for these terms). As well, you define the security model by identifying the roles and groups. This information is persisted as the configuration for the service.

Development Tools

The developing and testing of the service can be done using familiar tools (i.e. VS2008). There is no need to deploy to the cloud in order to test. There is also no requirement that the application be written purely in managed code. This piece of information is a bit of a clue as to what is going on under the covers. In other words, there is probably Windows running someplace.

The development environment 'simulates' the cloud computing environment. Once the application is completed, it can be 'published' to Azure. The 'package' (the bin output for the project) and the configuration file is sent to Azure. After a few minutes, your application is running live on the cloud.

Simple. At least for the "Hello World" application. :)

Visual Studio and SQL Server 2008 Conflicts

I'm just passing along some information that has been making the rounds (I found it on Guy Burstein's blog).

If you attempt to install SQL Server 2008 on a machine that has Visual Studio 2008 installed, it will fail. The requirement is to have VS 2008 SP1 installed, an update that is still about a week away from release. And you need the 'real' SP1. The same problem exists with the beta for SP1.

So, for you developer/database people out there, it looks like at least a week of waiting to get the combination on a single machine.

Follow up on Entity Framework talk at Tech Ed 2008

Last week at TechEd I gave a talk about building data access layers with the Entity Framework. I covered various approaches from not having a data access layer at all, to fully encapsulation of the entity framework - and some hybrid approaches along the way.

I gave the first instance of this on Tuesday and then a repeat on Thursday.

To those who saw the first instance of this on Tuesday....

you unfortunately got an abbreviated and disjointed version for which I apologize. After I queued up my deck about 15 minutes prior to the talk I left the room for a minute while people filed in and while I was out, one of the event staff shutdown my deck and restarted it running from a different folder on the recording machine and didn't tell me. I was about 1/3rd into my presentation when I realized that I had the wrong version of the deck. At the time, I had no idea why this version of the deck was running so I wasn't going to fumble around looking for the correct one. Given a change in the order of things - I'm not sure if changing decks at that point would have made things better or worst. I still had no idea why this had happened when I gave the talk again on Thursday but when the same thing almost happened again - this time I caught the event staff shutting down my deck and restarting it again (from an older copy). Bottom line, sorry to those folks who saw the earlier version.

The complete deck and demo project is attached. It is a branch of the sample that is part of the Entity Framework Hands on Lab that was available at the conference and which is included in the .NET 3.5 Enhancements (aka SP1) training kit. You can will need the database for that project which is not included in my down.

Download the training kit here.

 

The Entity Framework vs. The Data Access Layer (Part 1: The EF as a DAL)

In Part 0: Introduction of this series after asking the question "Does the Entity Framework replace the need for a Data Access Layer?", I waxed lengthy about the qualities of a good data access layer. Since that time I've received a quite a few emails with people interested in this topic. So without further adieu, let's get down to the question at hand.

So let's say you go ahead and create an Entity Definition model (*.edmx) in Visual Studio and have the designer generate for you a derived ObjectContext class and an entity class for each of your tables, derived from EntityObject. This one to one table mapping to entity class is quite similar to LINQ to SQL but the mapping capabilities move well beyond this to support advanced data models. This is at the heart of why the EF exists: Complex Types, Inheritance (Table per Type, Table per Inheritance Hierarchy), Multiple Entity Sets per Type, Single Entity Mapped to Two Tables, Entity Sets mapped to Stored Procedures or mapping to a hand-crafted query, expressed as either SQL or Entity SQL. EF has a good story for a conceptual model over top of our physical databases using Xml Magic in the form of the edmx file - and that's why it exists.

So to use the Entity Framework as your data access layer, define your model and then let the EdmGen.exe tool do it's thing to the edmx file at compile time and we get the csdl, ssdl, and msl files - plus the all important code generated entity classes. So using this pattern of usage for the Entity Framework, our data access layer is complete. It may not be the best option for you, so let's explore the qualities of this solution.

To be clear, the assumption here is that our data access layer in this situation is the full EF Stack: ADO.NET Entity Client, ADO.NET Object Services, LINQ to Entities, including our model (edmx, csdl, ssdl, msl) and the code generated entities and object context. Somewhere under the covers there is also the ADO.NET Provider (SqlClient, OracleClient, etc.)

image

To use the EF as our DAL, we would simply execute code similar to this in our business layer.

var db = new AdventureWorksEntities();
var activeCategories = from category in db.ProductCategory
                 where category.Inactive != true
                 orderby
category.Name
                 select category;

How Do "EF" Entities Fit In?

If you're following along, you're probably asking exactly where is this query code above being placed. For the purposes of our discussion, "business layer" could mean a business object or some sort of controller. The point to be made here is that we need to think of Entities as something entirely different from our Business Objects.

Entity != Business Object

In this model, it is up to the business object to ask the Data Access Layer to project entities, not business objects, but entities.

This is one design pattern for data access, but it is not the only one. A conventional business object that contains its own data, and does not separate that out into an entity can suffer from tight bi-directional coupling between the business and data access layer. Consider a Customer business object with a Load method. Customer.Load() would in turn instantiate a data access component, CustomerDac and call the CustomerDac's Load or Fill method. To encapsulate all the data access code to populate a customer business object, the CustomerDac.Load method would require knowledge of the structure the Customer business object and hence a circular dependency would ensue.

The workaround, if you can call it that, is to put the business layer and the data access layer in the same assembly - but there goes decoupling, unit testing and separation of concerns out the window.

Another approach is to invert the dependency. The business layer would contain data access interfaces only, and the data access layer would implement those interfaces, and hence have a reverse dependency on the business layer. Concrete data access objects are instantiated via a factory, often combined with configuration information used by an Inversion of Control container. Unfortunately, this is not all that easy to do with the EF generated ObjectContext & Entities.

Or, you do as the Entity Framework implies and separate entities from your business objects. If you've used typed DataSets in the past, this will seem familiar you to you. Substitute ObjectContext for SqlConnection and SqlDataAdapter, and the pattern is pretty much the same.

Your UI presentation layer is likely going to bind to your Entity classes as well. This is an important consideration. The generated Entity classes are partial classes and can be extended with your own code. The generated properties (columns) on an entity also have event handlers created for changing and changed events so you can also wire those up to perform some column level validation. Notwithstanding, you may want to limit your entity customizations to simple validation and keep the serious business logic in your business objects. One of these days, I'll do another blog series on handing data validation within the Entity Framework.

How does this solution stack up?

How are database connections managed?

thumbs up Using the Entity Framework natively itself, the ObjectContext takes care of opening & closing connections for you - as needed when queries are executed, and during a call to SaveChanges. You can get access to the native ADO.NET connection if need be to share a connection with other non-EF data access logic. The nice thing however is that, for the most part, connection strings and connection management are abstracted away from the developer.

thumbs down A word of caution however. Because the ObjectContext will create a native connection, you should not wait to let the garbage collector free that connection up, but rather ensure that you dispose of the ObjectContext either explicitly or with a using statement.

Are all SQL Queries centralized in the Data Access Layer?

thumbs down By default the Entity Framework dynamically generates store specific SQL on the fly and therefore, the queries are not statically located in any one central location. Even to understand the possible queries, you'd have to walk through all of your business code that hits the entity framework to understand all of the potential queries.

But why would you care? If you have to ask that question, then you don't care. But if you're a DBA, charged with the job of optimizing queries, making sure that your tables have the appropriate indices, then you want to go to one central place to see all these queries and tune them if necessary. If you care strongly enough about this, and you have the potential of other applications (perhaps written in other platforms), then you likely have already locked down the database so the only access is via Stored Procedures and hence the problem is already solved.

Let's remind ourselves that sprocs are not innately faster than dynamic SQL, however they are easier to tune and you also have the freedom of using T-SQL and temp tables to do some pre-processing of data prior to projecting results - which sometimes can be the fastest way to generate some complex results. More importantly, you can revoke all permissions to the underlying tables and only grant access to the data via Stored Procedures. Locking down a database with stored procedures is almost a necessity if your database is oriented as a service, acting as an integration layer between multiple client applications. If you have multiple applications hitting the same database, and you don't use stored procedures - you likely have bigger problems. 

In the end, this is not an insurmountable problem. If you are already using Stored Procedures, then by all means you can map those in your EDM. This seems like the best approach, but you could also embed SQL Server (or other provider) queries in your SSDL using a DefiningQuery.

Do changes in one part of the system affect others?

It's difficult to answer this question without talking about the possible changes.

thumbs up Schema Changes: The conceptual model and the mapping flexibility, even under complex scenarios is a strength of the entity framework. Compared to other technologies on the market, with the EF, your chances are as good as they're going to get that a change in the database schema will have minimal impact on your entity model, and vice versa.

thumbs up Database Provider Changes: The Entity Framework is database agnostic. It's provider model allows for easily changing from SQL Server, to Oracle, to My Sql, etc. via connection strings. This is very helpful for ISVs whose product must support running on multiple back-end databases.

thumbs down Persistence Ignorance: What if the change you want in one part of the system is to change your ORM technology? Maybe you don't want to persist to a database, but instead call a CRUD web service. In this pure model, you won't be happy. Both your Entities and your DataContext object inherit from base classes in the Entity Framework's System.Data.Objects namespace. By making references to these, littered throughout your business layer, decoupling yourself from the Entity Framework will not be an easy task.

thumbs down Unit Testing: This is only loosely related to the question, but you can't talk about PI without talking about Unit Testing. Because the generated entities do not support the use of Plain Old CLR Objects (POCO), this data access model is not easily mocked for unit testing.

Does the DAL simplify data access?

thumbs up Dramatically. Compared to classic ADO.NET, LINQ queries can be used for typed results & parameters, complete with intelli-sense against your conceptual model, with no worries about SQL injection attacks.

thumbs up As a bonus, what you do get is query composition across your domain model. Usually version 1.0 of a convention non-ORM data access layer provides components for each entity, each supporting crud behaviour. Consider a scenario where you need to show all of the Customers within a territory, and then you need to show the last 10 orders for each Customer. Now I'm not saying you'd do this, but what I've commonly seen is that while somebody might write a CustomerDac.GetCustomersByTerritory() method, and they might write an OrderDac.GetLastTenOrders(), they would almost never write a OrderDac.GetLastTenOrdersForCustomersInTerritory() method. Instead they would simply iterate over the collection of customers found by territory and call the GetLastTenOrders() over and over again. Obviously this is "good" resuse of the data access logic, however it does not perform very well.

Fortunately, through query composition and eager loading, we can cause the Entity Framework (or even LINQ to SQL) to use a nested subquery to bring back the last 10 orders for each customer in a given territory in a single round trip, single query. Wow! In a conventional data access layer you could, and should write a new method to do the same, but by writing yet another query on the order table, you'd be repeating the mapping between the table and your objects each time.

Layers, Schmayers: What about tiers?

thumbs down EDM generated entity classes are not very tier-friendly. The state of an entity, whether it is modified, new, or to be delete, and what columns have changed is managed by the ObjectContext. Once you take an entity and serialize it out of process to another tier, it is no longer tracked for updates. While you can re-attach an entity that was serialized back into the data access tier, because the entity itself does not serialize it's changed state (aka diff gram), you can not easily achieve full round trip updating in a distributed system. There are techniques for dealing with this, but it is going to add some plumbing code between the business logic and the EF...and make you wish you had a real data access layer, or something like Danny Simmons' EntityBag (or a DataSet).

Does the Data Access Layer support optimistic concurrency?

thumbs up Out of the box, yes, handily. Thanks to the ObjectContext tracking state, and the change tracking events injected into our code generated entity properties. However, keep in mind the caveat with distributed systems that you'll have more work to do if your UI is separated from your data access layer by one or more tiers.

How does the Data Access Layer support transactions?

thumbs up Because the Entity Framework builds on top of ADO.NET providers, transaction management doesn't change very much. A single call to ObjectContext.SaveChanges() will open a connection, perform all inserts, updates, and deletes across all entities that have changed, across all relationships and all in the correct order....and as you can imagine in a single transaction. To make transactions more granular than that, call SaveChanges more frequently or have multiple ObjectContext instances for each unit of work in progress. To broaden the scope of a transaction, you can manually enlist using a native ADO.NET provider transaction or by using System.Transactions.

Entity Framework Links for April, 2008

  • During the past month, Danny Simmons let us all officially know that SP1 of VS 2008/.NET Framework 3.5 will be the delivery mechanism for the Entity Framework and the Designer, and that we should see a beta of the entire SP1 very soon as well. No release dates yet.
  • Speaking of the next beta, there have been some improvements in the designer to support iterative development. Noam Ben-Ami talks about that here.
  • There is also a new ASP.NET EntityDataSource control coming in the next beta. Danny demo'd that at DevConnections, and Julie blogged about it here.
  • In April, Microsoft released the .NET 3.5 Enhancements Training Kit. This includes some preliminary labs on ASP.NET MVC, ASP.NET Dynamic Data, ASP.NET AJAX History, ASP.NET Silverlight controls, ADO.NET Data Services and last but certainly not least, the ADO.NET Entity Framework. Stay tuned for updates
  • Julie Lerman has created a spiffy pseudo-debug visualizer for Entity State. It's implemented as an extension method and not a true debug visualizer, but useful just the same.
  • Check out Ruurd Boeke's excellent post on Disconnected N-Tier objects using the Entity Framework. His sample solution is checked in to the EFContrib Project and he demonstrates using POCO classes, in his words "as persistence ignorant as I can get", serializing entities with no EF references on the clients, yet not losing full change tracking on the client - and using the same domain classes on the client and the server (one could argue this last point as being not being a desirable goal - but it does have it's place).

The Entity Framework vs. The Data Access Layer (Part 0: Introduction)

So the million dollar question is: Does the Entity Framework replace the need for a Data Access Layer? If not, what should my Data Access Layer look like if I want to take advantage of the Entity Framework? In this multi-part series, I hope to explore my thoughts on this question. I don't think there is a single correct answer. Architecture is about trade offs and the choices you make will be based on your needs and context.

In this first post, I first provide some background on the notion of a Data Access Layer as a frame of reference, and specifically, identify the key goals and objectives of a Data Access Layer.

While Martin Fowler didn't invent the pattern of layering in enterprise applications, his Patterns of Enterprise Application Architecture is a must read on the topic. Our goals for a layered design (which may often need to be traded off against each other) should include:

  • Changes to one part or layer of the system should have minimal impact on other layers of the system. This reduces the maintenance involved in unit testing, debugging, and fixing bugs and in general makes the architecture more flexible.
  • Separation of concerns between user interface, business logic, and persistence (typically in a database) also increases flexibility, maintainability and reusability.
  • Individual components should be cohesive and unrelated components should be loosely coupled. This should allow layers to be developed and maintained independently of each other using well-defined interfaces.

Now to be clear, I'm talking about a layer, not a tier. A tier is a node in a distributed system, of which may include one or more layers. But when I refer to a layer, I'm referring only to the logical separation of code that serves a single concern such as data access. It may or may not be deployed into a separate tier from the other layers of a system. We could then begin to fly off on tangential discussions of distributed systems and service oriented architecture, but I will do my best to keep this discussion focused on the notion of a layer. There are several layered application architectures, but almost all of them in some way include the notion of a Data Access Layer (DAL). The design of the DAL will be influenced should the application architecture include the distribution of the DAL into a separate tier.

In addition to the goals of any layer mentioned above, there are some design elements specific to a Data Access Layer common to the many layered architectures:

  • A DAL in our software provides simplified access to data that is stored in some persisted fashion, typically a relational database. The DAL is utilized by other components of our software so those other areas of our software do not have to be overly concerned with the complexities of that data store.
  • In object or component oriented systems, the DAL typically will populate objects, converting rows and their columns/fields into objects and their properties/attributes. this allows the rest of the software to work with data in an abstraction that is most suitable to it.
  • A common purpose of the DAL is to provide a translation between the structure or schema of the store and the desired abstraction in our software. As is often the case, the schema of a relational database is optimized for performance and data integrity (i.e. 3rd normal form) but this structure does not always lend itself well to the conceptual view of the real world or the way a developer may want to work with the data in an application. A DAL should serve as a central place for mapping between these domains such as to increase the maintainability of the software and provide an isolation between changes  in the storage schema and/or the domain of the application software. This may include the marshalling or coercing of differing data types between the store and the application software.
  • Another frequent purpose of the DAL is to provide independence between the application logic and the storage system itself such that if required, the storage engine itself could be switched with an alternative with minimal impact to the application layer. This is a common scenario for commercial software products that must work with different vendors' database engines (i.e. MS SQL Server, IBM DB/2, Oracle, etc.). With this requirement, sometimes alternate DAL's are created for each store that can be swapped out easily.  This is commonly referred to as Persistence Ignorance.

Getting a little more concrete, there are a host of other issues that also need to be considered in the implementation of a DAL:

  • How will database connections be handled? How will there lifetime be managed? A DAL will have to consider the security model. Will individual users connect to the database using their own credentials? This maybe fine in a client-server architecture where the number of users is small. It may even be desirable in those situations where there is business logic and security enforced in the database itself through the use of stored procedures, triggers, etc. It may however run incongruent to the scalability requirements of a public facing web application with thousands of users. In these cases, a connection pool may be the desired approach.
  • How will database transactions be handled? Will there be explicit database transactions managed by the data access layer or will automatic or implied transaction management systems such as COM+ Automatic Transactions, the Distributed Transaction Coordinator be used?
  • How will concurrent access to data be managed? Most modern application architecture's will rely on an optimistic concurrency  to improve scalability. Will it be the DAL's job to manage the original state of a row in this case? Can we take advantage of SQL Server's row version timestamp column or do we need to track every single column?
  • Will we be using dynamic SQL or stored procedures to communicate with our database?

As you can see, there is much to consider just in generic terms, well before we start looking at specific business scenarios and the wacky database schemas that are in the wild. All of these things can and should influence the design of your data access layer and the technology you use to implement it. In terms of .NET, the Entity Framework is just one data access technology. MS has been so kind to bless us with many others such as Linq To SQL, DataReaders, DataAdapters & DataSets, and SQL XML. In addition, there are over 30 3rd party Object Relational Mapping tools available to choose from.

Ok, so if you're  not familiar with the design goals of the Entity Framework (EF) you can read all about it here or watch a video interview on channel 9, with Pablo Castro, Britt Johnson, and Michael Pizzo. A year after that interview, they did a follow up interview here.

In the next post, I'll explore the idea of the Entity Framework replacing my data access layer and evaluate how this choice rates against the various objectives above. I'll then continue to explore alternative implementations for a DAL using the Entity Framework.

Tech-Ed 2008 U.S. Developers Session Catalog Online

website_banner_dev

If you're thinking of heading down to Orlando June 3-6, 2008 - you can check out the list of sessions which has been recently posted here.

You also have the option of rating sessions that you might be interested in. I assume this is to help plan room sizes.

I'll be doing a breakout session called "Building Next Generation Data Access Layers with the ADO.NET Entity Framework" in the Architecture Track. I'm really looking forward to it. Hope to see you there.

ObjectSharp @ the Movies

On Feb 7th ObjectSharp hosted a free event at the Paramount Scotia Bank Theatre downtown. The morning after one of this winters largest snowfalls 160 people showed up for a morning of what's new in Visual Studio 2008. I did a piece on VSTS and TFS 2008. My slides are attached.

Even with the snow the event turnout was great, thanks to everyone who ventured out. I hope you enjoyed it as much as we did.

A few pictures were taken at the event you can view them here.

Hello ObjectSharp world!

I'm keeping a blog these days at http://www.robburke.net that has lots of .NET developer content, and I'm trying to syndicate the techie bits of that blog here.  Easier said, right?  Has anyone out there managed to get a Wordpress blog to cross-post to Community Server?

I recently posted some follow-up to ObjectSharp's VS2008 At the Movies event on that blog, so please head over there for more about what's new in WPF 3.5 and Visual Studio 2008.

[Update 13 Feb: I haven't managed to find any way to get the cross-posting working. Pint(s) of Guinness promised for anyone who can help me with this...]

Visual Studio 2008 and Windows Server 2008 for aspiring Architects

imageAre you an architect or an aspiring architect interested in learning what's new this year for the MS platform?

On February 11th, 2008 come to the MS Canada Office in Mississauga and visit yours truly and some other dazzling speakers to learn more about Visual Studio 2008 and Windows Server 2008.

You can choose either the morning or afternoon session as your schedule permits. Only a 1/2 day out of your busy schedule and you'll know everything you need! Ok, well maybe not everything, but I hope that you'll be inspired to take the next steps in learning about technologies such as LINQ, Windows Communication Foundation and Windows Presentation Foundation and how you can make the most of these technologies in your applications.

Register Here.