Adjusting the Home Realm Discovery page in ADFS to support Email Addresses

Over on the Geneva forums a question was asked:

Does anyone have an example of how to change the HomeRealmDiscovery Page in ADFSv2 to accept an e-mail address in a text field and based upon that (actually the domain suffix) select the correct Claims/Identity Provider?

It's pretty easy to modify the HomeRealmDiscovery page, so I thought I'd give it a go.

Based on the question, two things need to be known: the email address and the home realm URI.  Then we need to translate the email address to a home realm URI and pass it on to ADFS.

This could be done a couple of ways: by keeping a list of email addresses and their related home realms, or by keeping a list of email domains and their related home realms.  For the sake of this being an example, let's do both.

I've created a simple SQL database with three tables:

[Image: database schema with the EmailAddress, Domain, and HomeRealm tables]

Each entry in the EmailAddress and Domain tables has a pointer to the home realm URI (you can find the schema in the zip file below).

Then I created a new ADFS web project and added a new entity model to it:

[Image: entity model generated from the database]

From there I modified the HomeRealmDiscovery page to do the check:

//------------------------------------------------------------
// Copyright (c) Microsoft Corporation.  All rights reserved.
//------------------------------------------------------------

using System;

using Microsoft.IdentityServer.Web.Configuration;
using Microsoft.IdentityServer.Web.UI;
using AdfsHomeRealm.Data;
using System.Linq;

public partial class HomeRealmDiscovery : Microsoft.IdentityServer.Web.UI.HomeRealmDiscoveryPage
{
    protected void Page_Init(object sender, EventArgs e)
    {
    }

    protected void PassiveSignInButton_Click(object sender, EventArgs e)
    {
        string email = txtEmail.Text;

        if (string.IsNullOrWhiteSpace(email))
        {
            SetError("Please enter an email address");
            return;
        }

        try
        {
            SelectHomeRealm(FindHomeRealmByEmail(email));
        }
        catch (ApplicationException)
        {
            SetError("Cannot find home realm based on email address");
        }
    }

    private string FindHomeRealmByEmail(string email)
    {
        using (AdfsHomeRealmDiscoveryEntities en = new AdfsHomeRealmDiscoveryEntities())
        {
            var emailRealms = from e in en.EmailAddresses where e.EmailAddress1.Equals(email) select e;

            if (emailRealms.Any()) // email address exists
                return emailRealms.First().HomeRealm.HomeRealmUri;

            // email address does not exist
            string domain = ParseDomain(email);

            var domainRealms = from d in en.Domains where d.DomainAddress.Equals(domain) select d;

            if (domainRealms.Any()) // domain exists
                return domainRealms.First().HomeRealm.HomeRealmUri;

            // neither email nor domain exist
            throw new ApplicationException();
        }
    }

    private string ParseDomain(string email)
    {
        if (!email.Contains("@"))
            return email;

        return email.Substring(email.IndexOf("@") + 1);
    }

    private void SetError(string p)
    {
        lblError.Text = p;
    }
}

 

If you compare this to the original code, you'll notice a few changes.  I removed the code that loaded the original home realm drop-down list, as well as the code that chose the home realm based on the drop-down's selected value.

You can find my code here: http://www.syfuhs.net/AdfsHomeRealm.zip

Entity Framework Many-to-Many Relationships

I was 0wn3d.  Entity Framework was kicking my butt around and I couldn’t figure out why it couldn’t recognise a many-to-many relationship in my database using an association table.  Just an error 3002 over and over…


I created my database first, and was generating my model from it.  I built up the database in Management Studio and a Visual Studio Database project, and I was all set to create my entity model.  This is what I was shown as the model:

[Image: broken model]

That’s not right!  I wanted a many-to-many relationship between Trip and Location!

After digging around on the internet, the entity model, and my database schema, I discovered a small error in the database that I missed:

[Image: bad column definition]

I had only two columns with foreign keys to the related tables (storing additional data in the association table will cause other problems), but I had allowed one association column to allow NULLs.

For the entity framework to recognise an association table, the association columns cannot allow NULL values.

After regenerating the model, things looked a lot better…

[Image: partially fixed model]

…but it is not a many-to-many relationship.  I attempted to change the relationship to many-to-many, but that caused an error:

Error 3002: Problem in mapping fragments starting at line 139: Potential runtime violation of table Trips_Locations's keys (Trips_Locations.TripId): Columns (Trips_Locations.TripId) are mapped to EntitySet Trips_Locations's properties (Trips_Locations.Trips.Id) on the conceptual side but they do not form the EntitySet's key properties (Trips_Locations.Locations.Id, Trips_Locations.Trips.Id).

This error is basically telling me that my model does not match my database.  After digging some more, I realised that there was another error in the database schema: I constructed the primary key on only one association column!

In an association table, the primary key must consist of both association columns.

After correcting this error, I finally arrived at what I was looking for:

[Image: correct many-to-many model]
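With the model finally correct, the association disappears from code entirely.  Here's a minimal sketch of what working with it might look like – the context name (TripsEntities) and the exact member names are assumptions based on the model above, not taken from my actual project:

using System.Linq;

public static class ManyToManyExample
{
    public static void Run()
    {
        // TripsEntities is a hypothetical name for the generated context.
        using (var db = new TripsEntities())
        {
            var trip = db.Trips.First();

            // Many-to-many navigation: the Trips_Locations table never
            // appears in code, only the collections on either side.
            foreach (var location in trip.Locations)
            {
                // ...
            }

            // Associating is plain collection manipulation; EF writes the
            // Trips_Locations row itself during SaveChanges.
            trip.Locations.Add(db.Locations.First());
            db.SaveChanges();
        }
    }
}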

Getting the Data to the Phone

A few posts back I started talking about what it would take to create a new application for the new Windows Phone 7.  I’m not a fan of learning from trivial applications that don’t touch on the same technologies that I would be using in the real world, so I thought I would build a real application that someone can use.

Since this application uses a well-known dataset, I kind of get lucky because I already have my database schema, and it's reasonably well designed.  My first step is to get the data to the phone, so I will use WCF Data Services and an Entity Model.  I created the model and just imported the necessary tables.  I called this model RaceInfoModel.edmx, and the generated entity container is RaceInfoEntities.  This is ridiculously simple to do.

The next step is to expose the model to the outside world in an XML format through a Data Service.  I created a WCF Data Service and made a few config changes:

using System.Data.Services;
using System.Data.Services.Common;
using System;

namespace RaceInfoDataService
{
    // The generic type parameter was lost in the original formatting;
    // it is the entity container created above, RaceInfoEntities.
    public class RaceInfo : DataService<RaceInfoEntities>
    {
        public static void InitializeService(DataServiceConfiguration config)
        {
            if (config == null)
                throw new ArgumentNullException("config");

            config.UseVerboseErrors = true;
            config.SetEntitySetAccessRule("*", EntitySetRights.AllRead);
            //config.SetEntitySetPageSize("*", 25);
            config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
        }
    }
}

This too is reasonably simple.  Since it’s a web service, I can hit it from a web browser and I get a list of available datasets:

[Image: list of available datasets returned by the service]

This isn’t a complete list of available items, just a subset.

At this point I can package everything up and stick it on a web server.  It could technically be ready for production if you were satisfied with not having any access control on reading the data.  In this case, let's say for argument's sake that I was able to convince the powers that be that everyone should be able to access it.  There isn't anything confidential in the data, and we provide the data in other services anyway, so all is well.  Actually, that's kind of how I would prefer it anyway.  Give me Data or Give me Death!

Now we create the Phone project.  You need to install the latest build of the dev tools, and you can get that here http://developer.windowsphone.com/windows-phone-7/.  Install it.  Then create the project.  You should see:

[Image: new Windows Phone 7 project in Visual Studio]

The next step is to make the Phone application actually able to use the data.  Here it gets tricky.  Or really, here it gets stupid.  (It better be fixed by RTM or else *shakes fist*)

For some reason, the Visual Studio 2010 Phone 7 project type doesn’t allow you to automatically import services.  You have to generate the service proxy class manually.  It’s not that big a deal since my service won’t be changing all that much, but nevertheless it’s still a pain to regenerate it manually every time a change comes down the pipeline.  To generate the necessary class, run this at a command prompt:

cd C:\Windows\Microsoft.NET\Framework\v4.0.30319
DataSvcUtil.exe
     /uri:http://localhost:60141/RaceInfo.svc/
     /DataServiceCollection
     /Version:2.0
     /out:"PATH.TO.PROJECT\RaceInfoService.cs"

(Formatted to fit my site layout)

Include that file in the project and compile.

UPDATE: My bad, I had already installed the reference, so this won’t compile for most people.  The Windows Phone 7 runtime doesn’t have the System.Data.Services.Client namespace available that we need.  Therefore we need to install it ourselves…  It's still in development, so here is the CTP build http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=b251b247-70ca-4887-bab6-dccdec192f8d.

You should now have a compilable project with service references that looks something like:

[Image: project with the generated service reference included]

We have just connected our phone application to our database!  All told, it took me 10 minutes to do this.  Next up we start playing with the data.
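As a teaser for that next post, here is a rough sketch of what consuming the service from the phone might look like.  The Race entity and the Races entity set are hypothetical stand-ins for whatever the model actually exposes, and the async pattern is there because the phone only allows asynchronous network calls:

using System;
using System.Data.Services.Client;

public class RaceLoader
{
    // The generated context class takes the service root URI.
    private readonly RaceInfoEntities context =
        new RaceInfoEntities(new Uri("http://localhost:60141/RaceInfo.svc/"));

    private DataServiceCollection<Race> races;

    public void LoadRaces()
    {
        races = new DataServiceCollection<Race>(context);

        races.LoadCompleted += (sender, e) =>
        {
            if (e.Error == null)
            {
                // races is now populated and can be data-bound in the UI.
            }
        };

        // "Races" is a hypothetical entity set name; the load happens
        // asynchronously and fires LoadCompleted when done.
        races.LoadAsync(new Uri("/Races", UriKind.Relative));
    }
}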

Data as a Service and the Applications that consume it

Over the past few months I have seen quite a few really cool technologies released or announced, and I believe they have very real potential in many markets.  A lot of companies that exist outside the realm of software development rarely have the opportunity to use such technologies.

Take for instance the company I work for: Woodbine Entertainment Group.  We have a few different businesses, but as a whole our market is Horse Racing.  Our business is not software development.  We don’t always get the chance to play with or use some of the new technologies released to the market.  I thought this would be a perfect opportunity to see what it will take to develop a new product using only new technologies.

Our core customer pretty much wants Race information.  We have proof of this by the mere fact that on our two websites, HorsePlayer Interactive and our main site, we have dedicated applications for viewing Races.  So let's build a third race browser.  Since we already have a way of viewing races from your computer, let's build it on the new Windows Phone 7.

The Phone – The application

This seems fairly straightforward.  We will essentially be building a Silverlight application.  Let’s take a look at what we need to do (in no particular order):

  1. Design the interface – Microsoft has loads of guidance on following with the Metro design.  In future posts I will talk about possible designs.
  2. Build the interface – XAML and C#.  Gotta love it.
  3. Build the Business Logic that drives the views – I would prefer to stay away from this; suffice it to say I’m not entirely sure how proprietary this information is
  4. Build the Data Layer – Ah, the fun part.  How do you get the data from our internal servers onto the phone?  Easy, OData!

The Data

We have a massive database of all the Races on all the tracks that you can wager on through our systems.  The data updates every few seconds relative to changes from the tracks for things like cancellations or runner odds.  How do we push this data to the outside world for the phone to consume?  We create a WCF Data Service:

  1. Create an Entities Model of the Database
  2. Create a Data Service
  3. Add the entity reference to the Data Service (see code below)

    // Note: the generic type parameter was stripped by the original
    // formatting; it should be the ObjectContext type from the entity
    // model in step 1 (RaceInfoEntities, going by the follow-up post).
    public class RaceBrowserData : DataService<RaceInfoEntities>
    {
        public static void InitializeService(DataServiceConfiguration config)
        {
            if (config == null)
                throw new ArgumentNullException("config");

            config.UseVerboseErrors = true;
            config.SetEntitySetAccessRule("*", EntitySetRights.AllRead);
            //config.SetEntitySetPageSize("*", 25);
            config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
        }
    }

That’s actually all there is to it for the data.

The Authentication

The what?  Chances are the business will want to limit application access to only those who have accounts with us.  Especially so if we did something like add in the ability to place a wager on that race.  There are lots of ways to lock this down, but the simplest approach in this instance is to use a Security Token Service (STS).  I say this because we already have a user store and an STS, and duplication of effort is wasted effort.  We create an STS Relying Party (the application that connects to the STS):

  1. Go to STS and get Federation Metadata.  It’s an XML document that tells relying parties what you can do with it.  In this case, we want to authenticate and get available Roles.  This is referred to as a Claim.  The role returned is a claim as defined by the STS.  Somewhat inaccurately, we would do this:
    1. App: Hello! I want these Claims for this user: “User Roles”.  I am now going to redirect to you.
    2. STS: I see you want these claims, very well.  Give me your username and password.
    3. STS: Okay, the user passed.  Here are the claims requested.  I am going to POST them back to you.
    4. App: Okay, back to our own processes.
  2. Once we have the Metadata, we add the STS as a reference to the Application, and call a web service to pass the credentials.
  3. If the credentials are accepted, we get returned the claims we want, which in this case would be available roles.
  4. If the user has the role to view races, we go into the Race view.  (All users would have this role, but adding Roles is a good thing if we needed to distinguish between wagering and non-wagering accounts)

One thing I didn’t mention is how we lock down the Data Service.  That’s a bit more tricky, and more suited for another post on the actual Data Layer itself.

So far we have laid the ground work for the development of a Race Browser application for the Windows Phone 7 using the Entity Framework and WCF Data Services, as well as discussed the use of the Windows Identity Foundation for authentication against an STS.

With any luck (and permission), more to follow.

Generic Implementation of INotifyPropertyChanged on ADO.NET Data Services (Astoria) Proxies with T4 Code Generation

Last week Mike Flasko from the ADO.NET Data Services (Astoria) Team blogged about what’s coming in V1.5, which will ship prior to VS 2010. I applaud these out-of-band releases.

One of the new features is support for two-way data binding in the client library's generated proxy classes. These classes currently do not implement INotifyPropertyChanged, nor do they project into ObservableCollections out of the box.

Last week at the MVP Summit I had the chance to see a demo of this and other great things coming down the road from the broader Data Programmability Team. It seems like more and more teams are turning to T4 templates for code generation, which is great for our extensibility purposes. At first I was hopeful that the team had implemented these proxy generation changes by switching to T4 templates along with a corresponding “better” template.  Unfortunately, this is not the case and we won’t see any T4 templates in v1.5. It’s too bad – would it really have been that much more work to invest the time in implementing T4 templates than to add new switches to DataSvcUtil and new code generation (along with testing that code)?

Anyway, after seeing some other great uses of T4 templates coming from product teams for VS 2010, I thought I would invest some of my own time to see if I couldn’t come up with a way of implementing INotifyPropertyChanged all on my own. The problem with the existing code gen is that while there are partial methods created and called for each property setter (i.e. FoobarChanged() ), there is no generic event fired that would allow us to in turn raise an INotifyPropertyChanged.PropertyChanged event. So you could manually add this for each and every property on every class – but that’s tedious.

I couldn’t have been the first person to think of doing this, and after a bit of googling, I confirmed that. Alexey Zakharov’s post on generating custom proxies with T4 has been completely ripped off, er, inspirational in this derivative work. What I didn’t like about Alexy’s solution was that it completely over wrote the proxy client. I would have preferred a solution that just implemented the partial methods in a partial class to fire the PropertyChanged event. This way, any changes, improvements, etc. to the core MS codegen can still be expected down the road. Of course, Alexey’s template is a better solution if there are indeed other things that you want to customize about the template in its entirely should you find that what you need to accomplish can’t be done with a partial class.

What I did like about Alexey’s solution is that it uses the service itself to query the service metadata directly. I had planned on using reflection to accomplish the same thing, but in hindsight, it would be difficult to generate a partial class for a class I’m currently reflecting on in the same project (of course). Duh.

So what do you need to do to get this solution working?

  1. Add the MetadataHelper.tt file to the project where you have your reference/proxies to the data service. You will want to make sure there is no custom tool associated with this file – it’s just included as a reference in the next one. This file wraps up all the calls to get the metadata. I’ve made a couple of small changes to Alexey’s version: added support for Byte and Boolean (there was a typo in AZ’s).
  2. Copy the DataServiceProxy.tt file to the same project. If you have more than one data service, you’ll want one of these files for each reference. So for starters you may want to rename it accordingly. You are going to need to edit this bad boy as well.
  3. There are two options you’ll need to specify inside of the proxy template. The MetadataUri should be the URI to your service suffixed with $metadata. I’ve found that if your service is secured with integrated authentication, the metadata helper won’t pass those credentials along, so for the purposes of code generation you’d best leave anonymous access on. Second is the Namespace. You will want to use the same namespace used by your service reference. You might have to do a Show All Files and drill into the Reference.cs file to see exactly what that is.
    var options = new {
        MetadataUri = "http://localhost/ObjectSharpSample.Service/SampleDataService.svc/$metadata",
        Namespace = "ObjectSharp.SampleApplication.ServiceClient.DataServiceReference"
    };

That’s it. When you save your file, should everything work, you’ll have a .cs file generate that implements through a partial class an INotifyProxyChanged interface. Something like…..

public partial class Address : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private void OnPropertyChanged(string property)
    {
        var handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(property));
        }
    }

    partial void OnAddressIdChanged()
    {
        OnPropertyChanged("AddressId");
    }
    partial void OnAddressLine1Changed()
    {
        OnPropertyChanged("AddressLine1");
    }
}
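A quick sketch of the result in use – this assumes, as the DataSvcUtil-generated proxies do, that the generated property setter calls the OnAddressLine1Changed partial method implemented above:

using System;

public static class BindingDemo
{
    public static void Run()
    {
        var address = new Address();

        // Subscribe to the interface event implemented by the partial class.
        address.PropertyChanged += (sender, e) =>
            Console.WriteLine("Property changed: " + e.PropertyName);

        // The generated setter calls OnAddressLine1Changed, which in turn
        // raises PropertyChanged("AddressLine1").
        address.AddressLine1 = "123 Main St";
    }
}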

Follow up on Entity Framework talk at Tech Ed 2008

Last week at TechEd I gave a talk about building data access layers with the Entity Framework. I covered various approaches, from not having a data access layer at all to full encapsulation of the Entity Framework, and some hybrid approaches along the way.

I gave the first instance of this on Tuesday and then a repeat on Thursday.

To those who saw the first instance of this on Tuesday....

you unfortunately got an abbreviated and disjointed version for which I apologize. After I queued up my deck about 15 minutes prior to the talk, I left the room for a minute while people filed in. While I was out, one of the event staff shut down my deck and restarted it from a different folder on the recording machine, and didn't tell me. I was about 1/3rd into my presentation when I realized that I had the wrong version of the deck. At the time, I had no idea why this version of the deck was running, so I wasn't going to fumble around looking for the correct one. Given a change in the order of things, I'm not sure if changing decks at that point would have made things better or worse. I still had no idea why this had happened when I gave the talk again on Thursday, but the same thing almost happened again - this time I caught the event staff shutting down my deck and restarting it (from an older copy). Bottom line, sorry to those folks who saw the earlier version.

The complete deck and demo project is attached. It is a branch of the sample that is part of the Entity Framework Hands on Lab that was available at the conference, and which is included in the .NET 3.5 Enhancements (aka SP1) training kit. You will need the database for that project, which is not included in my download.

Download the training kit here.

 

Visual Studio 2008 SP1 Beta & SQL Server 2008

A quick heads up to let you know that the VS 2008 Service Pack 1 Beta is now available (links below). It typically takes a couple of months from this point before we'll see the final release.

This Service Pack includes some cool new features.

One interesting point is that MS is going to simultaneously ship SQL Server 2008 which actually has a hard dependency on SP1.

I thought I’d take a moment to highlight some new features that Dev’s would care about in SQL Server 2008.

  • Change Data Capture: Async “triggers” capture the before/after snapshot of row-level changes and write them to Change Tables that you can query in your app. They aren’t real triggers, as this asynchronously reads the transaction log.
  • Granular control of encryption, right through to the database level, without any application changes required.
  • Resource Governor – very helpful when you allow users to write ad hoc queries/reports against your OLTP database. Allows a DBA to assert resource limits & priorities.
  • Plan Freezing – allows you to lock down query plans to promote stable query plans across disparate hardware, server upgrades, etc.
  • New Date and Time data types – no longer just DateTime, where you have to manually parse out the date or time to get the part you actually want.
  • DateTimeOffset – a time-zone-aware datetime.
  • Table Value Parameters to procs – ever want to pass a result set as an arg to a proc? (A sketch of calling one from ADO.NET follows this list.)
  • HierarchyId – a new system type for storing nodes in a hierarchy…implemented as a CLR User Defined Type.
  • FileStream data type allows blob-ish data to be surfaced in the database but physically stored on the NTFS file system…with complete transactional consistency with the relational data and backup integration.
  • New geographic data support: store spatial data such as polygons, points and lines, and long/lat data types.
  • MERGE SQL statement allows you to insert, or update if a row already exists.
  • New Reporting Services features, such as access to reports from within Word & Excel and better SharePoint integration.
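Table Value Parameters are the one I can sketch from the client side. Here's a rough, hypothetical example of calling one from ADO.NET – the proc (dbo.GetOrdersByIds), the table type (dbo.IdList), and the connection string are all invented for illustration:

using System.Data;
using System.Data.SqlClient;

public static class TvpExample
{
    public static void Run()
    {
        // The rows to send; the schema must match the server-side
        // user-defined table type (hypothetical dbo.IdList, one int column).
        var ids = new DataTable();
        ids.Columns.Add("Id", typeof(int));
        ids.Rows.Add(1);
        ids.Rows.Add(2);

        using (var conn = new SqlConnection("Data Source=.;Initial Catalog=MyDb;Integrated Security=True"))
        using (var cmd = new SqlCommand("dbo.GetOrdersByIds", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;

            // The whole result set travels as a single structured parameter.
            var p = cmd.Parameters.AddWithValue("@Ids", ids);
            p.SqlDbType = SqlDbType.Structured;
            p.TypeName = "dbo.IdList";

            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // ...process each order row...
                }
            }
        }
    }
}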

Personally, I haven't spent any time with SQL Server 2008 yet, but that's a great set of new features that I can hardly wait to start using in real-world applications.

Downloads

· VS 2008 SP1 : http://download.microsoft.com/download/7/3/8/7382EA08-4DD6-4134-9B92-8585A5B07973/VS90sp1-KB945140-ENU.exe

· .NET 3.5 SP1 : http://download.microsoft.com/download/8/f/c/8fc1fe13-55de-4bf5-b43e-375daf01452e/dotNetFx35setup.exe

· Express 2008 with SP1:

o http://download.microsoft.com/download/F/E/7/FE754BA4-140B-413C-933F-8D35FB150F12/vbsetup.exe

o http://download.microsoft.com/download/F/E/7/FE754BA4-140B-413C-933F-8D35FB150F12/vcsetup.exe

o http://download.microsoft.com/download/F/E/7/FE754BA4-140B-413C-933F-8D35FB150F12/vcssetup.exe

o http://download.microsoft.com/download/F/E/7/FE754BA4-140B-413C-933F-8D35FB150F12/vnssetup.exe

· TFS 2008 SP1: http://download.microsoft.com/download/a/e/2/ae2eb0ff-e687-4221-9c3e-9165a942bc1c/TFS90sp1-KB949786.exe

Feedback Forum: http://go.microsoft.com/fwlink/?LinkId=119125

 

The Entity Framework vs. The Data Access Layer (Part 1: The EF as a DAL)

In Part 0: Introduction of this series, after asking the question "Does the Entity Framework replace the need for a Data Access Layer?", I waxed lengthy about the qualities of a good data access layer. Since that time I've received quite a few emails from people interested in this topic. So without further ado, let's get down to the question at hand.

So let's say you go ahead and create an Entity Definition model (*.edmx) in Visual Studio and have the designer generate for you a derived ObjectContext class and an entity class for each of your tables, derived from EntityObject. This one-to-one mapping of table to entity class is quite similar to LINQ to SQL, but the mapping capabilities move well beyond this to support advanced data models. This is at the heart of why the EF exists: Complex Types, Inheritance (Table per Type, Table per Inheritance Hierarchy), Multiple Entity Sets per Type, Single Entity Mapped to Two Tables, Entity Sets mapped to Stored Procedures, or mapping to a hand-crafted query, expressed as either SQL or Entity SQL. EF has a good story for a conceptual model over top of our physical databases, using XML magic in the form of the edmx file - and that's why it exists.

So to use the Entity Framework as your data access layer, define your model and then let the EdmGen.exe tool do its thing to the edmx file at compile time, and we get the csdl, ssdl, and msl files - plus the all-important code-generated entity classes. So using this pattern of usage for the Entity Framework, our data access layer is complete. It may not be the best option for you, so let's explore the qualities of this solution.

To be clear, the assumption here is that our data access layer in this situation is the full EF Stack: ADO.NET Entity Client, ADO.NET Object Services, LINQ to Entities, including our model (edmx, csdl, ssdl, msl) and the code generated entities and object context. Somewhere under the covers there is also the ADO.NET Provider (SqlClient, OracleClient, etc.)

[Image: the full Entity Framework stack]

To use the EF as our DAL, we would simply execute code similar to this in our business layer.

var db = new AdventureWorksEntities();
var activeCategories = from category in db.ProductCategory
                       where category.Inactive != true
                       orderby category.Name
                       select category;

How Do "EF" Entities Fit In?

If you're following along, you're probably asking exactly where the query code above is being placed. For the purposes of our discussion, "business layer" could mean a business object or some sort of controller. The point to be made here is that we need to think of Entities as something entirely different from our Business Objects.

Entity != Business Object

In this model, it is up to the business object to ask the Data Access Layer to project entities. Not business objects - entities.

This is one design pattern for data access, but it is not the only one. A conventional business object that contains its own data, and does not separate that out into an entity, can suffer from tight bi-directional coupling between the business and data access layers. Consider a Customer business object with a Load method. Customer.Load() would in turn instantiate a data access component, CustomerDac, and call the CustomerDac's Load or Fill method. To encapsulate all the data access code needed to populate a customer business object, the CustomerDac.Load method would require knowledge of the structure of the Customer business object, and hence a circular dependency would ensue.

The workaround, if you can call it that, is to put the business layer and the data access layer in the same assembly - but there goes decoupling, unit testing and separation of concerns out the window.

Another approach is to invert the dependency. The business layer would contain data access interfaces only, and the data access layer would implement those interfaces, and hence have a reverse dependency on the business layer. Concrete data access objects are instantiated via a factory, often combined with configuration information used by an Inversion of Control container. Unfortunately, this is not all that easy to do with the EF-generated ObjectContext & entities.
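To make the inversion concrete, here is a minimal sketch with hypothetical Customer and ICustomerDac types; the business layer owns the interface, and the data access layer references the business layer and implements it, so the dependency points the other way:

// Business layer assembly: owns the domain type and the interface.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public interface ICustomerDac
{
    Customer Load(int customerId);
}

// Data access layer assembly: references the business layer and
// implements its interface, so the dependency points "up".
public class CustomerDac : ICustomerDac
{
    public Customer Load(int customerId)
    {
        // ...query the database, then materialize a Customer...
        return new Customer { Id = customerId, Name = "..." };
    }
}

// A factory (or IoC container) wires the two together, so business
// code only ever sees ICustomerDac.
public static class DacFactory
{
    public static ICustomerDac CreateCustomerDac()
    {
        return new CustomerDac();
    }
}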

Or, you do as the Entity Framework implies and separate entities from your business objects. If you've used typed DataSets in the past, this will seem familiar to you. Substitute ObjectContext for SqlConnection and SqlDataAdapter, and the pattern is pretty much the same.

Your UI presentation layer is likely going to bind to your Entity classes as well. This is an important consideration. The generated Entity classes are partial classes and can be extended with your own code. The generated properties (columns) on an entity also have event handlers created for changing and changed events, so you can also wire those up to perform some column-level validation. Notwithstanding, you may want to limit your entity customizations to simple validation and keep the serious business logic in your business objects. One of these days, I'll do another blog series on handling data validation within the Entity Framework.

How does this solution stack up?

How are database connections managed?

Thumbs up: Using the Entity Framework natively, the ObjectContext takes care of opening and closing connections for you - as needed when queries are executed, and during a call to SaveChanges. You can get access to the native ADO.NET connection if need be, to share a connection with other non-EF data access logic. The nice thing, however, is that for the most part connection strings and connection management are abstracted away from the developer.

Thumbs down: A word of caution, however. Because the ObjectContext will create a native connection, you should not wait for the garbage collector to free that connection up, but rather ensure that you dispose of the ObjectContext either explicitly or with a using statement.
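In practice that just means wrapping the context, as in this small sketch reusing the AdventureWorks context from the earlier query:

using System.Linq;

public static class DisposalExample
{
    public static void Run()
    {
        // Dispose deterministically rather than waiting for the GC to
        // release the underlying native connection.
        using (var db = new AdventureWorksEntities())
        {
            var names = db.ProductCategory
                          .Select(c => c.Name)
                          .ToList();
        } // the context, and its connection, are released here
    }
}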

Are all SQL Queries centralized in the Data Access Layer?

Thumbs down: By default the Entity Framework dynamically generates store-specific SQL on the fly, and therefore the queries are not statically located in any one central location. Even to know what the possible queries are, you'd have to walk through all of your business code that hits the Entity Framework.

But why would you care? If you have to ask that question, then you don't care. But if you're a DBA, charged with the job of optimizing queries, making sure that your tables have the appropriate indices, then you want to go to one central place to see all these queries and tune them if necessary. If you care strongly enough about this, and you have the potential of other applications (perhaps written in other platforms), then you likely have already locked down the database so the only access is via Stored Procedures and hence the problem is already solved.

Let's remind ourselves that sprocs are not innately faster than dynamic SQL, however they are easier to tune and you also have the freedom of using T-SQL and temp tables to do some pre-processing of data prior to projecting results - which sometimes can be the fastest way to generate some complex results. More importantly, you can revoke all permissions to the underlying tables and only grant access to the data via Stored Procedures. Locking down a database with stored procedures is almost a necessity if your database is oriented as a service, acting as an integration layer between multiple client applications. If you have multiple applications hitting the same database, and you don't use stored procedures - you likely have bigger problems. 

In the end, this is not an insurmountable problem. If you are already using Stored Procedures, then by all means you can map those in your EDM. This seems like the best approach, but you could also embed SQL Server (or other provider) queries in your SSDL using a DefiningQuery.

Do changes in one part of the system affect others?

It's difficult to answer this question without talking about the possible changes.

Thumbs up - Schema Changes: The conceptual model and the mapping flexibility, even under complex scenarios, are a strength of the Entity Framework. Compared to other technologies on the market, with the EF your chances are as good as they're going to get that a change in the database schema will have minimal impact on your entity model, and vice versa.

Thumbs up - Database Provider Changes: The Entity Framework is database-agnostic. Its provider model allows for easily changing from SQL Server to Oracle to MySQL, etc., via connection strings. This is very helpful for ISVs whose product must support running on multiple back-end databases.

Thumbs down - Persistence Ignorance: What if the change you want in one part of the system is to change your ORM technology? Maybe you don't want to persist to a database, but instead call a CRUD web service. In this pure model, you won't be happy. Both your entities and your ObjectContext inherit from base classes in the Entity Framework's System.Data.Objects namespace. With references to these littered throughout your business layer, decoupling yourself from the Entity Framework will not be an easy task.

Thumbs down - Unit Testing: This is only loosely related to the question, but you can't talk about PI without talking about unit testing. Because the generated entities do not support the use of Plain Old CLR Objects (POCO), this data access model is not easily mocked for unit testing.

Does the DAL simplify data access?

Thumbs up: Dramatically. Compared to classic ADO.NET, LINQ queries can be used for typed results and parameters, complete with IntelliSense against your conceptual model, and with no worries about SQL injection attacks.

Thumbs up: As a bonus, what you do get is query composition across your domain model. Usually version 1.0 of a conventional non-ORM data access layer provides components for each entity, each supporting CRUD behaviour. Consider a scenario where you need to show all of the Customers within a territory, and then you need to show the last 10 orders for each Customer. Now I'm not saying you'd do this, but what I've commonly seen is that while somebody might write a CustomerDac.GetCustomersByTerritory() method, and they might write an OrderDac.GetLastTenOrders(), they would almost never write an OrderDac.GetLastTenOrdersForCustomersInTerritory() method. Instead they would simply iterate over the collection of customers found by territory and call GetLastTenOrders() over and over again. Obviously this is "good" reuse of the data access logic, however it does not perform very well.

Fortunately, through query composition and eager loading, we can cause the Entity Framework (or even LINQ to SQL) to use a nested subquery to bring back the last 10 orders for each customer in a given territory in a single round trip, single query. Wow! In a conventional data access layer you could, and should, write a new method to do the same, but by writing yet another query on the order table, you'd be repeating the mapping between the table and your objects each time.
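A hedged sketch of that composed query, using the AdventureWorks context from earlier with hypothetical Customers/Orders entity sets and invented property names:

using System.Linq;

public static class CompositionExample
{
    public static void Run()
    {
        using (var db = new AdventureWorksEntities())
        {
            var results = from c in db.Customers      // hypothetical set
                          where c.TerritoryId == 5    // invented property
                          select new
                          {
                              Customer = c,
                              LastTenOrders = c.Orders
                                  .OrderByDescending(o => o.OrderDate)
                                  .Take(10)
                          };

            // One store query, one round trip: the Take(10) is folded into
            // a nested subquery rather than N+1 separate order lookups.
            foreach (var row in results)
            {
                // ...bind or process...
            }
        }
    }
}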

Layers, Schmayers: What about tiers?

Thumbs down: EDM-generated entity classes are not very tier-friendly. The state of an entity - whether it is modified, new, or to be deleted, and which columns have changed - is managed by the ObjectContext. Once you take an entity and serialize it out of process to another tier, it is no longer tracked for updates. While you can re-attach an entity that was serialized back into the data access tier, because the entity itself does not serialize its changed state (aka diffgram), you cannot easily achieve full round-trip updating in a distributed system. There are techniques for dealing with this, but it is going to add some plumbing code between the business logic and the EF...and make you wish you had a real data access layer, or something like Danny Simmons' EntityBag (or a DataSet).

Does the Data Access Layer support optimistic concurrency?

Thumbs up: Out of the box, yes, handily - thanks to the ObjectContext tracking state, and the change-tracking events injected into our code-generated entity properties. However, keep in mind the caveat with distributed systems: you'll have more work to do if your UI is separated from your data access layer by one or more tiers.

How does the Data Access Layer support transactions?

Thumbs up: Because the Entity Framework builds on top of ADO.NET providers, transaction management doesn't change very much. A single call to ObjectContext.SaveChanges() will open a connection, perform all inserts, updates, and deletes across all entities that have changed, across all relationships, all in the correct order...and, as you can imagine, in a single transaction. To make transactions more granular than that, call SaveChanges more frequently or have multiple ObjectContext instances for each unit of work in progress. To broaden the scope of a transaction, you can manually enlist using a native ADO.NET provider transaction or by using System.Transactions.
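For broadening the scope, a minimal System.Transactions sketch - two contexts standing in for two units of work, both enlisting in one ambient transaction:

using System.Transactions;

public static class TransactionExample
{
    public static void SaveBothOrNothing(
        AdventureWorksEntities first, AdventureWorksEntities second)
    {
        using (var scope = new TransactionScope())
        {
            // Each SaveChanges enlists in the ambient transaction, so the
            // two units of work commit or roll back together.
            first.SaveChanges();
            second.SaveChanges();

            scope.Complete();
        }
    }
}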

Entity Framework Links for April, 2008

  • During the past month, Danny Simmons let us all officially know that SP1 of VS 2008/.NET Framework 3.5 will be the delivery mechanism for the Entity Framework and the Designer, and that we should see a beta of the entire SP1 very soon as well. No release dates yet.
  • Speaking of the next beta, there have been some improvements in the designer to support iterative development. Noam Ben-Ami talks about that here.
  • There is also a new ASP.NET EntityDataSource control coming in the next beta. Danny demo'd that at DevConnections, and Julie blogged about it here.
  • In April, Microsoft released the .NET 3.5 Enhancements Training Kit. This includes some preliminary labs on ASP.NET MVC, ASP.NET Dynamic Data, ASP.NET AJAX History, ASP.NET Silverlight controls, ADO.NET Data Services and last but certainly not least, the ADO.NET Entity Framework. Stay tuned for updates
  • Julie Lerman has created a spiffy pseudo-debug visualizer for Entity State. It's implemented as an extension method and not a true debug visualizer, but useful just the same.
  • Check out Ruurd Boeke's excellent post on Disconnected N-Tier objects using the Entity Framework. His sample solution is checked in to the EFContrib Project and he demonstrates using POCO classes, in his words "as persistence ignorant as I can get", serializing entities with no EF references on the clients, yet not losing full change tracking on the client - and using the same domain classes on the client and the server (one could argue this last point as not being a desirable goal - but it does have its place).

The Entity Framework vs. The Data Access Layer (Part 0: Introduction)

So the million dollar question is: does the Entity Framework replace the need for a Data Access Layer? If not, what should my Data Access Layer look like if I want to take advantage of the Entity Framework? In this multi-part series, I hope to explore my thoughts on this question. I don't think there is a single correct answer. Architecture is about trade-offs, and the choices you make will be based on your needs and context.

In this first post, I provide some background on the notion of a Data Access Layer as a frame of reference, and specifically identify the key goals and objectives of a Data Access Layer.

While Martin Fowler didn't invent the pattern of layering in enterprise applications, his Patterns of Enterprise Application Architecture is a must read on the topic. Our goals for a layered design (which may often need to be traded off against each other) should include:

  • Changes to one part or layer of the system should have minimal impact on other layers of the system. This reduces the maintenance involved in unit testing, debugging, and fixing bugs and in general makes the architecture more flexible.
  • Separation of concerns between user interface, business logic, and persistence (typically in a database) also increases flexibility, maintainability and reusability.
  • Individual components should be cohesive and unrelated components should be loosely coupled. This should allow layers to be developed and maintained independently of each other using well-defined interfaces.

Now to be clear, I'm talking about a layer, not a tier. A tier is a node in a distributed system, which may include one or more layers. But when I refer to a layer, I'm referring only to the logical separation of code that serves a single concern such as data access. It may or may not be deployed into a separate tier from the other layers of a system. We could then begin to fly off on tangential discussions of distributed systems and service-oriented architecture, but I will do my best to keep this discussion focused on the notion of a layer. There are several layered application architectures, but almost all of them in some way include the notion of a Data Access Layer (DAL). The design of the DAL will be influenced should the application architecture include the distribution of the DAL into a separate tier.

In addition to the goals of any layer mentioned above, there are some design elements specific to a Data Access Layer common to the many layered architectures:

  • A DAL in our software provides simplified access to data that is stored in some persisted fashion, typically a relational database. The DAL is utilized by other components of our software so those other areas of our software do not have to be overly concerned with the complexities of that data store.
  • In object- or component-oriented systems, the DAL typically will populate objects, converting rows and their columns/fields into objects and their properties/attributes. This allows the rest of the software to work with data in an abstraction that is most suitable to it.
  • A common purpose of the DAL is to provide a translation between the structure or schema of the store and the desired abstraction in our software. As is often the case, the schema of a relational database is optimized for performance and data integrity (i.e. 3rd normal form), but this structure does not always lend itself well to the conceptual view of the real world or the way a developer may want to work with the data in an application. A DAL should serve as a central place for mapping between these domains, so as to increase the maintainability of the software and provide isolation between changes in the storage schema and/or the domain of the application software. This may include the marshalling or coercing of differing data types between the store and the application software.
  • Another frequent purpose of the DAL is to provide independence between the application logic and the storage system itself, such that if required, the storage engine could be switched with an alternative with minimal impact to the application layer. This is a common scenario for commercial software products that must work with different vendors' database engines (i.e. MS SQL Server, IBM DB/2, Oracle, etc.). With this requirement, sometimes alternate DALs are created for each store that can be swapped out easily. This is commonly referred to as Persistence Ignorance.

Getting a little more concrete, there are a host of other issues that also need to be considered in the implementation of a DAL:

  • How will database connections be handled? How will their lifetime be managed? A DAL will have to consider the security model. Will individual users connect to the database using their own credentials? This may be fine in a client-server architecture where the number of users is small. It may even be desirable in those situations where there is business logic and security enforced in the database itself through the use of stored procedures, triggers, etc. It may however run incongruent to the scalability requirements of a public-facing web application with thousands of users. In these cases, a connection pool may be the desired approach.
  • How will database transactions be handled? Will there be explicit database transactions managed by the data access layer, or will automatic or implied transaction management systems such as COM+ Automatic Transactions or the Distributed Transaction Coordinator be used?
  • How will concurrent access to data be managed? Most modern application architectures will rely on optimistic concurrency to improve scalability. Will it be the DAL's job to manage the original state of a row in this case? Can we take advantage of SQL Server's row version timestamp column or do we need to track every single column?
  • Will we be using dynamic SQL or stored procedures to communicate with our database?

As you can see, there is much to consider just in generic terms, well before we start looking at specific business scenarios and the wacky database schemas that are in the wild. All of these things can and should influence the design of your data access layer and the technology you use to implement it. In terms of .NET, the Entity Framework is just one data access technology. MS has been so kind as to bless us with many others, such as LINQ to SQL, DataReaders, DataAdapters & DataSets, and SQL XML. In addition, there are over 30 third-party Object Relational Mapping tools available to choose from.

Ok, so if you're not familiar with the design goals of the Entity Framework (EF), you can read all about it here or watch a video interview on Channel 9, with Pablo Castro, Britt Johnson, and Michael Pizzo. A year after that interview, they did a follow-up interview here.

In the next post, I'll explore the idea of the Entity Framework replacing my data access layer and evaluate how this choice rates against the various objectives above. I'll then continue to explore alternative implementations for a DAL using the Entity Framework.