Well it's just 1 week since DevTeach came to Toronto for the first time. What a great conference and it was my pleasure to be involved as a speaker, track-chair and attendee. The conference organizer Jean-Rene Roy just sent me a note with some of the comments from the overall evaluations. If you didn't attend this year, here's some reasons why you may want to next year:
Great conference! I especially enjoyed the up and personal nature of the conference. I was able to talk with the presenters. I spent most of my time at the agile track. Having topics that are rarely dealt with at user groups was a bonus. I enjoyed all the sessions I attended. The venue was great and the attention to little details, e.g., afternoon ice cream was appreciated.
Jean-René, thank you SO MUCH for bringing DevTeach to Toronto. It was fantastic and I will go again. Your tech chairs did a great job choosing sessions for each track. While I especially enjoyed the Agile sessions, I attended something from each track and the variety was good.
An outstanding conference! All the speakers I saw were terrific — affable, down-to-earth, talented, incredibly knowledgeable. The sessions were entertaining as well as in-depth and honest — no BS, no company line. I also met many people and had many interesting and thought provoking discussions outside the classrooms, and came away with new knowledge, ideas and inspiration. “Training you can’t get anywhere else” is an understatement.
Most of the speakers tell us 'why' and 'so what' instead of 'how'. This is what I expected and is good for developer in the long run. Please let speakers know this is good.
This is an excellent conference. I feel I updated my skills intensively effectively during these 3 days. I believe it will become a key event in .net area.
DevTeach was an amazing experience, especially for first timers. It was a good way to network with people in the industry, learn new techniques, make friends and bring home stories.
This was my first DevTeach and if I have any say in the matter, won't be my last. I had a great time, the sessions that I attended were top notch for the most part. Jean-Rene and his team deserve a hugh pat on the back for their efforts. What-ever they're getting paid - isn't half enough
What can I say. You'll definitely see me next year. I hope its still in Toronto. This was one of the BEST training conferences I've been on in quite some time. The "take-away's" from all the sessions were astounding. My mind is still spinning. Anyway, great job, nice prizes, great orgranization, absolutely no negative thoughts or comments.
This was a fantastic experience, MUCH better information than what I got from TechEd last year. TechEd's information was very visionary, things I can talk about now but not use for a few years out. DevTeach taught me things and gave me ideas I can use NOW! I LOVE THAT! The presentators were awesome, professional and very gifted at presenting their material. The only suggestions I would make are to have hot food every day (cold cut sandwiches are fine, even suggested for people at the Pre/Post Con but not for the actual event). More evening sessions (like at TechEd). I would have liked to have seen a presentation on MSBuild. PS You should have a value for the drop down of NA for hotel and accomodations if you didn't stay at the hotel.
In Part 0: Introduction of this series after asking the question "Does the Entity Framework replace the need for a Data Access Layer?", I waxed lengthy about the qualities of a good data access layer. Since that time I've received a quite a few emails with people interested in this topic. So without further adieu, let's get down to the question at hand.
So let's say you go ahead and create an Entity Definition model (*.edmx) in Visual Studio and have the designer generate for you a derived ObjectContext class and an entity class for each of your tables, derived from EntityObject. This one to one table mapping to entity class is quite similar to LINQ to SQL but the mapping capabilities move well beyond this to support advanced data models. This is at the heart of why the EF exists: Complex Types, Inheritance (Table per Type, Table per Inheritance Hierarchy), Multiple Entity Sets per Type, Single Entity Mapped to Two Tables, Entity Sets mapped to Stored Procedures or mapping to a hand-crafted query, expressed as either SQL or Entity SQL. EF has a good story for a conceptual model over top of our physical databases using Xml Magic in the form of the edmx file - and that's why it exists.
So to use the Entity Framework as your data access layer, define your model and then let the EdmGen.exe tool do it's thing to the edmx file at compile time and we get the csdl, ssdl, and msl files - plus the all important code generated entity classes. So using this pattern of usage for the Entity Framework, our data access layer is complete. It may not be the best option for you, so let's explore the qualities of this solution.
To be clear, the assumption here is that our data access layer in this situation is the full EF Stack: ADO.NET Entity Client, ADO.NET Object Services, LINQ to Entities, including our model (edmx, csdl, ssdl, msl) and the code generated entities and object context. Somewhere under the covers there is also the ADO.NET Provider (SqlClient, OracleClient, etc.)
To use the EF as our DAL, we would simply execute code similar to this in our business layer.
var db = new AdventureWorksEntities();
var activeCategories = from category in db.ProductCategory
where category.Inactive != true
orderby category.Name
select category;
How Do "EF" Entities Fit In?
If you're following along, you're probably asking exactly where is this query code above being placed. For the purposes of our discussion, "business layer" could mean a business object or some sort of controller. The point to be made here is that we need to think of Entities as something entirely different from our Business Objects.
Entity != Business Object
In this model, it is up to the business object to ask the Data Access Layer to project entities, not business objects, but entities.
This is one design pattern for data access, but it is not the only one. A conventional business object that contains its own data, and does not separate that out into an entity can suffer from tight bi-directional coupling between the business and data access layer. Consider a Customer business object with a Load method. Customer.Load() would in turn instantiate a data access component, CustomerDac and call the CustomerDac's Load or Fill method. To encapsulate all the data access code to populate a customer business object, the CustomerDac.Load method would require knowledge of the structure the Customer business object and hence a circular dependency would ensue.
The workaround, if you can call it that, is to put the business layer and the data access layer in the same assembly - but there goes decoupling, unit testing and separation of concerns out the window.
Another approach is to invert the dependency. The business layer would contain data access interfaces only, and the data access layer would implement those interfaces, and hence have a reverse dependency on the business layer. Concrete data access objects are instantiated via a factory, often combined with configuration information used by an Inversion of Control container. Unfortunately, this is not all that easy to do with the EF generated ObjectContext & Entities.
Or, you do as the Entity Framework implies and separate entities from your business objects. If you've used typed DataSets in the past, this will seem familiar you to you. Substitute ObjectContext for SqlConnection and SqlDataAdapter, and the pattern is pretty much the same.
Your UI presentation layer is likely going to bind to your Entity classes as well. This is an important consideration. The generated Entity classes are partial classes and can be extended with your own code. The generated properties (columns) on an entity also have event handlers created for changing and changed events so you can also wire those up to perform some column level validation. Notwithstanding, you may want to limit your entity customizations to simple validation and keep the serious business logic in your business objects. One of these days, I'll do another blog series on handing data validation within the Entity Framework.
How does this solution stack up?
How are database connections managed?
Using the Entity Framework natively itself, the ObjectContext takes care of opening & closing connections for you - as needed when queries are executed, and during a call to SaveChanges. You can get access to the native ADO.NET connection if need be to share a connection with other non-EF data access logic. The nice thing however is that, for the most part, connection strings and connection management are abstracted away from the developer.
A word of caution however. Because the ObjectContext will create a native connection, you should not wait to let the garbage collector free that connection up, but rather ensure that you dispose of the ObjectContext either explicitly or with a using statement.
Are all SQL Queries centralized in the Data Access Layer?
By default the Entity Framework dynamically generates store specific SQL on the fly and therefore, the queries are not statically located in any one central location. Even to understand the possible queries, you'd have to walk through all of your business code that hits the entity framework to understand all of the potential queries.
But why would you care? If you have to ask that question, then you don't care. But if you're a DBA, charged with the job of optimizing queries, making sure that your tables have the appropriate indices, then you want to go to one central place to see all these queries and tune them if necessary. If you care strongly enough about this, and you have the potential of other applications (perhaps written in other platforms), then you likely have already locked down the database so the only access is via Stored Procedures and hence the problem is already solved.
Let's remind ourselves that sprocs are not innately faster than dynamic SQL, however they are easier to tune and you also have the freedom of using T-SQL and temp tables to do some pre-processing of data prior to projecting results - which sometimes can be the fastest way to generate some complex results. More importantly, you can revoke all permissions to the underlying tables and only grant access to the data via Stored Procedures. Locking down a database with stored procedures is almost a necessity if your database is oriented as a service, acting as an integration layer between multiple client applications. If you have multiple applications hitting the same database, and you don't use stored procedures - you likely have bigger problems.
In the end, this is not an insurmountable problem. If you are already using Stored Procedures, then by all means you can map those in your EDM. This seems like the best approach, but you could also embed SQL Server (or other provider) queries in your SSDL using a DefiningQuery.
Do changes in one part of the system affect others?
It's difficult to answer this question without talking about the possible changes.
Schema Changes: The conceptual model and the mapping flexibility, even under complex scenarios is a strength of the entity framework. Compared to other technologies on the market, with the EF, your chances are as good as they're going to get that a change in the database schema will have minimal impact on your entity model, and vice versa.
Database Provider Changes: The Entity Framework is database agnostic. It's provider model allows for easily changing from SQL Server, to Oracle, to My Sql, etc. via connection strings. This is very helpful for ISVs whose product must support running on multiple back-end databases.
Persistence Ignorance: What if the change you want in one part of the system is to change your ORM technology? Maybe you don't want to persist to a database, but instead call a CRUD web service. In this pure model, you won't be happy. Both your Entities and your DataContext object inherit from base classes in the Entity Framework's System.Data.Objects namespace. By making references to these, littered throughout your business layer, decoupling yourself from the Entity Framework will not be an easy task.
Unit Testing: This is only loosely related to the question, but you can't talk about PI without talking about Unit Testing. Because the generated entities do not support the use of Plain Old CLR Objects (POCO), this data access model is not easily mocked for unit testing.
Does the DAL simplify data access?
Dramatically. Compared to classic ADO.NET, LINQ queries can be used for typed results & parameters, complete with intelli-sense against your conceptual model, with no worries about SQL injection attacks.
As a bonus, what you do get is query composition across your domain model. Usually version 1.0 of a convention non-ORM data access layer provides components for each entity, each supporting crud behaviour. Consider a scenario where you need to show all of the Customers within a territory, and then you need to show the last 10 orders for each Customer. Now I'm not saying you'd do this, but what I've commonly seen is that while somebody might write a CustomerDac.GetCustomersByTerritory() method, and they might write an OrderDac.GetLastTenOrders(), they would almost never write a OrderDac.GetLastTenOrdersForCustomersInTerritory() method. Instead they would simply iterate over the collection of customers found by territory and call the GetLastTenOrders() over and over again. Obviously this is "good" resuse of the data access logic, however it does not perform very well.
Fortunately, through query composition and eager loading, we can cause the Entity Framework (or even LINQ to SQL) to use a nested subquery to bring back the last 10 orders for each customer in a given territory in a single round trip, single query. Wow! In a conventional data access layer you could, and should write a new method to do the same, but by writing yet another query on the order table, you'd be repeating the mapping between the table and your objects each time.
Layers, Schmayers: What about tiers?
EDM generated entity classes are not very tier-friendly. The state of an entity, whether it is modified, new, or to be delete, and what columns have changed is managed by the ObjectContext. Once you take an entity and serialize it out of process to another tier, it is no longer tracked for updates. While you can re-attach an entity that was serialized back into the data access tier, because the entity itself does not serialize it's changed state (aka diff gram), you can not easily achieve full round trip updating in a distributed system. There are techniques for dealing with this, but it is going to add some plumbing code between the business logic and the EF...and make you wish you had a real data access layer, or something like Danny Simmons' EntityBag (or a DataSet).
Does the Data Access Layer support optimistic concurrency?
Out of the box, yes, handily. Thanks to the ObjectContext tracking state, and the change tracking events injected into our code generated entity properties. However, keep in mind the caveat with distributed systems that you'll have more work to do if your UI is separated from your data access layer by one or more tiers.
How does the Data Access Layer support transactions?
Because the Entity Framework builds on top of ADO.NET providers, transaction management doesn't change very much. A single call to ObjectContext.SaveChanges() will open a connection, perform all inserts, updates, and deletes across all entities that have changed, across all relationships and all in the correct order....and as you can imagine in a single transaction. To make transactions more granular than that, call SaveChanges more frequently or have multiple ObjectContext instances for each unit of work in progress. To broaden the scope of a transaction, you can manually enlist using a native ADO.NET provider transaction or by using System.Transactions.
So the million dollar question is: Does the Entity Framework replace the need for a Data Access Layer? If not, what should my Data Access Layer look like if I want to take advantage of the Entity Framework? In this multi-part series, I hope to explore my thoughts on this question. I don't think there is a single correct answer. Architecture is about trade offs and the choices you make will be based on your needs and context.
In this first post, I first provide some background on the notion of a Data Access Layer as a frame of reference, and specifically, identify the key goals and objectives of a Data Access Layer.
While Martin Fowler didn't invent the pattern of layering in enterprise applications, his Patterns of Enterprise Application Architecture is a must read on the topic. Our goals for a layered design (which may often need to be traded off against each other) should include:
- Changes to one part or layer of the system should have minimal impact on other layers of the system. This reduces the maintenance involved in unit testing, debugging, and fixing bugs and in general makes the architecture more flexible.
- Separation of concerns between user interface, business logic, and persistence (typically in a database) also increases flexibility, maintainability and reusability.
- Individual components should be cohesive and unrelated components should be loosely coupled. This should allow layers to be developed and maintained independently of each other using well-defined interfaces.
Now to be clear, I'm talking about a layer, not a tier. A tier is a node in a distributed system, of which may include one or more layers. But when I refer to a layer, I'm referring only to the logical separation of code that serves a single concern such as data access. It may or may not be deployed into a separate tier from the other layers of a system. We could then begin to fly off on tangential discussions of distributed systems and service oriented architecture, but I will do my best to keep this discussion focused on the notion of a layer. There are several layered application architectures, but almost all of them in some way include the notion of a Data Access Layer (DAL). The design of the DAL will be influenced should the application architecture include the distribution of the DAL into a separate tier.
In addition to the goals of any layer mentioned above, there are some design elements specific to a Data Access Layer common to the many layered architectures:
- A DAL in our software provides simplified access to data that is stored in some persisted fashion, typically a relational database. The DAL is utilized by other components of our software so those other areas of our software do not have to be overly concerned with the complexities of that data store.
- In object or component oriented systems, the DAL typically will populate objects, converting rows and their columns/fields into objects and their properties/attributes. this allows the rest of the software to work with data in an abstraction that is most suitable to it.
- A common purpose of the DAL is to provide a translation between the structure or schema of the store and the desired abstraction in our software. As is often the case, the schema of a relational database is optimized for performance and data integrity (i.e. 3rd normal form) but this structure does not always lend itself well to the conceptual view of the real world or the way a developer may want to work with the data in an application. A DAL should serve as a central place for mapping between these domains such as to increase the maintainability of the software and provide an isolation between changes in the storage schema and/or the domain of the application software. This may include the marshalling or coercing of differing data types between the store and the application software.
- Another frequent purpose of the DAL is to provide independence between the application logic and the storage system itself such that if required, the storage engine itself could be switched with an alternative with minimal impact to the application layer. This is a common scenario for commercial software products that must work with different vendors' database engines (i.e. MS SQL Server, IBM DB/2, Oracle, etc.). With this requirement, sometimes alternate DAL's are created for each store that can be swapped out easily. This is commonly referred to as Persistence Ignorance.
Getting a little more concrete, there are a host of other issues that also need to be considered in the implementation of a DAL:
- How will database connections be handled? How will there lifetime be managed? A DAL will have to consider the security model. Will individual users connect to the database using their own credentials? This maybe fine in a client-server architecture where the number of users is small. It may even be desirable in those situations where there is business logic and security enforced in the database itself through the use of stored procedures, triggers, etc. It may however run incongruent to the scalability requirements of a public facing web application with thousands of users. In these cases, a connection pool may be the desired approach.
- How will database transactions be handled? Will there be explicit database transactions managed by the data access layer or will automatic or implied transaction management systems such as COM+ Automatic Transactions, the Distributed Transaction Coordinator be used?
- How will concurrent access to data be managed? Most modern application architecture's will rely on an optimistic concurrency to improve scalability. Will it be the DAL's job to manage the original state of a row in this case? Can we take advantage of SQL Server's row version timestamp column or do we need to track every single column?
- Will we be using dynamic SQL or stored procedures to communicate with our database?
As you can see, there is much to consider just in generic terms, well before we start looking at specific business scenarios and the wacky database schemas that are in the wild. All of these things can and should influence the design of your data access layer and the technology you use to implement it. In terms of .NET, the Entity Framework is just one data access technology. MS has been so kind to bless us with many others such as Linq To SQL, DataReaders, DataAdapters & DataSets, and SQL XML. In addition, there are over 30 3rd party Object Relational Mapping tools available to choose from.
Ok, so if you're not familiar with the design goals of the Entity Framework (EF) you can read all about it here or watch a video interview on channel 9, with Pablo Castro, Britt Johnson, and Michael Pizzo. A year after that interview, they did a follow up interview here.
In the next post, I'll explore the idea of the Entity Framework replacing my data access layer and evaluate how this choice rates against the various objectives above. I'll then continue to explore alternative implementations for a DAL using the Entity Framework.