Versions in Services

One of my colleagues, John Lam, has been starting down the services road lately. We were kicking around some of the problems that can be encountered by SOA developers and John made an interesting comment. He said that the problems we were discussing had already been solved...in COM/DCOM.

This made me sit up and think. I'd been wrestling with how best to deal with versioning in services for a while now. So I asked him: how does COM handle versions? The answer I got surprised me. It doesn't. According to Chapter 3 of Essential COM (thank you, Mr. Box), new versions of a COM class are frequently given a different CLSID. This allows "clients to indicate explicitly which version is required." That holds even if the new version is simply an extension of the old one. In fact, there is a function (CoTreatAsClass) whose purpose is to route instantiation requests from the old CLSID to the new one.

In other words, there is no real 'versioning' in COM. Each version is a class unto itself. It just happens to have the same interface or an extension of same.

So let's apply this to SOA. In the .NET world, an interface is roughly the equivalent of an ASMX file. So to create a different version of a service, create a different ASMX file, copying/changing/adding the web methods as needed.

The real challenge is ensuring that the second tenet of service orientation (services are autonomous, something I've blogged about here and here) is adhered to. Each 'version' service must have a certain level of autonomy over the underlying data. It is important (actually, critical) to eliminate any 'side effects' when designing the different versions. If a client executing the new version of the service causes the old version of the service to break in any way, shape or form, then autonomy is violated and you will have trouble on your hands.

So there you have it. No versions of services. Instead, just create a new service that implements the modified interface. If nothing else, this is a good reason to implement the web service class itself using the Facade design pattern, thus keeping the content of the class to a minimum.
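To make the idea concrete, here is a minimal sketch of two 'versions' implemented as two separate services over a shared facade. The names (OrderFacade, OrderService, OrderService2) and the order-status logic are hypothetical, and the classes are plain C# so the sketch stands alone; in a real ASMX project each service class would derive from WebService and mark its methods with [WebMethod].

```csharp
using System;

// Hypothetical facade: the business logic lives here, not in the service classes.
public static class OrderFacade
{
    public static string GetStatus(int orderId)
    {
        // Placeholder logic, for illustration only.
        return orderId % 2 == 0 ? "Shipped" : "Pending";
    }

    public static string GetStatusWithCarrier(int orderId)
    {
        return GetStatus(orderId) + " via ExampleCarrier";
    }
}

// Stands in for the original .asmx service. Existing clients keep calling this.
public class OrderService
{
    public string GetOrderStatus(int orderId)
    {
        return OrderFacade.GetStatus(orderId);
    }
}

// Stands in for a second .asmx file: a new service, not a new version of the old one.
public class OrderService2
{
    public string GetOrderStatus(int orderId)
    {
        return OrderFacade.GetStatus(orderId);
    }

    public string GetOrderStatusWithCarrier(int orderId)
    {
        return OrderFacade.GetStatusWithCarrier(orderId);
    }
}
```

Because each service class is only a thin shell, adding OrderService2 does not touch OrderService at all, which is the point of keeping the class content to a minimum.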

Classes or XML Documents

One of the Four Basic Tenets of Service Orientation states that “services share schema and contract, not class”. This means the interface for a service should be based on an XML schema.  This runs counter to the interface for a managed class, where the type

Web Services News from TechEd

It has finally happened.  After what seems to be 10 years in the making (but what was probably only a year or so), Web Services Enhancements 2.0 has been released.  It can be downloaded here.

I've been using WSE in a limited role for a while now, mostly because it greatly simplified the attachment of certificates to a web service request. Beyond this, one of the other nice additions in this release is its implementation of WS-Policy/WS-SecurityPolicy. By using these standards, it becomes easy (as opposed to just possible with WSE 1.0) to send a secure, encrypted message and ensure that the message meets a certain level of security (the policy).

Look for some examples of how to use WSE 2.0 code in this venue over the next few days.

Objects and Web Services

I've been asked this question often enough, both on-line and in person, that I thought a blog post might be a better way to clear up a common misconception. Consider the following situation. Your application has implemented a class as follows:

public class Dog
{
   public string Breed;
   public string Fetch()
   {
      return "stick";
   }
}

As well, the application has a web service method defined like the following:

[WebMethod]
public Dog GetDog()
{
   return new Dog();
}

So let's move to the client side. In a typical .NET application, you would add a web reference pointing to the web service that includes the GetDog method just described. Under the covers, Visual Studio .NET invokes wsdl.exe to generate the proxy class. Because of the magic of this tool, not only is a proxy for the web service created, but so too is a class that represents the Dog class that is returned from GetDog. This means that, on the client side, there is no need to deploy the assembly that implements the Dog class on the server-side. Or does it?

If you open up the .cs file generated by adding the Web Reference (you might have to “show all files” in the Solution Explorer then navigate to below the Reference Map in the Web Reference - the file is called Reference.cs), you will find a class called Dog defined. Does the automatic addition of the Dog class make it easy for developers to work with web services?  Yes. Does it cause some level of confusion?  Definitely.  Which is the reason for this post.

Take a close look at the Dog class in Reference.cs.  What you will see is a class that contains all of the public properties that were exposed by the server-side Dog class.  What you will *not* see is any of the business logic that might have been implemented in the server-side Dog class.  Nor will you see any methods, public or otherwise.  In other words, these two classes are absolutely not the same class, regardless of the similarity in naming. 

What kind of problem does this cause? Say you actually deploy the server-side assembly onto the client and try to pass an object of that type (the server-side Dog class, that is) into the web service method call. The result is that an apparently inexplicable InvalidCastException gets thrown. The exception is thrown because the server-side Dog class is not the same Dog class defined in the proxy class. That should make sense now, but it throws a lot of new web service developers for a loop.
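The mismatch can be reproduced without any web service at all. The following is a sketch only: the namespaces Server and ClientProxy and the DogDemo helper are my own stand-ins, and the real generated class in Reference.cs carries extra serialization attributes that are omitted here.

```csharp
using System;

namespace Server
{
    // The class as written on the server.
    public class Dog
    {
        public string Breed;
        public string Fetch() { return "stick"; }
    }
}

namespace ClientProxy
{
    // Roughly what the generated proxy contains: the data, but no Fetch().
    public class Dog
    {
        public string Breed;
    }
}

public static class DogDemo
{
    // Returns true when treating a Server.Dog as a ClientProxy.Dog fails.
    public static bool CastFails()
    {
        object serverDog = new Server.Dog();
        try
        {
            ClientProxy.Dog proxyDog = (ClientProxy.Dog)serverDog;
            return proxyDog == null; // not reached: the cast above throws
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

The two types share a name and a field, but the runtime treats them as unrelated, so the cast fails in the same way the web service call does.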

There is a solution to the problem. Once the proxy class has been created, edit the Reference.cs file directly. Remove the Dog class from that file and change the return type for the GetDog method to match the server-side Dog class. Now the InvalidCastException will go away. Keep in mind that, should you decide to refresh the Web Reference, you will have to make these changes again. So when I do this kind of manipulation, I typically use wsdl.exe manually to create the proxy class file and then add it into the project. This eliminates the potential for wiping out the changes accidentally.

 

Setting Boundaries for Services

So what does it mean to design an autonomous service? Based on my previous post, there are two possible issues to consider. First, the service needs to have a life outside of the client making the request. Second, the service needs to be self-healing in that any dependence on the actual endpoint of services that are used must be mitigated. To put this second point into an example, if Service A invokes Service B, then Service A must be capable of discovering Service B should Service B move. Service A should not be dependent on any manually updated configuration information to use Service B. Unfortunately, neither of these two considerations really helps to determine what the boundaries of an autonomous service should be.

To get a grasp on the criteria that we use for bounding a service, consider the following hierarchy.

Figure 1 - Service Hierarchy

The process service is a high-level interface where a single service method call invokes a series of smaller steps.  These smaller steps could be either another process or a call to a business entity service.  Eventually, at the bottom of each of the paths, there will be one or more business entity services. These business entities don't contain any data, but instead interact with a data source through a data representation layer.  Each of the blocks in the hierarchy above the level of the data source *can* be a service.  Whether they are or not is one of the questions to be answered.

Follow the data

The most useful definition I have found for a service boundary is that it is a boundary across which data is passed. If there is no data moving between the caller and the callee, there is little need for a service-based implementation. Consider a service that provides nothing but functionality with no data. One that, for example, takes a single number and returns an array of the prime factors. While such a service could definitely be created, the rationale for implementing it as a service is thin. After all, the same functionality could be embedded into an assembly and deployed with an application. Worried about being able to update it regularly? Place it onto a web server and use zero-touch deployment to allow for dynamic updating. So when trying to define the services, follow the data.
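For what it's worth, the prime factor example really is just a few lines of local code. This is a sketch using simple trial division; the class and method names (MathHelpers, PrimeFactors) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class MathHelpers
{
    // Trial division. Returns the prime factors in ascending order, with repeats.
    public static long[] PrimeFactors(long n)
    {
        if (n < 2)
        {
            throw new ArgumentOutOfRangeException("n", "n must be at least 2");
        }

        var factors = new List<long>();
        // "p <= n / p" is the same test as "p * p <= n" but cannot overflow.
        for (long p = 2; p <= n / p; p++)
        {
            while (n % p == 0)
            {
                factors.Add(p);
                n /= p;
            }
        }
        if (n > 1)
        {
            factors.Add(n); // whatever remains is itself prime
        }
        return factors.ToArray();
    }
}
```

Calling MathHelpers.PrimeFactors(360) yields 2, 2, 2, 3, 3, 5. Nothing here holds or owns any data, which is why wrapping it in a service buys so little.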

Given that little nugget of wisdom, take another look at the hierarchy in Figure 1.  For someone to call a process service, some data must be provided.  In particular, it needs to be passed sufficient information for the process to 'do its thing'.  Want to invoke the “CreateOrder” process service?  Give the service enough information to be able to create the order.  This means both customer and product details.  When defining the business services involved in the process (the next level in the hierarchy), the same type of examination needs to be made.  Look at the places in the process where data is passed.  These data transfer points are the starting point for boundary definition.  

Keep it Chunky

The other criterion I use for defining service boundaries is based on the relatively nebulous concept of 'chunkiness'. The basic premise goes back to the first tenet of services. That is, calls into a service may be expensive. This is not surprising given that the movement of data across process or system boundaries is usually part of the process. As a result of the potential delay, the calling application's performance is improved by keeping the number of service calls to a minimum. This runs counter to the 'normal' coding style of setting properties and invoking methods on local objects.

Once the data flow has been identified (the object sequence diagram is actually quite useful in this regard), look at the interactions between two classes.  If there is a series of call/response patterns that is visible, that interaction is ripe for coalescing into a single service call. 

The downside of this approach is potentially providing more information than would normally be needed. Say that the normal call/response pattern goes something like the following:

Order o = new Order(customerId);
OrderLine ol;
ol = o.OrderLines.Add(productId1, quantity1);
ol.ShipByDate = DateTime.Now.AddDays(2);
ol = o.OrderLines.Add(productId2, quantity2);

In order to support the creation of order lines both with and without a custom ship-by date, the parameter list for any service would have to change. But there is a solution. One of the strengths of XML is its flexibility in this regard. The acceptable schema can allow for different document shapes. These differences can then be identified programmatically and the results changed as needed. For this reason, we usually pass XML documents as the parameter for service calls.
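As a rough sketch of how the optional ship-by date might be handled, the following reads order lines from a single XML document. The document shape (an order element containing line elements with an optional shipBy child) and the names OrderLineRequest and OrderMessage are assumptions of mine, not a schema from a real system. It uses System.Xml.Linq for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public class OrderLineRequest
{
    public string ProductId;
    public int Quantity;
    public DateTime? ShipByDate; // null when the caller did not supply one
}

public static class OrderMessage
{
    // Reads every <line> in a hypothetical <order> document. The <shipBy>
    // element is optional, so one message shape covers both kinds of line.
    public static List<OrderLineRequest> ReadLines(string xml)
    {
        var lines = new List<OrderLineRequest>();
        XDocument doc = XDocument.Parse(xml);
        foreach (XElement line in doc.Root.Elements("line"))
        {
            lines.Add(new OrderLineRequest
            {
                ProductId = (string)line.Attribute("productId"),
                Quantity = (int)line.Attribute("quantity"),
                // The nullable conversion returns null if <shipBy> is absent.
                ShipByDate = (DateTime?)line.Element("shipBy")
            });
        }
        return lines;
    }
}
```

A document such as `<order customerId='42'><line productId='P1' quantity='2'><shipBy>2004-06-01</shipBy></line><line productId='P2' quantity='1'/></order>` carries the whole order in one call, with the ship-by date present on the first line only.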

The result of this is a sense of where the boundaries of a service should be. First, look at the data passed between objects.  Identify any series of calls between two objects.  Then group the data passed through these calls into a single service using an XML document as the parameter. 

Will this logic work for every possible case? Maybe not. But more often than you think, this kind of design breakdown will result in a decent set of boundary definitions for the required services. The one drawback frequently identified by people is that this approach does not directly consider where the data is stored. While this is true, it is not that imperative. Accessing a data source can either be done through a separate service (identified by this analysis process) or through local objects. In other words, the segregation of data along business or process service boundaries is not necessarily a given. Nor, as it turns out, is it even a requirement.

Service Autonomy?

In a previous post, I discussed the impact of the first of the Four Basic Tenets of Service Orientation on the design of a service-based application. In this post, I consider the second tenet in the same context.

The second tenet says that services must be autonomous.  This is probably one of the more hotly debated tenets, in part because the definition of autonomy as it applies to services does not appear to be part of the common vernacular.

If you go back to the description of this tenet as written by Don Box here, it would appear that the definition of autonomy involves independence from the client. Specifically, Don contrasts the autonomy (that is, the independence) of services with the interdependence of objects in an OO application. In OO, the called object is inextricably linked to the calling object through the call stack. If the calling object goes awry (a euphemism for death by exception), the called object goes away too. Or at least doesn't have any place to send the result back to.

A service, on the other hand, has a life outside of the calling application. If the calling application dies while the service is doing its thing, the service will continue functioning properly. In fact, one of the guarantees of autonomy (in this sense) is the ability for a service to be called asynchronously with no expected return value. In this manner, it can be guaranteed that the service remains independent of the calling client.

Another view, one espoused by Rich Turner here, is that the autonomy of services defines how a service interacts with other services. A method of Service A is invoked. That method in turn invokes a method on Service B. Service A is autonomous if it is not hard wired to the location or implementation details of Service B. So Service A calls Service B. If Service B isn't where it is expected, then Service A goes through a number of steps to discover Service B's whereabouts. At no point does Service A require any information from Service B or anyone else to try and complete the invoked method.
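A minimal sketch of that self-healing behaviour might look like the following. Everything here is hypothetical: the ServiceLocator class, the reachability probe and the discovery lookup are placeholders supplied by the caller, standing in for whatever discovery mechanism (a registry query, for example) is actually available.

```csharp
using System;

public class ServiceLocator
{
    private string cachedEndpoint;
    private readonly Func<string, bool> isReachable; // hypothetical health probe
    private readonly Func<string> discover;          // hypothetical discovery lookup

    public ServiceLocator(string lastKnownEndpoint,
                          Func<string, bool> isReachable,
                          Func<string> discover)
    {
        this.cachedEndpoint = lastKnownEndpoint;
        this.isReachable = isReachable;
        this.discover = discover;
    }

    // Returns an endpoint for Service B, rediscovering it if the cached one is gone.
    public string Resolve()
    {
        if (cachedEndpoint != null && isReachable(cachedEndpoint))
        {
            return cachedEndpoint;
        }

        string found = discover();
        if (found == null || !isReachable(found))
        {
            throw new InvalidOperationException("Service B could not be located.");
        }

        cachedEndpoint = found; // remember the new location for next time
        return found;
    }
}
```

Service A would call Resolve() before each invocation of Service B, so a move by Service B costs one discovery lookup rather than a manual configuration change.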

I have a problem with this description. Not with the concept being described (that is, that a service, where possible, should be self-healing), but that the word 'autonomous' is used to describe it. There is a connotation associated with autonomy of being independent. Indeed, a number of dictionaries include 'independent' as a synonym for the word (see dictionary.com). As Rich aptly points out, no useful service is truly independent. So I don't think that autonomous really describes the state of a service.

Now the question becomes what word or words would better describe the concepts that are being conveyed here. My own personal choice would be 'sovereign'. While it is listed as synonymous with autonomous and independent in a number of different references, the connotation is more appropriate for both of these points. A sovereign service has an expectation of a certain level of independence. A sovereign service would be expected to survive the failure of any client. A sovereign service would be expected to locate and negotiate communication between it and any other sovereign service. Maybe I'm being picky here, but I think that considering a service to be sovereign will better prepare an architect for service-orientation. The impact that this change in viewpoint has is the topic of my next post.

Architecting with Services

As I alluded to in a previous post, I have been thinking about the architecture of services for the past couple of weeks. This is quite different (at least as I see it) from the many descriptions of how to implement web services. Because I'm relatively well known for Web services in the user group community here in Toronto, I get asked things like “Where do my service boundaries begin and end?” and “What parameters need to be passed into a service?”. While it is easy (and sometimes even fun) to be flip with my comments, my annoying streak of professionalism means that occasionally I have to consider the ramifications of my answers.

Naturally, the basis for the answer should come from some sound theory.  For service orientation, there is little more sound than the Four Basic Tenets of Service Orientation first espoused by Don Box.  In summary, they are:

1.  Service boundaries are explicit and costly to traverse

2.  Services are autonomous

3.  Services expose schema and contract, not class and type

4.  Services negotiate using policy

While there is an ongoing discussion about the completeness of these tenets, they are certainly a good starting point for this post. In particular, I've been trying to decide if all of these tenets apply equally to the design process. My conclusion is that they don't. Let's start with the first one: service boundaries are explicit and costly.

This is a double-edged sword, from a design perspective. First, I believe it is important to consider the possibility that the cost to traverse a boundary is high. This is a necessary (and frequently lacking) mindset to adopt when creating and using web services. I don't believe that it should be any different with the design of the web service.

On the other hand, if the high cost for boundary traversal is always the assumption, the design decisions may not result in the most efficient application.  In particular, consider the need for a service that provides information about a business entity, such as a customer. The kind of service I'm thinking about is one that implements basic CRUD functionality.  Now if the assumption is made that boundary traversals are costly, the exposed interface will be such that all possible information about the business entity will be passed to the service for every method.  This would reduce the chatty nature of the typical CRUD methods. 

However, having a chunky CRUD service is not necessarily the most efficient use of resources. Why pass all possible information when only a little might be sufficient? Especially when, as would be the case in most CRUD services, the service is more likely to be a near link than a far link. All of this leads to my slight variation on the Tenet:

Service Granularity is Proportional to Expected Latency

When designing a service-oriented application, it is certainly necessary to consider the cost involved in calling across the various service boundaries. However, it is also necessary to consider the expected latency. In this way, the appropriate design decisions regarding the number and content of the exposed messages can be made with an eye to both flexibility and performance.
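A back-of-the-envelope model helps show why latency should drive granularity. This is a deliberately crude sketch: the CallCostEstimate class is hypothetical, and the model assumes each call pays one round-trip latency plus a transfer time proportional to its payload. The numbers in the note below are invented for illustration, not measurements.

```csharp
public static class CallCostEstimate
{
    // Rough model: every call pays the link latency once, plus the time
    // needed to move its payload at the given throughput.
    public static double TotalMs(int calls, double latencyMs,
                                 double bytesPerCall, double bytesPerMs)
    {
        return calls * (latencyMs + bytesPerCall / bytesPerMs);
    }
}
```

Assume a throughput of 1,000 bytes per millisecond and compare three fine-grained calls of 200 bytes each against one chunky call of 20,000 bytes. On a far link with 200 ms latency, the three small calls cost about 600 ms against 220 ms for the chunky call, so chunky wins. On a near link with 1 ms latency, the three small calls cost under 4 ms against 21 ms for the chunky call, so the finer-grained interface is the better choice.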

I've got some thoughts about the other tenets as they apply to design, but they will have to wait for another blog entry.  And if anyone has thoughts about my logic, I'd be happy to hear about it.

Jumping the Hurdle: Moving from OO to SOA

For all those who are interested, I have just posted the latest article in my series on service-oriented architecture, called Jumping the Hurdle: Moving from OO to SOA. Read, learn, be amused, be bored.  Feel free to let me know where on that spectrum you fall.

Benefits of SOA Article

Continuing on the trend of trying to produce a constant stream of articles, I have just published an article on some of the benefits of using SOA in a production development environment.  The idea is to help programmers provide justification to the powers that be regarding the benefits of SOA so that, just maybe, it might become part of a small pilot program.

Let me know if it serves its purpose for anyone. ;)

The First Article of the New Year

Just to let everyone know, I have just posted my first article of the new year, A Spoonful of Web Service's Alphabet Soup.  It is the first in a series that intends to delve into the details of creating a commercial-grade services-oriented architecture.