Setting Boundaries for Services

So what does it mean to design an autonomous service? Based on my previous post, there are two issues to consider. First, the service needs to have a life outside of the client making the request. Second, the service needs to be self-healing, in that any dependence on the actual endpoints of the services it uses must be mitigated. To put this second point into an example, if Service A invokes Service B, then Service A must be capable of discovering Service B should Service B move. Service A should not depend on any manually updated configuration information to use Service B. Unfortunately, neither of these two considerations really helps to determine what the boundaries of an autonomous service should be.
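As a rough illustration of the second point, here is a minimal sketch of Service A resolving Service B's endpoint at call time instead of reading it from a hand-edited configuration file. The names IServiceRegistry, InMemoryRegistry, Publish, Resolve and LocateServiceB are all hypothetical; a real system would put some discovery mechanism behind the interface rather than an in-memory dictionary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lookup abstraction (an assumption for illustration only).
public interface IServiceRegistry
{
    Uri Resolve(string serviceName);
}

// Stand-in registry that keeps endpoints in memory.
public sealed class InMemoryRegistry : IServiceRegistry
{
    private readonly Dictionary<string, Uri> endpoints = new Dictionary<string, Uri>();

    // Publishing the same name again overwrites the previous location.
    public void Publish(string serviceName, Uri endpoint)
    {
        endpoints[serviceName] = endpoint;
    }

    public Uri Resolve(string serviceName)
    {
        Uri endpoint;
        if (!endpoints.TryGetValue(serviceName, out endpoint))
            throw new InvalidOperationException("No endpoint published for " + serviceName);
        return endpoint;
    }
}

public sealed class ServiceA
{
    private readonly IServiceRegistry registry;

    public ServiceA(IServiceRegistry registry)
    {
        this.registry = registry;
    }

    // Looks up Service B on every call, so a move requires no change to Service A.
    public Uri LocateServiceB()
    {
        return registry.Resolve("ServiceB");
    }
}
```

If Service B moves, it publishes its new endpoint and the next call from Service A picks it up.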

To get a grasp on the criteria that we use for bounding a service, consider the following hierarchy.

Figure 1 - Service Hierarchy

The process service is a high-level interface where a single service method call invokes a series of smaller steps. Each of these smaller steps could be either a call to another process service or a call to a business entity service. Eventually, at the bottom of each of the paths, there will be one or more business entity services. These business entities don't contain any data, but instead interact with a data source through a data representation layer. Each of the blocks in the hierarchy above the level of the data source *can* be a service. Whether they are or not is one of the questions to be answered.

Follow the data

The most useful definition I have found for a service boundary is a line across which data is passed. If there is no data moving between the caller and the callee, there is little need for a service-based implementation. Consider a service that provides nothing but functionality with no data. One that, for example, takes a single number and returns an array of its prime factors. While such a service could definitely be created, the rationale for implementing it as a service is thin. After all, the same functionality could be embedded into an assembly and deployed with an application. Worried about being able to update it regularly? Place it onto a web server and use zero-touch deployment to allow for dynamic updating. So when trying to define the services, follow the data.
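To make the prime factor example concrete, here is a minimal sketch of that functionality as an ordinary static method that could live in an assembly shipped with the application. The class and method names (PrimeMath, PrimeFactors) are my own for illustration, and trial division is just the simplest approach that works.

```csharp
using System;
using System.Collections.Generic;

public static class PrimeMath
{
    // Returns the prime factors of n in ascending order, repeated by multiplicity.
    public static long[] PrimeFactors(long n)
    {
        if (n < 2)
            throw new ArgumentOutOfRangeException(nameof(n), "n must be at least 2.");

        var factors = new List<long>();

        // Trial division; "p <= n / p" avoids the overflow that "p * p <= n" could hit.
        for (long p = 2; p <= n / p; p++)
        {
            while (n % p == 0)
            {
                factors.Add(p);
                n /= p;
            }
        }

        // Whatever remains above 1 is itself prime.
        if (n > 1)
            factors.Add(n);

        return factors.ToArray();
    }
}
```

Calling PrimeMath.PrimeFactors(360) returns 2, 2, 2, 3, 3, 5. The method holds no data and has no state to share, so wrapping it in a service call adds cost without adding much value.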

Given that little nugget of wisdom, take another look at the hierarchy in Figure 1. For someone to call a process service, some data must be provided. In particular, the caller must pass sufficient information for the process to 'do its thing'. Want to invoke the “CreateOrder” process service? Give the service enough information to be able to create the order. This means both customer and product details. When defining the business services involved in the process (the next level in the hierarchy), the same type of examination needs to be made. Look at the places in the process where data is passed. These data transfer points are the starting point for boundary definition.

Keep it Chunky

The other criterion I use for defining service boundaries is based on the relatively nebulous concept of 'chunkiness'. The basic premise goes back to the first tenet of services. That is, calls into a service may be expensive. This is not surprising given that a service call usually involves moving data across process or system boundaries. Because of the potential delay, the calling application's performance is improved by keeping the number of service calls to a minimum. This runs counter to the 'normal' coding style of setting properties and invoking methods on local objects.

Once the data flow has been identified (the object sequence diagram is actually quite useful in this regard), look at the interactions between two classes. If a series of call/response exchanges is visible between them, that interaction is ripe for coalescing into a single service call.

The downside of this approach is potentially providing more information than would normally be needed. Say that the normal call/response pattern goes something like the following:

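// Chatty pattern: the constructor, each Add and the property set
// would each be a separate round trip across a service boundary.
Order o = new Order(customerId);
OrderLine ol;
ol = o.OrderLines.Add(productId1, quantity1);
ol.ShipByDate = DateTime.Now.AddDays(2);   // custom ship-by date on the first line only
ol = o.OrderLines.Add(productId2, quantity2);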

In order to support the creation of order lines both with and without a custom ship-by date, a single service call with a fixed parameter list would not be enough; the parameter list would have to change from case to case. But there is a solution. One of the strengths of XML is its flexibility in this regard. The acceptable schema can treat elements such as the ship-by date as optional, so the same document format covers both cases. The service can then identify programmatically which elements are present and adjust its results as needed. For this reason, we usually pass XML documents as the parameter for service calls.
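Here is a minimal sketch of what the coalesced call might look like, assuming an order document in which the ship-by date is an optional child element of each order line. The element and attribute names, along with the OrderService, OrderLineRequest and SampleRequest names, are made up for illustration; the parsing uses LINQ to XML from System.Xml.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public sealed class OrderLineRequest
{
    public string ProductId;
    public int Quantity;
    public DateTime? ShipByDate;   // null when the caller did not supply one
}

public static class OrderService
{
    // Hypothetical service entry point: one call carries the whole order.
    public static List<OrderLineRequest> CreateOrder(XDocument request)
    {
        XElement order = request.Root;
        string customerId = (string)order.Attribute("customerId");
        if (string.IsNullOrEmpty(customerId))
            throw new ArgumentException("Order requires a customerId attribute.");

        var lines = new List<OrderLineRequest>();
        foreach (XElement line in order.Elements("OrderLine"))
        {
            lines.Add(new OrderLineRequest
            {
                ProductId = (string)line.Attribute("productId"),
                Quantity = (int)line.Attribute("quantity"),
                // The nullable conversion yields null when the element is absent.
                ShipByDate = (DateTime?)line.Element("ShipByDate")
            });
        }
        return lines;
    }

    // Sample document: the first line has a custom ship-by date, the second does not.
    public static XDocument SampleRequest()
    {
        return XDocument.Parse(
            @"<Order customerId=""C42"">
                <OrderLine productId=""P1"" quantity=""2"">
                  <ShipByDate>2004-06-03</ShipByDate>
                </OrderLine>
                <OrderLine productId=""P2"" quantity=""1"" />
              </Order>");
    }
}
```

The caller builds one document and makes one call, and the service decides per line whether a ship-by date was supplied. Adding another optional element later would not change the method signature.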

The result of all this is a sense of where the boundaries of a service should be. First, look at the data passed between objects. Identify any series of calls between two objects. Then group the data passed through these calls into a single service call that takes an XML document as its parameter.

Will this logic work for every possible case? Maybe not. But more often than you think, this kind of design breakdown will result in a decent set of boundary definitions for the required services. The one drawback people frequently identify is that this approach does not directly consider where the data is stored. While this is true, it is not critical. Accessing a data source can be done either through a separate service (identified by this analysis process) or through local objects. In other words, the segregation of data along business or process service boundaries is not necessarily a given. Nor, as it turns out, is it even a requirement.