I have an issue with the DataTable object, although I think it's a cultural thing. Some people I talk to don't seem too bothered by it. I find it very annoying, mostly because it would have been so easily avoided.
So what's the problem? When a DataRow is flagged for deletion, it's still part of the row collection in the DataTable. You may now be wondering what the big deal is.
You see, I spent 10 years in the PowerBuilder world. In a PowerBuilder DataWindow (similar to the objects in ADO.NET), when a row is flagged for deletion it's stored in a separate collection. Therefore, when you are processing the rows, the deleted rows are not in the collection of current rows. If you want to access the deleted rows, you can do that too. And when you get a count of rows, it's the count of rows not including the deleted ones.
In a DataTable, when you delete a row it remains in the Rows collection and you have to deal with it. This also means that DataTable.Rows.Count is a count of all the rows, including deleted rows.
Why do I say it's cultural? My esteemed colleague Bruce Johnson and I had a brief discussion and he almost convinced me this is how it should be. Bruce said, "It's only flagged as deleted; why shouldn't it be in the Rows collection?" I have to say this is a fair statement. However, it would make life easier for the developer if there were a DeletedRows collection, along with a corresponding DeletedRows.Count. This feels more natural to me.
When do you ever want to iterate through all the rows in a DataTable, both deleted and not, and perform the same action on them? I could make something up, but it would be bogus. I have never wanted to do this. Even if you can come up with a good reason, it's not going to be the common scenario. When you write a framework you should take into account the 80/20 rule: make it easier for 80 percent of the cases and let the 20 percent do extra work.
This means that when you iterate through a collection of rows in a table you have to be sensitive to the fact that some of them may have been deleted.
There are ways to make life easier, and when .NET 2.0 (Whidbey) comes along there will be even more.
What can you do about it?
You could check the row state inside your iterator like this:
For Each row As DataRow In dataTable.Rows
    If row.RowState <> DataRowState.Deleted Then
        ' process the current row
    End If
Next
Or you could get a subset of the collection like this.
For Each row As DataRow In dataTable.Select("", "", DataViewRowState.CurrentRows)
Keep in mind that with this solution, if you are using a typed DataSet you will have to cast the result of the Select to the typed row array.
For Each row As OrderRow In CType(OrderTable.Select("", "", DataViewRowState.CurrentRows), OrderRow())
I recommend wrapping up the Select in a DataSetHelper method, so it looks like this:
For Each row As OrderRow In CType(DataSetHelper.GetCurrentRows(OrderTable), OrderRow())
What about getting the count of current rows in the DataTable? You could wrap this little code segment up in a method in your DataSetHelper as well. Just pass in a DataTable.
Public Function GetRowCount(ByVal dt As DataTable) As Integer
    Dim dataView As New DataView
    dataView.Table = dt
    dataView.RowStateFilter = DataViewRowState.CurrentRows
    Return dataView.Count
End Function
I mentioned above that .NET 2.0 (Whidbey) will help. How is that, you ask?
There are a couple of solutions.
Using partial types, you could extend the typed DataSet class to include a property that returns the collection of current rows. This way it's more natural to the developer: OrderTable.CurrentOrderRows.
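As a sketch, in C#, it might look like this. The OrderDataTable, OrderRow, and CurrentOrderRows names are hypothetical, standing in for whatever the typed DataSet generator produces in your project:

```csharp
using System.Data;

// A partial class (new in .NET 2.0) lets us bolt members onto the
// generated typed table without touching the generated file.
public partial class OrderDataTable
{
    // Only the rows that are not flagged as deleted.
    public OrderRow[] CurrentOrderRows
    {
        get
        {
            // Select with CurrentRows filters out deleted rows; the typed
            // table's rows really are OrderRow instances, so the array
            // cast succeeds at run time.
            return (OrderRow[])this.Select("", "", DataViewRowState.CurrentRows);
        }
    }
}
```

Callers then read naturally: foreach (OrderRow row in orderDataSet.Order.CurrentOrderRows).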
Using iterators, you could write your own iterator that only walks through the current rows.
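In C# 2.0 that could look like the following hypothetical DataSetHelper addition; yield return does the bookkeeping for you:

```csharp
using System.Collections.Generic;
using System.Data;

public static class DataSetHelper
{
    // Lazily yields only the rows that have not been flagged as deleted.
    public static IEnumerable<DataRow> CurrentRows(DataTable table)
    {
        foreach (DataRow row in table.Rows)
        {
            if (row.RowState != DataRowState.Deleted)
            {
                yield return row;
            }
        }
    }
}
```

The calling loop then becomes foreach (DataRow row in DataSetHelper.CurrentRows(orderTable)), with no cast or filter clutter.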
Continuing on the trend of trying to produce a constant stream of articles, I have just published an article on some of the benefits of using SOA in a production development environment. The idea is to help programmers provide justification to the powers that be regarding the benefits of SOA so that, just maybe, it might become part of a small pilot program.
Let me know if it serves its purpose for anyone. ;)
Just to let everyone know, I have just posted my first article of the new year, A Spoonful of Web Service's Alphabet Soup. It is the first in a series that intends to delve into the details of creating a commercial-grade service-oriented architecture.
This post is as much about making a mental note as it is about anything new. As part of the project that is creating an RSS feed from a VSS database, I wanted to perform some caching of the feed. If you haven't worked with the ActiveX component that provides the API to VSS, trust me when I say that it's slow. And because the retrieval of the history is recursive, the slowness can be magnified by an order of magnitude. So given the lack of volatility of the data, caching seemed in order.
Now there is a fair bit of information about how to add items to the ASP.NET Cache object. The Insert method is used, followed by a number of parameters. And because there are different combinations of criteria that can be used to identify when the cached item expires, the Insert method has a number of different overloads. The one that I'm interested in is
Cache.Insert(string, object, CacheDependency, DateTime, TimeSpan)
This particular overload is used to specify a cache dependency and either an absolute or a sliding expiration time. Ignore the cache dependency, as it is not germane to the discussion. The issue arises when trying to specify either an absolute expiration time or a sliding time span. Since these choices are mutually exclusive, my problem was how to correctly provide a null value for the unimportant parameter. In my case, it was how to define a 'null' TimeSpan, as I wanted to use an absolute expiration. It took a little bit of searching before I found the answer (which, by the way, is the reason for this post...future documentation).
To define an absolute expiration, use TimeSpan.Zero as the last value. For example:
Cache.Insert("key", value, null, DateTime.Now.AddMinutes(15), TimeSpan.Zero)
To define a sliding expiration, use DateTime.Now as the fourth parameter. For example:
Cache.Insert("key", value, null, DateTime.Now, new TimeSpan(0, 15, 0))
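For what it's worth, the framework also exposes named fields for these magic values: Cache.NoAbsoluteExpiration and Cache.NoSlidingExpiration. Wrapping the overload in a pair of helpers (the names are my own, just a sketch) keeps callers from having to remember which slot gets the dummy value:

```csharp
using System;
using System.Web.Caching;

public class CacheHelper
{
    // Absolute expiration: the item is evicted at a fixed point in time.
    public static void InsertAbsolute(Cache cache, string key, object value, DateTime expiresAt)
    {
        cache.Insert(key, value, null, expiresAt, Cache.NoSlidingExpiration);
    }

    // Sliding expiration: the eviction clock resets on every access.
    public static void InsertSliding(Cache cache, string key, object value, TimeSpan window)
    {
        cache.Insert(key, value, null, Cache.NoAbsoluteExpiration, window);
    }
}
```

With that in place, the calling code states its intent instead of its sentinel values.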
My instinct (although I have nothing to back it up) is that under the covers the Insert method uses both parameter values to determine when the expiration should take place. But regardless, I'm left wondering why this particular overload exists. Why is there not one overload for absolute expiration and another for sliding? The signatures would be different, so conflicting with another overload can't be the reason. I'm left scratching my head. Hopefully someone out there in the blogosphere who's reading this post will have an answer.
Even if you don't have your ear to the ground, the far-off rumble is a reminder that services are coming. First, there was the web services wave. Recently, Microsoft created another groundswell with the introduction of Indigo at the last PDC. Although the future may still be a year or more off, there is no denying it: the programming world is about to enter the era of services.
Currently, the big architectural push is towards a service-oriented architecture (SOA). While the term sounds impressive, it is important to understand both what an SOA is and how it can benefit a development team. There are a number of different links available that describe the components and concepts behind SOA, including at ondotnet.com, xml.com, and microsoft.com. Given this wealth of information, it would be redundant to cover it here (but perhaps in another article ;). Instead, I'd like to focus on some of the benefits that accrue to those who choose to follow the SOA path.
Even with that slough-off of responsibility for the concepts of SOA, it is still necessary that we have a common reference as the jumping-off point for a discussion of benefits. So briefly, a service-oriented architecture is a style of application design that includes a layer of services. A service, from this limited perspective, is just a block of functionality that can be requested by other parts of the system when it is needed. These services have a published set of methods, known as an interface. Usually, each interface will be associated with a particular business or technical domain. So there could be one interface used for authentication and another used to create and modify a sales order. The details of the provided functionality are not as important for this article as the relative location of the code that implements the functions.
A word of warning to developers who are new to the services architecture (or to an object-oriented environment as well): SOA is as much about a mindset as it is about coding. In order to reap any of the benefits of SOA, the application must be designed with services in mind. It is difficult, if not impossible, to convert a procedural application to services and still achieve anything remotely close to what can be gained from baking services in from the beginning. That having been said, there are usually elements of procedural applications which can be 'servicized'. This is the equivalent of refactoring the application, looking for common functionality.
And so, on to the benefits. In each case, we are talking about development teams that embrace SOA to the point that most of the development effort is focused on the creation and utilization of services.
Loosely Coupled Applications
Strictly speaking, coupling is the level of impact that two modules have on one another. From a developer's perspective, it is the odds that a change in one module will require a change in another. We have all seen tightly coupled systems. These are applications where developers are afraid to touch more procedures than necessary because they might cause the rest of the application to break. The goal, whether we know it or not, is to create a loosely coupled environment. And services go a long way towards making such an environment a reality.
Consider for a moment what I consider to be the epitome of loosely coupled systems: a home entertainment system (Figure 1).
Figure 1 - A Loosely Coupled Home Entertainment System
Here are a whole bunch of different components (the exact number depends on how into electronic gadgets you are), frequently created by different manufacturers. While the user interface can get complicated, between the devices the connections are quite simple: just some wires that get plugged into standard sockets. Need to replace one of the elements? No problem. Remove the wires connecting Component A to the system. Take Component A away. Insert Component B. Put the wires into the appropriate sockets. Turn things on and go. Talk about loosely coupled.
In software, a loosely coupled environment allows changes to be made in how services are implemented without impacting other parts of the application. The only interaction between the application and the services is through the published interface. Which means that, so long as the interface doesn't change, how the functions are implemented is irrelevant to the calling application. Strict adherence to such an approach, an approach which is strongly enforced by SOA, can significantly reduce development time by eliminating the side-effect bugs caused by enhancements or bug fixes.
Another benefit accrued through the use of interfaces and the loosely coupled structure imposed by SOA is location transparency. By the time we're done, it will seem like many of the benefits achieved with SOA will come from this same place (interfaces, that is). In this instance, location transparency means that the consumer of the service doesn't care where the implementation of the service resides. Could be on the same server. Could be on a different server in the same subnet. Could be someplace across the Internet. From the caller's perspective, the location of the service has no impact on how a request for the service is made.
It might not be immediately obvious why location transparency is a good thing. And, in the most straightforward of environments, it doesn't really make a difference. However, as the infrastructure supporting a set of web services becomes more complicated, this transparency increases in value. Say the implementing server needs to be moved. Since the client doesn't care where the service is, the move can be made with no change required on the client. In fact, the client can even dynamically locate the implementation of the service that offers the best response time at the moment. Again, not something that is required for every situation, but when it's needed, location transparency is certainly a nice tool to have in your arsenal.
Functional Reuse
Functional reuse has been one of the holy grails for developers since the first lines of code were written. Traditionally, however, it has been hard to achieve for a number of reasons. It is difficult for developers to find a list of the already implemented functions. Once found, there is no document that describes the expected list of parameters. If the function has been developed on a different platform or even in a different language, the need to translate (known technically as marshaling) the inbound and outbound values is daunting. All of these factors combine to make reuse more of a pipe dream than a reality.
The full-fledged implementation of a SOA can change this. The list of services can be discovered dynamically (using UDDI). The list of exposed methods, along with the required parameters and their types, are available through a WSDL document. The fact that even complex data structures can be recombined into a set of basic data types held together in an XML document makes the platform and language barrier obsolete. When designed and implemented properly, SOA provides the infrastructure that makes reuse possible. Not only possible, but because it is easy to consume services from either .NET or even the more traditional languages, developers are much more likely to use it. And ultimately, the lack of developer acceptance has always been the biggest barrier of all.
Focused Developer Roles
Almost by definition, a service-oriented architecture forces applications to have multiple layers. Each layer serves a particular purpose within the overall scope of the architecture. The basic layout of such an architecture can be seen in Figure 2.
Figure 2 - Some Layers in a SOA
So when designing an application, the various components get placed into one of the levels at design time. Once coding starts, someone actually has to create each component. The skills necessary to develop the components will differ, and the type of skill required will depend on the layer in which the component is placed. The data access layer requires knowledge of data formats, SQL, and ADO.NET. The authentication layer could utilize LDAP, WS-Security, SHA, or other similar technologies. I'm sure you get the idea well enough to put it in play in your own environment.
The benefit of this separation is found in how the development staff can be assigned to the various tasks. Developers will be able to specialize in the technologies for a particular layer. As time goes by, they will become experts in the complex set of skills used at each level. Ultimately, by focusing the developers on a particular layer, it even becomes possible to include less skilled programmers in a paired or mentoring environment. This doesn't eliminate the need for an overarching technical lead, but it certainly decreases the reliance of a company on experienced generalists.
Increased Testability
It has been a few paragraphs since they were mentioned, but the fact that services are only accessed through their interfaces is the reason for this benefit. The fact that services have published interfaces greatly increases their testability.
One of the nicest tools in recent years for assisting with the testing process is NUnit. A .NET version of JUnit, NUnit allows for the creation of a test suite. The test suite consists of a number of procedures, each of which is designed to test one element of an application or service. NUnit allows the suite to be run whenever required and keeps track of which procedures succeed and which fail. My experience with NUnit is that it makes the process of developing a test suite incredibly simple. Simple enough that, even with my normal reticence about testing, I no longer had an excuse not to create and use unit tests.
The fact that services publish an interface makes them perfect for NUnit. The test suite for a particular service knows exactly what methods are available to call. It is usually quite easy to determine what the return values should be for a particular set of parameters. This all leads to an ideal environment for the 'black box' testing that NUnit was developed for. And whenever a piece of software can be easily tested in such an environment, it inevitably results in a more stable and less bug prone application.
There is another side benefit that arises both from the increased testability (and tools such as NUnit) and from the division of components into layers. Since the only interaction that exists between services is through the published interfaces, the various services can be developed in parallel. At a high level, so long as a service passes the NUnit test for its interface, there is no reason why it would break when the various services are put together. As such, once the interfaces have been designed, development on each of the services can proceed independently. Naturally, there will need to be some time built into the end of the project plan to allow for integration testing (to ensure that the services can communicate properly and that there was no miscommunication during development). But especially for aggressive project timelines, parallel development can be quite a boon.
Scalability
This particular benefit refers to the ability of a service to improve its response time without impacting any of the calling applications. It is an offshoot of the location transparency provided by SOA. Since the calling application doesn't need to know where the service is implemented, the service can easily be moved to a 'beefier' box as demand requires. And if the service is offered through one of the standard communication protocols (such as HTTP), it is possible to spread the implementation of the service across a number of servers. This is what happens when the service is placed on a web farm, for example. The reasons and techniques for choosing how to scale a service are beyond the scope of this article (although they will be covered in a future one). It is sufficient, at this point, to be aware that they are quite feasible.
Location transparency also provides for greater levels of availability. Read the section on scalability and web farms for the fundamental reason. But again, the requirement to utilize only published interfaces brings about another significant benefit for SOA.
Building Multi-Service Applications
I've always described this benefit as the “Star Trek Effect”. The idea can be seen in many Star Trek episodes, both new and old. In order to solve an impossible problem, the existing weapon systems, scanning systems, and computer components need to be put together in a never-before-seen configuration.
“Cap'n. If we changed the frequency of the flux capacitor so that it resonated with the dilithium crystals of that alien vessel, our phasers might just get through their shields. It's never been done before, but it might work. But it'll take me 10 seconds to set it up.”
Try reading that with a Scottish accent and you'll get my drift. Here are people working with services to the point that no programming is required to put them together into different configurations. And that is the goal that most service-oriented architects have in mind, either consciously or subconsciously, when they design systems. And, although we are not yet at the Star Trek level of integration and standardization, SOA provides a good start.
And that's all SOA is, after all. A good start. But it is a start that is necessary to gain the benefits I've just run through. Sure, it takes a little longer to design a service-based application, not to mention creating all of the infrastructure that needs to surround it, assuming that it will be production quality. However, in the long run the benefits start to play out.
Now I'm not naive enough to believe that we haven't heard this song before. It is basically the same song that has been playing since object-oriented development was first conceived. We heard it with COM. Then it was DCOM. Now it's web services and SOA. Why should we believe that this time will be different from the others?
I can't offer any reason in particular in answer to that very astute question. Honestly, SOA feels different than COM and OO. Perhaps that is because we've already seen the problems caused by COM, CORBA, etc and they are addressed by web services. Perhaps it is because there is now a common language that can be used across platforms (XML). Perhaps it is the level of collaboration between the various vendors regarding service-based standards. But to me at least, all of these items make me much more optimistic that this time we are heading down the right path.
I would like to wish everyone a happy holiday season.
I've spent the last couple of days working on a component that builds an RSS feed using Visual Source Safe data. The purpose for the feed is two-fold. First, the feed will be used by a nightly build to determine if anything has been updated since the previous build. Second, the feed can be subscribed to by others on the development team, allowing them to keep up to date on the progress of the project. Over the next week or so, I'll be blogging more about the problems/issues/resolutions that I've run into.
The first element of the technique that I'm using has to do with identifying the VSS project that is to be used as the source for the feed. Because it fits nicely with the VSS hierarchy (and because I thought it would be cool), we decided to include the VSS path in the URL. For example, a URL of http://localhost/VSSRSS/top/second/third/rss.aspx would retrieve the history information for the project $/top/second/third. To accomplish this, the RewritePath method associated with the HttpContext object is used.
So much for a description. My reason for the post is not to describe the technique so much as to identify an error that I received and provide a solution that is not immediately obvious from the message. While going through the development process, my first thought was to include the host information in the rewritten path. Including the http:. Why? Beats me. It was just the first thing that I did. But when I did, a page with the following error appeared:
Invalid file name for monitoring: 'c:\inetpub\wwwroot\VSSRSS\http:'. File names for monitoring must have absolute paths, and no wildcards
This interesting error is actually caused by the inclusion of the protocol (the http) in the rewritten path. The actual value of the path parameter should be relative to the current directory. Which also implies that the rewritten path cannot be outside of the current virtual directory.
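To make the constraint concrete, here is a hypothetical sketch of the rewrite. The module, page, and query string names are mine, not the actual project's code; the point is that the rewritten path contains no protocol or host:

```csharp
using System;
using System.Web;

// Hypothetical HttpModule: maps /VSSRSS/<vss path>/rss.aspx onto a single
// rss.aspx page, carrying the VSS project along in the query string.
public class VssRewriteModule : IHttpModule
{
    public void Init(HttpApplication app)
    {
        app.BeginRequest += new EventHandler(OnBeginRequest);
    }

    private void OnBeginRequest(object sender, EventArgs e)
    {
        HttpApplication app = (HttpApplication)sender;
        string path = app.Request.Path;   // e.g. /VSSRSS/top/second/third/rss.aspx
        string root = "/VSSRSS";
        string page = "/rss.aspx";

        if (path.StartsWith(root) && path.EndsWith(page) && path.Length > root.Length + page.Length)
        {
            // Recover $/top/second/third from the URL, then rewrite to a
            // path inside the virtual directory: no "http://", no host name.
            string project = "$" + path.Substring(root.Length, path.Length - root.Length - page.Length);
            app.Context.RewritePath("~" + page + "?project=" + HttpUtility.UrlEncode(project));
        }
    }

    public void Dispose() { }
}
```

Had the rewritten path started with "http://", it would have triggered the monitoring error above.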
Hopefully this little tidbit will help someone else down the road.
VSLive! Spring 2004 Toronto
No details announced yet - but this is promising.
I don't normally like to have my blog entries simply point to another URL, but in this case, I have to make an exception. Make sure that, as well as admiring the code, you read the rationale behind the submission. And who says techies aren't creative. And that we don't have time to spare. ;)
Yesterday I tried to rename a DataSet. Too lazy to recreate it. Have you ever tried to rename a DataSet?
It seems easy enough. Press F2 in the Solution Explorer and type in a new name. Open the XSD and do a find-and-replace of the old name with the new name. There, done, right?
Not exactly. When you try to generate the class, it may get generated with the old name. Your solution ends up looking like this:
So what is the solution? The problem is in the project file. There's an attribute on the File node called LastGenOutput that is holding the old DataSet name. See below.
Delete this attribute and you will be fine.
RelPath = "MyNewDataSet.xsd"
BuildAction = "Content"
Generator = "MSDataSetGenerator"
LastGenOutput = "MyOldDataSet.cs"