The Memory Reclamation Phase

The project that I have been working on for the past few months has now entered its performance tuning phase.  This means that the code is being reviewed in great detail, looking for areas where speed gains can be realized.  Because there is a high concentration of former C++ developers here (the project is done in C#), I have been getting a number of questions regarding the allocation and deallocation of memory: when does it happen, and how can memory be reclaimed faster?

Whether you are coming from a VB or C++ background, you should be aware of how allocation and garbage collection work in .NET, if for no other reason than that the optimizations you used in the past may very well be completely wrong here.  Two examples: setting variables to null (based on the Set var = Nothing best practice in VB) and implementing finalizers (a common memory deallocation technique in C++).  For an excellent description of the reasons behind avoiding these (and other) practices, check out http://weblogs.asp.net/ricom/archive/2003/12/02/40780.aspx.  And if you feel compelled to use the GC.Collect method, read http://blogs.msdn.com/ricom/archive/2004/11/29/271829.aspx first.  Very useful information indeed.
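
To make the point concrete, here is a minimal sketch (the class and file names are hypothetical): instead of writing a finalizer out of C++ habit, implement IDisposable and use a using block so the resource is released promptly, and leave reclamation of the managed object to the garbage collector. No nulling of variables or GC.Collect call is required.

using System;
using System.IO;

// Hypothetical class that wraps an unmanaged-ish resource (a file handle).
// Rather than a finalizer, it exposes Dispose for deterministic cleanup.
public sealed class ReportWriter : IDisposable
{
    private StreamWriter writer = new StreamWriter("report.txt");

    public void WriteLine(string text)
    {
        writer.WriteLine(text);
    }

    public void Dispose()
    {
        // Release the underlying file handle as soon as the caller is done,
        // instead of waiting for a finalizer to run at some later point.
        writer.Close();
    }
}

public class Program
{
    public static void Main()
    {
        using (ReportWriter report = new ReportWriter())
        {
            report.WriteLine("done");
        }
        // No need to set the variable to null afterwards; once it goes out
        // of scope the GC can reclaim it on its own schedule.
    }
}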

The Case for Unit Testing

A posting on Roy Osherove's blog taking issue with a post by Joel Spolsky on the failure of methodologies like MSF got me thinking.  In particular, I (like Roy) took issue with the following section (which was actually written by Tamir Nitzan, but endorsed by Joel).

Lastly there's MSF. The author's complaint about methodologies is that they essentially transform people into compliance monkeys. "our system isn't working" -- "but we signed all the phase exits!". Intuitively, there is SOME truth in that. Any methodology that aims to promote consistency essentially has to cater to a lowest common denominator. The concept of a "repeatable process" implies that while all people are not the same, they can all produce the same way, and should all be monitored similarly. For instance, in software development, we like to have people unit-test their code. However, a good, experienced developer is about 100 times less likely to write bugs that will be uncovered during unit tests than a beginner. It is therefore practically useless for the former to write these... but most methodologies would enforce that he has to, or else you don't pass some phase. At that point, he's spending say 30% of his time on something essentially useless, which demotivates him. Since he isn't motivated to develop aggressively, he'll start giving large estimates, then not doing much, and perform his 9-5 duties to the letter. Project in crisis? Well, I did my unit tests. The rough translation of his sentence is: "methodologies encourage rock stars to become compliance monkeys, and I need everyone on my team to be a rock star".

The problem I have is that the rock stars being discussed are not at all like that. Having had the opportunity to work with a number of very, very good developers, I have found that they embrace unit tests with enthusiasm.  And their rationale has little to do with the possibility that they might create a bug that needs to be exposed.  While that might be part of the equation, it is not the real reason for creating a unit test.

The good developers that I've encountered are lovers of good code.  They believe that well-crafted code has a beauty all of its own.  They strive to write elegant, performant classes because, well, that's what craftsmen do.  But when put under the time constraints of business, it is not always possible to create the 'best' solution every single time.  An imminent deadline might require that 'good enough' classes be checked into production.  Such is the life of a paid developer.

But you and I both know that these good developers are many times more productive than their 'average' counterparts.  As a result, they frequently have time within a project schedule to refactor previously completed classes.  In fact, they enjoy going back to those 'good enough' classes and improving on them.  This attitude is one of the things I've found separates the good developers from the pack.  They are actually embarrassed by some of their 'good enough' code and feel the need to make it right.

This is where the unit tests come in.  If the completed classes are supported by a solid set of unit tests, then this refactoring process can be done with little risk to the project.  The developers know that when a modified class passes the unit tests, it is much less likely to have introduced new bugs. So, rather than considering them a waste of time, the good developers I know relish the idea of creating unit tests. Perhaps this is one of the characteristics that the rest of us would do well to emulate.
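
As a small illustration (the class under test, its method, and the expected values are all hypothetical, and NUnit is assumed as the test framework), a test written against the original 'good enough' class keeps acting as a safety net when that class is later refactored:

using NUnit.Framework;

// A hypothetical class under test, kept deliberately trivial.
public class InvoiceCalculator
{
    public decimal CalculateTotal(decimal subtotal)
    {
        return subtotal * 1.07m;   // apply a 7% tax
    }
}

[TestFixture]
public class InvoiceCalculatorTests
{
    // This test pins down the behaviour of the original 'good enough'
    // implementation. Any later refactoring of InvoiceCalculator must
    // still make it pass before the change is checked in.
    [Test]
    public void TotalIncludesSevenPercentTax()
    {
        InvoiceCalculator calculator = new InvoiceCalculator();
        Assert.AreEqual(107m, calculator.CalculateTotal(100m));
    }
}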

Pack Your Bags

A couple of conference announcements over the past few days help put my travel plans for next year on more solid footing.

VSLive is coming back to Toronto from April 13-16.  This time around, the conference will be held much closer to the heart of Toronto.  The Westin Harbour Castle is right on the lake and an easy walk from all of the downtown attractions.  That should make the ObjectSharp party bus less logistically challenging. Check out www.vslive.com/2005/to for more details.

I saw via Julia Lerman's blog that Microsoft has scheduled PDC 2005 for Los Angeles in September. I would rather it were held at a time of year when travelling someplace warmer would be more beneficial (like, say, the winter), but I'll take what I can get.

Improving the Performance of Asynchronous Web Service Calls

Notice that the name of this post doesn't say that web service performance can be improved through asynchronous calls.  That is on purpose.  This particular post deals with a limitation that impacts applications that utilize async web service methods.  In fact, it has the potential to impact any application that interacts with the Internet.  This is one time I wish my blog were more widely read, because I can pretty much guarantee that there are thousands of developers who are unaware that the nice async design they've implemented isn't having the performance-boosting effect that they expected.  And thanks to Marc Durand for pointing this out to me.

The limitation I'm talking about is one that is buried in the HTTP/1.1 specification (RFC 2616).  The spec says that, to keep a single client from overwhelming a server, a client should open no more than two simultaneous connections to the same server.  What this means is that if your application makes three async calls to the same server, the third call is blocked until one of the first two finishes.  I know that this came as a big surprise to me.

Fortunately, there is a configuration setting that can adjust this number without requiring any coding changes.  In the app.config (or machine.config) file, add a connectionManagement element.  Within connectionManagement, the add element is used to specify an optional server address and the maximum number of connections allowed.  The following example allows up to 10 simultaneous connections to 216.221.85.164 and 40 to any other server.

<configuration>
  <system.net>
    <connectionManagement>
      <add address="216.221.85.164" maxconnection="10" />
      <add address="*" maxconnection="40" />
    </connectionManagement>
  </system.net>
</configuration>

For those who prefer code, you can accomplish the same behavior at run time using the ServicePointManager.DefaultConnectionLimit property.  But I was never big on adding code when I didn't have to.
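
As a minimal sketch of the code-based approach (the URL is a placeholder), raising the limit before any requests are created affects every subsequent HttpWebRequest, including the ones made on your behalf by async web service proxies:

using System;
using System.Net;

public class Program
{
    public static void Main()
    {
        // Raise the per-server connection limit before any requests are made.
        // This mirrors the <connectionManagement> setting shown above.
        ServicePointManager.DefaultConnectionLimit = 10;

        // A placeholder request; async web service calls use the same
        // HttpWebRequest plumbing and therefore the same connection limit.
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://example.com/");
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        Console.WriteLine(response.StatusCode);
        response.Close();
    }
}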

Dogs and cats, living together. Mass hysteria.

For a while now, this has been my catch phrase for situations where stuff is about to hit the fan.  Thanks to a colleague, it turns out that the end really is closer than I thought.

http://tv.ksl.com/index.php?nid=5&sid=135591

For the culturally stifled, the quote is from Ghostbusters.

FileNotFoundException in ASP.NET Web Services

As you work with web services in ASP.NET, it is likely that you will eventually come across a weird file-not-found exception.  The actual text of the exception will look something like “File or assembly name jdiw83ls.dll, or one of its dependencies, was not found".  Now, since it is unlikely that you have a project named jdiw83ls in your solution, the source of this exception and its resolution can be hard to see.

To understand where the message is coming from requires a little detail about how ASP.NET web services move complex objects across their boundary.  If you pass a complex object to a web service, the client side serializes the class into an XML document.  On the receiving side (the server), the inbound XML document is deserialized back into a class.  This knowledge is part of Web Services 101.
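
For reference, here is a minimal sketch of that round trip (the class, service, and method names are all hypothetical): a complex type is turned into XML by the caller and back into an object instance by the time the web method body runs.

using System.Web.Services;

// A hypothetical complex type passed across the web service boundary.
// The XmlSerializer turns it into XML on the client and back into an
// object instance on the server.
public class Customer
{
    public int Id;
    public string Name;
}

public class CustomerService : WebService
{
    [WebMethod]
    public string Register(Customer customer)
    {
        // By the time this code runs, 'customer' has already been
        // deserialized from the inbound XML document.
        return "Registered " + customer.Name;
    }
}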

However, what happens when the class can't be serialized is less obvious.  Under the covers, the XmlSerializer generates and compiles a temporary assembly (with a random name like the jdiw83ls.dll mentioned above) to do the serialization work.  If the generated code fails to compile, the assembly is never created, and the attempt to load it surfaces as the file-not-found exception.  To track down what is actually going wrong, you can tell the serializer to leave its generated source files behind by adding the following switch to the configuration file; the files remain in the temp directory, where the underlying compiler error can be examined.

<configuration>
  <system.diagnostics>
    <switches>
      <add name="XmlSerialization.Compilation" value="4"/>
    </switches>
  </system.diagnostics>
</configuration>


Web Services and Proxy Servers

I'm currently working with a team that is putting the finishing touches on a commercial application that uses web services as a major component of its infrastructure.  As part of putting the client portion through its paces, an environment that included a proxy server was tested.  The initial result would not surprise anyone who has significant experience with web services: a 'connection could not be made' exception was thrown. The reason for this problem is that the method for defining a proxy server to the web service proxy class on the client side is a little unexpected.  And I apologize in advance for having to use the same word (proxy) to describe two separate items (the server and the web service class).

If you are running Internet Explorer, the proxy server settings are configured in the Tools | Internet Options dialog.  Within the dialog, click on the Connections tab and then the LAN Settings button.  Here you can specify whether proxy server settings are required and which server and port are used.  You can also indicate whether local addresses are exempt from the proxy process.  This is fairly well known information, especially if you normally operate from behind a proxy server.

The disconnect occurs when you make calls to web services from this same machine.  Under the covers, the HttpWebRequest class is used to transmit the call to the desired server.  However, even when the proxy server information has been specified as just described, those details are NOT used by HttpWebRequest.  Instead, by default, no proxy is used to handle the request.  This is what causes the connection exception. The first time I ran into this issue, it came as quite a surprise.

There are two possible solutions to this problem.  The first is to assign a WebProxy object to the Proxy property on the web service proxy class.  You can specify the proxy server information programmatically, but hard-coded settings are difficult to modify later.  Instead of creating the WebProxy object on the fly, use the static GetDefaultProxy method on the System.Net.WebProxy class.  This method reads “the nondynamic proxy settings stored by Internet Explorer 5.5” and returns them as a WebProxy object that can be assigned to the Proxy property.  This technique works fine assuming that your client has IE installed and that none of the scripts run at login time have modified the proxy settings.
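
A minimal sketch of this first approach (the proxy class name and the URLs are hypothetical stand-ins for whatever WSDL.EXE generated for your service):

using System.Net;
using System.Web.Services.Protocols;

// A stand-in for the WSDL.EXE-generated proxy class; the real one would
// also carry the generated web method wrappers.
public class OrderService : SoapHttpClientProtocol
{
    public OrderService()
    {
        this.Url = "http://server/OrderService.asmx";   // hypothetical URL
    }
}

public class Program
{
    public static void Main()
    {
        OrderService service = new OrderService();

        // Pick up the static proxy settings stored by Internet Explorer and
        // hand them to the web service proxy class, so HttpWebRequest uses them.
        service.Proxy = WebProxy.GetDefaultProxy();

        // Alternatively, the proxy server can be specified explicitly,
        // although hard-coded settings are awkward to change later:
        // service.Proxy = new WebProxy("http://proxyserver:80", true);
    }
}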

There is an alternative that is a little more flexible: an app.config element called defaultProxy that is part of the system.net configuration section.  Within this element, the details of the proxy server can be specified independently of the IE settings.  The basic format of the defaultProxy element can be seen below.

<system.net>
   <defaultProxy>
      <proxy
         proxyaddress = "http://proxyserver:80"
         bypassonlocal = "true"
      />
   </defaultProxy>
</system.net>

The proxy element also includes an attribute called usesystemdefault which, when set to true, causes the IE settings to be used.  The benefit of this approach is that no additional coding is required.  The settings defined in defaultProxy are used by HttpWebRequest (and, therefore, by web service calls) with no change to the application required.
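
For example, a defaultProxy section that simply defers to the IE settings might look like this (a sketch of the usesystemdefault variation, not required verbatim):

<system.net>
   <defaultProxy>
      <proxy usesystemdefault = "true" />
   </defaultProxy>
</system.net>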

As an added benefit, this element can be added to machine.config, which allows it to be applied to every HttpWebRequest made from the client machine, regardless of the application.  Keep in mind, though, that an uneducated developer could still assign the Proxy property in code, overriding this nice, easily configurable default.

Trip to Ottawa and Requirements Traceability

I just got back from Ottawa, where last night I was speaking to the Ottawa .NET community about Visual Studio Tools for Office (more on that later).

I wasn't surprised by the Grey Cup weekend inflated hotel rates, but I was surprised to find a “2.8% DMF Fee” on my hotel bill (on top of the 15% worth of federal and provincial taxes). Upon request, I was informed that this was a “Destination Marketing Fee” which goes to fund marketing efforts to promote tourism in the area. Various Ontario tourism centers (including Hamilton - go figure?) have been lobbying the provincial government since shortly after 9/11 to allow a tax (a real tax, not a fee) for the same purpose. This past summer, however, the hotels decided that this was going nowhere, so they started collecting (on a voluntary basis) a fee (not a real tax).

Maybe it's just me, but I'm thinking the best way to attract people to your city is not to put a sneaky “DMF Fee” charge on those same people's hotel bill when they come to visit you and hope they don't ask about it. Even worse, because it's a fee charged by the hotel, and not a real tax - guess what - you pay tax on the DMF Fee. Icarumba! It turns out it's a voluntary fee and not all hotels collect it. The front desk staff sensed I was not pleased about being asked to pay for marketing fees on top of my room rate, so they quickly waived the fee. But I wonder how many people willingly pay this?

This all reminds me very much of requirements management and software development. Often people, usually too close to the problem, design features into software that don't meet the requirements of the user. Take, for example, those goofy glyphs on the Lotus Notes login window. What about Clippy? Is that satisfying anybody's requirements - or is it just pissing you off? With all of our best intentions, it is extremely important that we take the time to perform reality checks on what we are building against the requirements of our users.

Now to bring it all home. Do users really want to do their work in a web browser? Browsers are great for wandering and finding stuff, but do they want to see the value of their stock portfolio in a browser? You need to find the best environment for the job your users are trying to accomplish. If somebody is accustomed to using Excel to aggregate a bunch of their financial information, then maybe Visual Studio Tools for Office is the right tool for that job. While writing applications in Excel isn't exactly new, VSTO gives you integration with the .NET Framework, Web Services, and the smart client deployment model, so you can apply all of the professional development skills at your disposal to creating applications with Word and Excel. And don't worry, I have yet to see Clippy show up in a Visual Studio Office project.


Deserializing Objects

Another post about the deserialization of objects after a web service call.  In this particular scenario, the web service method was returning an ArrayList.  On the client side, I was expecting to see a number of different types of objects.  But one of the objects, a custom class, was being returned as an array instead of an instance of the expected class.  And inside that array was an XmlAttribute and two XmlElement objects.  These were, it turned out, the XML nodes that were being returned from the web service method.  It's just that they weren't being converted into the desired object.

After a couple of hours of digging, I found the cause.  Many of you are probably aware of the XmlInclude attribute that is used to decorate web methods to indicate that the schema for a particular class should be included in the WSDL for a web service.  That is not, strictly speaking, its only purpose.  To be precise, it is used by the XmlSerializer to identify the types that are recognized during the serialization and deserialization process. It's this second piece that was the solution to my problem.

Because the web service being developed is intended to be used in an intranet environment, the same assembly was available on both sides of the fence.  To support this, the generated proxy class was modified to reference the shared assembly instead of using the class that is automatically included by WSDL.EXE.  It also meant that we didn't want to perform an Update Web Reference when the web service was modified...it would have meant making the manual changes to the proxy class again (Rich Turner might not remember me from the MVP Summit, but I was the one who was most excited when he described the proxy class generation customization that will be available come Whidbey).  The downside of this manual process is that, when objects of this type were added to the ArrayList being returned, the proxy was not updated with the appropriate XmlInclude attribute.

That's right.  The proxy class gets the same XmlInclude attribute.  WSDL.EXE normally generates the class with the appropriate attributes.  But if you maintain the proxy by hand like we did, it's something you need to be aware of the next time an array of XmlElement objects appears unexpectedly on the client side.
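
As a sketch of what the fix looks like (the service, method, and class names are hypothetical, and the method body is simplified), the hand-maintained proxy method that returns the ArrayList needs the same XmlInclude decoration that WSDL.EXE would have generated:

using System.Collections;
using System.Web.Services.Protocols;
using System.Xml.Serialization;

// A hypothetical class from the shared assembly that ends up inside the
// returned ArrayList.
public class Widget
{
    public string Name;
}

// A stand-in for the WSDL.EXE-generated proxy class after the manual edits.
public class InventoryService : SoapHttpClientProtocol
{
    public InventoryService()
    {
        this.Url = "http://server/InventoryService.asmx";   // hypothetical URL
    }

    // The XmlInclude attribute tells the XmlSerializer that Widget is one of
    // the types it may encounter while deserializing the response.  Without
    // it, Widget instances arrive as arrays of XmlElement objects.
    [SoapDocumentMethod]
    [XmlInclude(typeof(Widget))]
    public ArrayList GetInventory()
    {
        object[] results = this.Invoke("GetInventory", new object[0]);
        return (ArrayList)results[0];
    }
}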

Moving DataSets across Web Service Boundaries

A situation last week had me looking deep into the bowels of how DataSets are marshaled across web service methods.  The situation is as follows.

For programming ease, the typed DataSet that was automatically generated was modified to include a custom object as one of the column types.  If you've ever taken a close look at the code that gets generated when a DataSet is created from an XSD diagram, you'll realize that this is not a difficult thing to accomplish.  And, for the most part, there is little reason to modify the generated code. But that being said, there are times when doing so can make the life of the DataSet user a little easier.  And this isn't really the point of the blog, only the motivation for digging into DataSet marshaling.

The modified DataSet was being returned to the client by a web service method.  When the hacked column on the returned object was accessed on the client side, a StrongTypingException was thrown.  Another look at the DataSet's generated code showed that the cause of this exception was a casting error in the property.  Further examination showed that the name of the class associated with the column was being returned.  As a string.  Naturally, when the literal “This.Namespace.Class” is cast to an object of type This.Namespace.Class, a casting exception is the result.

This behavior begged the question: why wasn't the object being XML serialized along with the rest of the DataSet?  After all, though I haven't mentioned it yet, This.Namespace.Class is completely serializable.  In particular, why was the name of the class being returned?  That didn't make any sense.

At this point, I broke out .NET Reflector and took a look at the details of the DataSet class.  First of all, if a DataSet is returned across a web service boundary, it is not XML serialized in the same manner as other objects.  Instead of converting the contents of the DataSet to XML, a diffgram of the DataSet is generated.  More accurately, the WriteXml method is called with a WriteMode of Diffgram. Within the DataSet class, there is a method called GenerateDiffgram.  This method walks across each of the tables in the DataSet, followed by walks, in turn, through the DataTables and DataRows.  When it gets to an individual column, it loads up a DataStorage object.  More accurately, since DataStorage is an abstract class, it instantiates a DataStorage-derived class, using the column type to pick the appropriate class.  A veritable DataStorage factory.  When the column type is one of the non-intrinsic types, the ObjectStorage class is used.  On the ObjectStorage class, the ObjectToXml method is called.  This is where the unexpected happens. At least, it was unexpected for me.  The ObjectToXml method does not XML serialize the object!

In the case of the ObjectStorage class, what actually happens is that a check is performed to see if the object is a byte array.  If it is, a base64 encoding of the array is returned.  Otherwise, the ToString() method on the object is called. When it comes to re-hydration on the client side, a similar process occurs.  The difference is that the XmlToObject method in the ObjectStorage class instantiates the desired object by passing the string representation of the object (as emitted by the ObjectToXml method) into the constructor.

So what is the point of all this?  First, it explains why the name of the class was appearing in the generated XML.  Unless overridden, the output of ToString() for an arbitrary class is the name of the class.  It also explains why no object was being created on the client side, as the class I was working with didn't have the appropriate constructor.  My solution, which I freely admit is a bit of a hack, is to give the diffgram generation process what it's looking for.  I overrode ToString() to return an XML document containing the values of the class, a poor man's version of the WriteXml method that is part of the IXmlSerializable interface.  I also created a constructor that takes a string as a parameter and repopulates the properties of the class (the ReadXml portion of the process).  Problem solved.  But still, I have to wonder why ToString was used instead of the IXmlSerializable interface methods in the first place.  Here's hoping someone more knowledgeable than me will provide some insight.
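
Here is a minimal sketch of that hack (the class, its properties, and the XML shape are all hypothetical, and no escaping of the values is done): the ToString override emits the state as a small XML document, and the string constructor parses it back.

using System.Xml;

// A hypothetical column type stored in the modified typed DataSet.  The
// diffgram code calls ToString() when writing the column and passes the
// resulting string to a constructor when reading it back, so both halves
// of that round trip are provided here.
public class CustomerInfo
{
    private string name = "";
    private string city = "";

    public CustomerInfo() { }

    // Re-hydration: the ObjectStorage.XmlToObject path hands the emitted
    // string to this constructor on the client side.
    public CustomerInfo(string xml)
    {
        XmlDocument document = new XmlDocument();
        document.LoadXml(xml);
        name = document.DocumentElement.SelectSingleNode("Name").InnerText;
        city = document.DocumentElement.SelectSingleNode("City").InnerText;
    }

    public string Name
    {
        get { return name; }
        set { name = value; }
    }

    public string City
    {
        get { return city; }
        set { city = value; }
    }

    // Serialization: ObjectStorage.ObjectToXml calls ToString(), so emit the
    // state as a small XML document instead of the default type name.
    public override string ToString()
    {
        return "<CustomerInfo><Name>" + name + "</Name><City>" + city +
               "</City></CustomerInfo>";
    }
}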