Sometimes, Little Things Matter–Azure Queues, Poor Performance, Throttling and John Nagle

Sometimes it amazes me how much of a polyglot developers need to be to solve problems. Not really a polyglot, as that actually relates to learning multiple languages, but maybe a poly-tech.

Allow me to set the scenario. A client of ours is using Windows Azure Queue Storage to collect messages from a large number of different sources. Applications of varying types push messages into the queue. On the receiving side, they have a number of worker roles whose job it is to pull messages from the queue and process them. To give you a sense of the scope, there are around 50,000 messages per hour being pushed through the queues, and between 50-200 worker roles processing the messages on the other end.

For the most part, this system had been working fine. Messages come in, messages go out. Sun goes up, sun goes down. Clients are happy and worker roles are happy.

Then a new release was rolled out. And as part of that release, the number of messages that passed through the queues increased. By greater than a factor of two. Still, Azure prides itself on scalability and even at more than 100,000 messages per hour, there shouldn’t be any issues. Right?

Well, there were some issues as it turned out. The first manifested itself as an HTTP status 503. This occurred while attempting to retrieve a message from the queue. The 503 status code indicates that a service is unavailable, which seemed a little odd since not every single attempt to retrieve messages returned that status. Most requests actually succeeded.

Identifying the source of this problem required looking into the logs that are provided automatically by Azure. Well, automatically once you have turned logging on. A very detailed description of what is stored in these logs can be found here. The logs themselves can be found at http://<accountname>.blob.core.windows.net/$logs and what they showed was that the failing requests had a transaction status of ThrottlingError.
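
As an aside, if you’d rather dig through those logs from code than from a browser, a minimal sketch (assuming the 2.x storage client library and the same connection string used later in this post) looks something like this:

CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
CloudBlobClient blobClient = account.CreateCloudBlobClient();
CloudBlobContainer logsContainer = blobClient.GetContainerReference("$logs");

// Flat listing, since the log blobs are organized into virtual directories
// by service and date. Each blob is a delimited text file; the failing
// requests show up with a transaction status of ThrottlingError.
foreach (IListBlobItem item in logsContainer.ListBlobs(null, true))
{
    Console.WriteLine(item.Uri);
}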

Azure Queue Throttling

A single Windows Azure Queue can process up to 2,000 transactions per second. The definition of a transaction is either a Put, a Get or a Delete operation. That last one might catch people by surprise. If you are evaluating the number of operations that you are performing, make sure to include the Delete in your count. This means that a fully processed message actually requires three transactions (because the Get is usually followed by a Delete in a successful dequeue function).

If you crack the 2,000 transactions per second limit, you start to get HTTP 503 status codes. The expectation is that your application will back off on processing when these 503 codes are received. Now the question of how an application backs off is an interesting one. And it’s going to depend a great deal on what your application is doing.
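
What backing off might look like in a worker role is something along these lines. This is just a sketch: the retry count and delays are arbitrary, and it assumes the 2.x storage client library, where a throttled request surfaces as a StorageException carrying the HTTP status code.

CloudQueueMessage message = null;
int attempt = 0;

while (message == null && attempt < 5)
{
    try
    {
        message = queue.GetMessage();
        if (message == null)
        {
            break; // the queue is empty; nothing to back off from
        }
    }
    catch (StorageException ex)
    {
        if (ex.RequestInformation.HttpStatusCode == 503)
        {
            // Throttled: wait a little longer on each attempt before retrying
            attempt++;
            Thread.Sleep(TimeSpan.FromMilliseconds(500 * Math.Pow(2, attempt)));
        }
        else
        {
            throw;
        }
    }
}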

From my perspective, one of the most effective ways to handle this type of throttling is to redesign how the application uses queues. Not a complete redesign, but a shift in the queues being used. The key is found in the idea that the transactions per second limit is on a single queue. So by creating more queues, you can increase the number of transactions per second that your application can handle.

How you want to split your queues up will depend on your application. While there is no ‘right’ way, I have seen a couple of different approaches. The first involved creating queues of different priorities, with messages being pushed into a queue based on their relative priority.

A second way would be to create a queue for each type of message. This has the possibility of greatly increasing the number of queues, but there are a number of benefits. The sender of the message does not have to be aware of the priority assigned to a message. They just submit a message to the queue with no concerns. That makes for a cleaner, simpler client. Control of the priority lies with the worker, which can pick and choose which queues to focus on based on whatever priority logic the application requires. This approach does presume that it’s easier to update the receiving workers than the clients, but you get the idea.
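
To make that a little more concrete, here is a rough sketch of a worker checking a set of queues in priority order. The queue names and the ProcessMessage call are placeholders for whatever your application actually does:

// Hypothetical queue names; split them however makes sense for your messages
string[] queuesByPriority = { "orders-high", "orders-normal", "orders-low" };

foreach (string queueName in queuesByPriority)
{
    CloudQueue queue = queueClient.GetQueueReference(queueName);
    CloudQueueMessage message = queue.GetMessage();

    if (message != null)
    {
        ProcessMessage(message);      // application-specific work
        queue.DeleteMessage(message); // remember, the Delete counts as a transaction too
        break;
    }
}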

Nagling

Now that the 503 messages were dealt with, we had to focus on what we perceived to be poor performance when retrieving messages from the queue. Specifically, we found (when we put a stopwatch around the GetMessage call) that it was occasionally taking over 1000 milliseconds to retrieve the message, and the median seemed to be someplace in the 400-500 millisecond range. This is an order of magnitude over the 50 milliseconds we were expecting.

The source of this particular problem was identified in conversation with a Microsoft support person. And when it was mentioned, our collective response was ‘of course’. The requests were Nagling.

Some background might be required. Unless you are a serious poly-tech.

Nagle’s Algorithm is a mechanism by which the efficiency of TCP/IP communication can be improved. The problem Nagle addresses is when the data in the packets being sent is small. In that case, the size of the TCP/IP headers might actually be a very large percentage of the data being transmitted. The TCP and IP headers together are 40 bytes in size. If the payload is 5 or 10 bytes, that is a lot of overhead.

Nagle's algorithm combines these small outgoing messages into a single, larger message. The algorithm actually prescribes that as long as there is a sent packet for which the sender has received no acknowledgement from the recipient, the sender should keep combining payloads until a full packet’s worth is ready to be sent.

All of this is well and good. Until a sender using Nagle interacts with a recipient using TCP Delayed Acknowledgements. With delayed acknowledgements, the recipient may delay the ACK for up to 500ms to give itself a chance to include the response with the ACK packet. Again, the idea is to increase the efficiency of TCP by reducing the number of ‘suboptimal’ packets.

Now consider how these two protocols work in conjunction (actually, opposition) with one another. Let’s say Fred is sending data to Barney. At the very end of the transmission, Fred has less than a complete packet’s worth of data to send. As specified in Nagle’s Algorithm, Fred will wait until it receives an ACK from Barney before it sends the last packet of data. After all, Fred might discover more information that needs to be sent. At the same time, Barney has implemented delayed acknowledgements. So Barney waits up to 500ms before sending an ACK in case the response can be sent back along with the ACK.

Both sides of the transmission end up waiting for the other. It is only the delayed acknowledgement timeout that breaks this impasse. And the result is the potential for occasionally waiting up to 500ms for a response to a GetMessage call. Sound familiar? That’s because it was pretty much exactly the problem we were facing.

There are two solutions to this problem. The first, which is completely unrealistic, is to turn off TCP delayed acknowledgments in Azure. Yeah, right. The second is much, much easier. Disable Nagle’s Algorithm in the call to GetMessage. In Azure, Nagle is enabled by default. To turn it off, you need to use the ServicePointManager .NET class.

CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
ServicePoint queueServicePoint =
    ServicePointManager.FindServicePoint(account.QueueEndpoint);
queueServicePoint.UseNagleAlgorithm = false;
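
One caveat, and this is my understanding rather than anything official: the ServicePoint settings only affect connections made after they are applied, so disable Nagle before you create the queue client and start pulling messages. If you would rather just turn it off for the whole process, the static property works too:

// Disable Nagle for all subsequent connections in this process.
// Call this early, for example in the worker role's OnStart.
ServicePointManager.UseNagleAlgorithm = false;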

So there you go. In order to be able to figure out why a couple of issues arose within Azure Queue Storage, you needed to be aware of HTTP status codes, the throttling limitations of Azure, queue design, TCP and John Nagle. As I said at the start, you need to be a poly-tech. And special thanks to Keith Hassen, who discovered much of what appears in this blog post while in the crucible of an escalating production problem.

Taking the SuggestedValues rule one step further

Have you ever created a custom field on a TFS work item that you wanted to be free-form entry, but also wanted to save the entries so the next person can select from the previous values?

You could write a custom control to do this as well. However, having a service in the background to manage this at the server is much easier, and it does not have to be installed on each client. You first need to create a Web Service and subscribe it to TFS events using BisSubscribe.exe. There is plenty of information out there to show you the mechanics of this. Check out Ewald Hofman’s blog for the details on creating a web service to subscribe to TFS. It’s an old post but still useful, easy to understand and follow.

As an example, let’s assume the field we want to do this for is called Requested By, where users can select from the Developers or Business Users security groups, or enter a name that is not a member of a group in TFS at all. To solve this problem we created a GlobalList called RequestedBy. Then we added a SuggestedValues rule to the field that included the Developers and Business Users groups, as well as the RequestedBy GlobalList.

The field definition looks like this.

<FIELD name="Requested By" refname="RequestedBy" type="String">
    <SUGGESTEDVALUES>
        <LISTITEM value="[Project]\Developers" />
        <LISTITEM value="[Project]\Business Users" />
        <GLOBALLIST name="RequestedBy" />
    </SUGGESTEDVALUES>
    <REQUIRED />
</FIELD>

 

If the user enters a value into the field that is not from one of the TFS groups or the global list, the web service kicks in and adds the value to the global list. So the next user that enters that name will find it in the list and is less likely to spell the name differently than the first person.

And here is the code in the web service that accomplishes that task.

public void AddToGlobalList(WorkItemStore workItemStore, string globalList, string value)
{
    if (!string.IsNullOrWhiteSpace(value))
    {
        // Export the current global lists as an XML document
        var globalLists = workItemStore.ExportGlobalLists();

        // Is the value already in the target global list?
        var node = globalLists.SelectSingleNode(
            string.Format("//GLOBALLIST[@name='{0}']/LISTITEM[@value='{1}']", globalList, value));

        if (node == null)
        {
            // Not there yet, so find the global list itself
            node = globalLists.SelectSingleNode(
                string.Format("//GLOBALLIST[@name='{0}']", globalList));

            if (node != null)
            {
                // Append a new LISTITEM and import the updated lists back into TFS
                var valueAttr = globalLists.CreateAttribute("value");
                valueAttr.Value = value;
                var child = globalLists.CreateElement("LISTITEM");
                child.Attributes.Append(valueAttr);
                node.AppendChild(child);
                workItemStore.ImportGlobalLists(globalLists.DocumentElement);
            }
        }
    }
}
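
For completeness, here is a rough sketch of the handler that BisSubscribe points at. The subscription plumbing itself is exactly what Ewald Hofman describes, so it is omitted here, and the XPath into the WorkItemChangedEvent payload is an assumption on my part; inspect a real event to confirm where the new value of your field actually shows up.

public void Notify(string eventXml, string tfsIdentityXml)
{
    var doc = new XmlDocument();
    doc.LoadXml(eventXml);

    // Assumed shape of the event payload: the changed field is reported
    // with its reference name and its new value.
    var valueNode = doc.SelectSingleNode(
        "//Field[ReferenceName='RequestedBy']/NewValue");

    if (valueNode != null)
    {
        // The collection URL is a placeholder; use your own server and collection.
        var collection = new TfsTeamProjectCollection(
            new Uri("http://tfs:8080/tfs/DefaultCollection"));
        var workItemStore = collection.GetService<WorkItemStore>();

        AddToGlobalList(workItemStore, "RequestedBy", valueNode.InnerText);
    }
}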

Microsoft Test Manager – Test Impact

I just had to share this, Test Impact at its finest.

We are working with a client on a brand new application, and Test Impact is coming through like an angel on our shoulders. The recommended tests are really our regression tests, no question about it.

This is a great example of the importance of automated unit tests for legacy code. The time spent up front can save time and money down the road. There are different ways to get test impact working like this for any project. It will take initiative and creative thinking.

Samples of what we are getting in Test Impact.

Build Test Impact Summary:

Click Code Changes and see this:


Example of the Compare Changes:

Recommended Test Cases in Test Manager:


Testa Smile

Microsoft: watch a live stand-up and parking lot meeting with the TFS Agile product team

See how Microsoft’s TFS Agile team does their scrum stand-up and parking lot meetings. The short video is the stand-up; the long video is the parking lot meeting.

I like the use of the Agile Board; it is a nice visual that is missing from the standard stand-up meetings in most companies.

Scrum Stand Up

Another interesting video on using business value in a scrum project

Business Value in Scrum

Testa Smile

IIS Express Default Settings

On occasion when I open a Web application in Visual Studio, I receive a message that is similar to the following:

So that the search bots can find the text, the pertinent portion reads “The following settings were applied to the project based on settings for the local instance of IIS Express”.

The message basically says that the settings on the Web application with respect to authentication don’t match the default settings in your local IIS Express. So Visual Studio, to make sure that the project can be deployed, changes the Web application settings. Now there are many cases where this is not desirable and the message nicely tells you how to change it back. What is hard to find out is how to change the default settings for IIS Express.

If you go through the “normal” steps, your first thought might be to check out IIS Express itself. But even if you change the settings for the Default Web Site (or any other Web Site you have defined), that’s not good enough.

Instead, you need to modify the ApplicationHost.Config file. You will find it in your My Documents directory under IISExpress/Config. In that file, there is an <authentication> section that determines whether each of the different authentication providers is enabled or disabled. If you modify this file to match your Web application’s requirements, you will no longer get that annoying dialog box popping up every time you load your Solution. Of course, you *might* have to change it for different projects, but that’s just the way it goes.
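
For reference, the section you are looking for in ApplicationHost.Config is shaped roughly like the following. The enabled values below are just an example; set them to whatever your Web application expects.

<system.webServer>
  <security>
    <authentication>
      <anonymousAuthentication enabled="true" userName="" />
      <basicAuthentication enabled="false" />
      <windowsAuthentication enabled="false">
        <providers>
          <add value="Negotiate" />
          <add value="NTLM" />
        </providers>
      </windowsAuthentication>
    </authentication>
  </security>
</system.webServer>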

DevTeach Redux

Some of you might not be aware of it, but one of the premier development conferences is coming to Toronto in a few weeks (May 27-31). That conference would be DevTeach.

For the past 10 years, DevTeach has been bringing some of the best speakers from North America to Canada to talk about the thing that we’re most passionate about: development. You will hear topics covering a wide range of subjects, from Agile to Cloud, Mobile to Web development, SharePoint to SQL Server. If you are interested in hearing some of the most engaging and knowledgeable speakers, then DevTeach is the place to be.

In an earlier blog post, I mentioned that ObjectSharp will be out in force for the conference. Since then, we have added more speakers to the roster. Max Yermakhanov will be speaking on Hybrid Cloud and Daniel Crenna will be expounding on globalization in Web applications. Max is ObjectSharp’s resident IT guru. He is responsible for the fact that ObjectSharp’s infrastructure is as cloud-y as it can be. So he brings with him real-world experience related to seamlessly weaving together Azure and on-premises infrastructure.

Daniel is relatively new to ObjectSharp but not to the world of .NET. A former Microsoft MVP, he is responsible for a number of open source projects, including TweetSharp. His session on globalization in Web development will touch on the stuff that only comes up when you’ve gone through the crucible of actual implementation. And being in Canada, it comes up quite frequently.

ObjectSharp has been a sponsor and champion for DevTeach since its very early days. This year, the timing of the conference would have conflicted with our annual At the Movies event. So we put off At the Movies for a year. Because that’s how good this conference is.

So if you have been to one of our At The Movies events in the past, then I strongly suggest you look at DevTeach instead. Don’t worry…we’ll go back to doing At The Movies next year (it’s too much fun for us to stop). But until then, DevTeach is the place you should be at to hear the latest and greatest in development talks.

The New Model for Office and SharePoint Apps

As I write this blog entry, I’m flying to Atlanta to give the last of 13 seminars on the new App model that is available for Office 2013 and SharePoint 2013. I have taught this material to people all over North America, as well as in Paris. As a result, I have talked to a large number of people not only about the model, but also about their plans for it. This gives me a fairly unique perspective into how people are taking the new model, as well as how it will be adopted over the next 6-9 months.

What is “The New App Model”?

In a nutshell, the new app model is conceptually similar for both Office 2013 and SharePoint 2013. The basic idea is that the application no longer needs to be installed on the client’s machine (or in the SharePoint farm). There is no assembly that is deployed onto the user’s system. Instead, a manifest, in the form of an XML file, is made accessible to the client software. This manifest file includes, among other things, a URL where the application lives. And that application interacts with your client software (whether it be Office or SharePoint) through a combination of JavaScript and server-side code. That’s right, the App Model allows you to create Web sites that are hosted any place you want, but appear to run inside of Office/SharePoint. And, by the way, the Web site can be constructed using any technology you want. There is no requirement that the site use the Microsoft stack. If you’d rather create Web applications using LAMP, that’s all fine and good in this model.

What’s the Benefit?

Well, for the Apps for Office model, the benefit is that you don’t have to wrestle with VSTO or MSIs to be able to deploy your applications. There are (more or less) no administrative permissions required to install an application. And there is now an Office Store where users can search for and install your application. So your ability to reach more potential clients is much higher.

For the Apps for SharePoint model, there is no need for sandbox solutions. This is not to say that you still can’t write sandbox (or farm) solutions. You can. And they still have all of the same limitations that those applications had in SharePoint 2010. But the guidance is that they should no longer be needed. The client-side object model (CSOM) has been expanded to the point where farm solutions are probably not required. And if you are working in a shared hosting environment (that’s everyone in SharePoint Online, as well as a number of clients of ours), then you can be freed from the limitations of the sandbox.

And What are the Problems?

The biggest problem is that, because the model is completely new, there is no compatibility with older versions of the products. This model will not work with Office 2010 or SharePoint 2010. At all. No way, no how. If you understand the details of what’s going on, you’ll understand why this limitation exists. But the practical impact is that your only audience for any app you write and want to sell is users of the new versions. In the corporate world, this could be a few years off. For SharePoint Online, it’s a little closer, as the back-end functionality is in the process of being converted, with the user interface to be upgraded over the next 12-18 months.

Along with the need to have users on the latest version, the capabilities of the interface into the software seem to be a little lacking in certain areas. I found this to be particularly true in the Apps for Office model. A number of people had interesting ideas for Word or Excel applications and their first choice for a user experience ran aground on the shoals of missing capabilities. For instance, there is no way to retrieve or modify the format for a particular cell. Nor is there the ability to have the app set the currently selected cell. Is this a critical lack of functionality? Possibly. But I also know a number of people who are on the development team and they are eager to address holes in the functionality, especially if there is a compelling story around the request.

Is It Worth Using?

I think the quick answer is ‘yes’. Now it could be that I’m biased…I have been teaching this material for a while. But I like to think that talking to people about the model, hearing what they want to do and working through how it might be done has given me perspective. And I don’t have a history of liking a technology just because I teach it.

Again, dividing between Office and SharePoint, I believe that the app model for SharePoint will be transformative. In particular, if you have a Web-based application that has nothing whatsoever to do with SharePoint, it is simple to integrate the application with SharePoint. And put it into the SharePoint Store, increasing its visibility. The model also requires that people who create SharePoint applications rethink their approach. Instead of being forced to utilize SharePoint as a data store (a task for which it is not particularly well suited), you can use a real database. Yea!!!!

The app model for Office is a good one in cases where it fits. At the moment, that seems to be helper applications. Dictionaries, encyclopedias, image searching. Maybe an application that can perform calculations based on the data in the document. But at the moment, there do seem to be some pieces of functionality that I’d like to see put in place. And the model is so different from how users typically use Word/Excel that I can see it taking a little bit of time to see mass acceptance.

If you have any experience with the app model, either with Office or SharePoint, I’d like to hear how it went. What type of applications have you created? Was there missing functionality that you had to work around? I’m done with the teaching tour, but I’d still like to keep in touch with how people use the model.

ObjectSharp at DevTeach

If you are an aficionado of conferences, then odds are pretty good that you are already aware that DevTeach is coming to Toronto at the end of May (May 27-31, to be precise). If you have not attended, heard of or thought about DevTeach, then you’re in for a treat.

DevTeach is a conference. For developers. By developers. If you want to learn about the latest technology, then DevTeach is the place to be. This is true whether you are interested in developing apps, using and administering SQL Server, or working with the latest mobile technology. You will hear from industry experts, not only people from all over North America, but also locals you can chat with afterwards. And when it comes to networking, there are few conferences that offer the opportunity to hang with as many of the best and brightest.

At ObjectSharp, we are proud to be a supporter of DevTeach. And we are lucky enough to have a list of associates who are knowledgeable enough to be able (and generous enough to be willing) to share their insights and experience with others. The following is a list of the sessions that are led by one of our own. If this list isn’t enough to entice you, then check out the full schedule here. Or you can just trust me and sign up here. Take advantage of the fact that all of this talent is within your reach to hear from and talk to.

Colin Bowern

Designing with ASP.NET MVC and Web API – Tues, May 28

The State of (Corporate) HTML5 – Wed, May 29

Managing a Cross-Platform Code Base – Wed, May 29

Handling Identity Management for SaaS Apps – Thurs, May 30

Bruce Johnson

var WebDeveloper = new OfficeAndSharePointAppDev; – Wed, May 29

Using Hybrid Solutions in Windows Azure – Thurs, May 30

Advanced Windows Phone 8 (full day, pre-conference session) – Mon, May 27

Atley Hunter

Building Mobile Experiences that Don't Suck – Wed, May 29

HackTeach – Wed, May 29

MTM Test Suite – Add Requirement

One option in Test Manager for a Test Suite is “Add Requirement”, which adds the Requirement ID and its title as a test suite. Example below:


What happens if at a later date someone goes in and changes the title of the PBI?

First, the change to the work item’s title does not show in your Test Suite. What has been added to Test Manager is an object on its own, not a link to the actual work item. Think of it as a folder for tests related to your requirement.

What can I do?

There is the delete/add option; however, you will lose all test results associated with your test suite. When you delete a suite, all test points contained within the suite are deleted. I would only use this when test execution has not yet happened for the test suite.

The rename option on a test suite can be used: select the test suite > right-click > Rename. If you use this option, I would copy the title from the actual work item so the titles match.

How do you know there has been a change?

Often you don’t, unless someone tells you, you stumble across it by accident, or you create a query to compare against. I’d like an alert that tells me when a Requirement title or Iteration Path has changed, since both of these can affect the Test Plans and their Suites.

This happens in both MTM2010 and MTM2012.

Testa Smile

 

Welcome to the Author Fold

For most people, the idea of writing a book is a daunting one. There is little that scares people more than a blank page and the need to put 10,000 words onto it in the next 30 days or so. I believe that death and public speaking might be higher on the list, but only by a little. So the gumption it takes to put together a book proposal, submit it to a publisher, write all of those words, suffer through editors and technical editors making comments, and finally get to the point where it is published is a big deal. For that reason, I’d like to celebrate two of my ObjectSharp colleagues, Lori Lalonde (@LoriBLalonde) and David Totzke (@VisualDragon), who now have a publication date for their book, Windows Phone 8 Recipes.

I know that it’s a thankless journey, but allow me to offer up my appreciation for your contribution to the world of technical literature. As for the rest of you, you can show your appreciation by going here and buying a copy. It is currently on pre-order with a scheduled publication date of June 26th, but you can buy an alpha copy of the book and get access to the wonders that are inside right now.