And there is no escape...
I ran into an interesting issue with JSON.NET over the weekend. Specifically, while I was serializing an object, it would fail silently. No exception was raised (or could even be trapped with a try-catch). However, the call to Serialize did not return and the application terminated.
The specific situation in which the call to Serialize was being made was the following:
Task creationTask = new Task(() =>
_customers = new List<Customer>();
// Do stuff to build the list of customers
Now the actual call to JsonConvert.SerializeObject is found in the serializeCustomer method. There is nothing special there, other than that it is the method that actually fails. But the reason for the failure is rooted in the code snippet shown above.
This was the code as originally written. It was part of a WPF application that collected the parameters, and it worked just fine. However, the business requirements changed slightly and I had to change the WPF application to a console app where the parameters are taken from the command line. No problem. However, while there was a good reason to run the task in the background with a WPF application (so that the application doesn’t appear to be hung), that is not a requirement for a console app. And to minimize the code change as I moved from WPF to Console, I changed a single line of code:
Now the call to JsonConvert.SerializeObject in the serializeCustomer method would fail. Catastrophically. And silently. Not really much of anything available to help with the debugging.
Based on the change, it appears that the problem is related to threading. Although it might not be immediately obvious, the ContinueWith method results in the creation of a Task object. The work represented by this object is executed on a separate thread from the UI thread, so any issue that relates to cross-thread execution has the potential to cause a problem. I’m not sure, but I suspect that was the issue in this case. When I changed the code to be as follows, the problem went away.
Task creationTask = new Task(() =>
_customers = new List<Customer>();
// Do stuff to build the list of customers
Now could I have eliminated the need for the Task object completely? Yes. And in retrospect, I probably should have. However, if I had, I wouldn’t have had the material necessary to write this blog post. And the knowledge of how JsonConvert.SerializeObject operates when using Tasks was worthwhile to have, even if it was learned accidentally.
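Since the snippets in this post are abbreviated, here is a minimal sketch of the general pattern, not the original code or its fix. It assumes a console app, and the names Customer, SerializeCustomers and BuildAndSerialize are hypothetical. The JSON.NET call is replaced by a simple stand-in so the sketch has no dependencies. The point it illustrates is that ContinueWith returns its own Task, and that reading Result on that Task both keeps the process alive until the work finishes and surfaces any failure as an AggregateException instead of losing it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Customer
{
    public string Name { get; set; }
}

public static class CustomerExport
{
    // Hypothetical stand-in for the post's serializeCustomer method.
    // The real method calls JSON.NET; that call is omitted here.
    public static string SerializeCustomers(List<Customer> customers)
    {
        var names = new List<string>();
        foreach (var customer in customers)
        {
            names.Add(customer.Name);
        }
        return string.Join(",", names);
    }

    public static string BuildAndSerialize()
    {
        List<Customer> customers = null;

        Task creationTask = new Task(() =>
        {
            customers = new List<Customer>();
            customers.Add(new Customer { Name = "Contoso" });
        });

        // ContinueWith returns a new Task that runs after creationTask completes.
        Task<string> serializeTask = creationTask.ContinueWith(
            antecedent => SerializeCustomers(customers));

        creationTask.Start();

        try
        {
            // Blocking on Result waits for the continuation to finish and
            // rethrows any failure wrapped in an AggregateException.
            return serializeTask.Result;
        }
        catch (AggregateException ex)
        {
            Console.Error.WriteLine(ex.Flatten().InnerException);
            throw;
        }
    }
}
```

Calling CustomerExport.BuildAndSerialize() from Main blocks until both tasks are done, which is usually what a console app wants.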
As 2013 came to a close, I put the wraps on my latest book (Professional Visual Studio 2013). While I’m not quite *done* done, all that’s left is to review the galleys of the chapters as they come back from the editor. Work, yes. But not nearly as demanding as everything that has gone before.
As well, since I’ve now published four books in the last 25 months, I’m a little burned out on writing books. I’m sure that I’ll get involved in another project at some point in the future, but for at least the next 6 months, I’m saying ‘no’ to any offer that involves writing to a deadline.
Yet, the need to write still burns strongly in me. I really can’t *not* write. So what that means is that my blogging will inevitably increase. Be warned.
To start the new year, I thought I’d get into an area that I’m moderately familiar with: Cloud Computing. And for this particular blog, it being the start of the year and all, a prediction column seemed most appropriate. So here we go with 5 Trends in Cloud Computing for 2014.
Using the Cloud to Innovate
One of the unintended consequences of the cloud is that it sits at the intersection of the three big current technology movements: mobile, social and big data.
- Mobile is the biggest trend so far this century and is becoming as significant as the Internet itself did 20 years ago. The commoditization of the service is well underway and smartphones need to be considered in almost every technology project.
- Social is not at the leading edge of mind share any more. And definitely not to the same level it was a few years ago. It is quickly becoming a given that social, of some form or another, needs to be a part of every new app.
- Big Data is the newest of these three trends. Not that it hasn’t been around for a while. But the tools are now available for smaller companies to be able to more easily capture and analyze large volumes of data that previously would have simply been ignored.
What do these three trends have in common? They all use (or can use) the cloud as the delivery mechanism for their services. Most companies wouldn’t think of developing a turnkey big data environment. Instead, they would use a Hadoop instance running in Azure (or AWS or pick your favorite hosting environment). And why build an infrastructure to support mobile apps until you really need to roll your own? Instead, use the REST-based API available through Windows Azure Mobile Services. It has become very easy to use the cloud-available services as the jumping off point for your innovation across all three of these dimensions. And by allowing innovators to focus more on their creations and less on the underlying infrastructure, the pace and quality of the innovations will only increase.
Hybrid-ization of the Cloud
Much as some might want (and most don’t), you cannot move every piece of your infrastructure to the cloud. Inevitably, there is some piece of hardware that needs to be running locally in order to deliver value. But more importantly, why would you want to rip out and migrate functionality that already works if such a move provides little or no practical benefit? Instead, the focus of your IT group should be on delivering new value using cloud functionality, transitioning older functions to the cloud only on an as-needed basis.
What this does mean is that most companies are going to need to run a hybrid cloud environment. Some functions will stay on-premises. Others will move to the cloud. It will be up to IT to make this work seamlessly. There are already a number of features available through Azure AD to assist with authentication and authorization. But as you go through the various components of your network, there will be many opportunities to add to the hybrid portion of your infrastructure. And you should take them. The technology has gotten to the point that *most* issues related to creating a hybrid infrastructure have been addressed. Take advantage of this to make the most of the interplay between the two environments.
Transition from Capitalization to Expenses
For most people, the idea of using the cloud in their business environment is driven by the speed with which technology can be deployed. Instead of needing to wade through a budget approval process for a new blade server, followed by weeks of waiting for delivery, you can spin up the equivalent functionality in a matter of minutes.
But while that capability is indeed quite awesome, for business people it’s not really the big win. Instead, it’s the ability to move the cost associated with infrastructure from the balance sheet to the income statement. At the same time as this (generally) beneficial move, the need to over-purchase capacity is removed. Cloud computing allows you to add capacity on an as-needed basis. While it’s not quite like turning on a light switch, it’s definitely less onerous than the multi-week purchase/install/deploy cycle that is standard with physical hardware. One can question whether the cost of ‘renting’ space in the cloud is more or less expensive than the physical counterpart, but the difference in how the costs are accounted for makes more of a difference than you think.
So how does this impact you in 2014? More and more, you will need to be aware of the costing models that are being used by your cloud computing provider. While the costs have not yet become as complicated as, say, the labyrinth of Microsoft software licensing, they are getting close. Keep a close eye on how the various providers are charging you and what you are paying for, so that as you move to a cloud environment, you can make the most appropriate choices.
In order to be successful, your application needs to leverage connections between a wide variety of participants: users, partners, suppliers, employees. This is the ‘network’ for your organization. And, by extension, the applications that are used within your organization.
If you want to maximize the interconnectedness of this network, and allow the participants to take full advantage of your application, you need to provide two fundamental functions: a robust and useable API and the ability to scale that API as needed.
In most cases, a REST-based API is the way to go. And you will see in the coming 12 months an increased awareness of what makes a REST API ‘good’. This is not nearly as simple as it sounds. Or, possibly, as it should be. While some functionality is easy to design and implement, other functionality is not. And learning to tell the two apart comes either from trial and error or from finding someone who has already been through the process.
As for scalability, a properly designed API combined with cloud deployment can come close to giving you that for free. But note the critical condition ‘properly designed’. When it comes to API functionality, it is almost entirely about the up-front design. So spend the necessary effort to make sure that it works as you need it to. Or, more importantly, as the clients of your API need it to.
For the longest time, real-time was the goal. Wouldn’t it be nice to see what the user is doing on your Web site at the moment they are doing it? Well, that time is now in the past. If you’re trying to stay ahead of the curve, you need to look ahead to the user’s next actions.
This is not the same as Big Data, although Big Data helps. It’s the ability to take the information (not just the data) extracted from Big Data and use it to modify your business processes. That could range from something as simple as changing the data that appears on the screen to modifying the workflow in your production line. And you’ll shortly start to see tools aimed at helping you understand and take advantage of ‘future’ knowledge.
So there you are. Five trends that are going to define cloud computing over the next 12 months, ranging from well on the way to slightly more speculative. But all of them are (or should be) applicable to your company. And the future of how you create and deploy applications.
I really need to stop ignoring my blog. I have lots of stuff to post, but I just keep forgetting to do it. Life gets so busy. Well, it’s a new year and I am going to try to post something at least every two weeks. I want to say every week but I can’t see that happening.
Since I run the Toronto ALM user group I should at least let people know what is coming up.
In January we had the last Canadian speaking appearance of Colin Bowern when he gave a great presentation sharing his thoughts on this topic: As with many things in software engineering there is rarely an answer that is always right, all the time – except locking your workstation when you walk away from your desk, no excuses there. In the ALM space we have heavily integrated stacks like Microsoft TFS, Rational Team Concert, CollabNet TeamForge and Atlassian’s toolset, but we also have standalone tools that are focused on being the best at one thing alone. In this session we’ll walk through a particular stack of tools that can be used in .NET shops that have investments in other platforms such as PSAKE, TeamCity, xUnit, SpecFlow, Node and PowerShell. But bigger than this toolset we will compare and contrast the integrated and best-in-class approaches to make sure we understand the tool, the myth and the legends behind each. Bring your experiences and let’s have a rich discussion that will broaden our horizons on what is possible to help teams reduce friction and ship value faster.
Thanks again, Colin. Your presentation was informative and very well received by the group.
In February we have Max Yermakhanov showing us the new Release Management Solution that comes in TFS 2013. This is Microsoft's newest acquisition from Canada’s own InCycle. Are you looking for a way to track your release process and start automating your deployments for repeatable success? Are you wanting to have automation that is the same across development, test, and even production environments? If so, come by and learn about release management tooling in TFS 2013.
Hope to see you at the meeting in February.
Recently we, ObjectSharpees, had an internal email discussion on the subject of authoring technical books. This blog post provides the feedback from many of the current @TeamObjectSharp book authors.
The questions asked in my starter email were:
- Was it worth authoring a book?
- How much time did you have to spend on it?
- Is it hard to get publishers to publish your book?
- Other feedback on your experience
- Are you interested in co-authoring books?
- What is the Book Authoring Process?
Was it worth authoring a book?
@LACanuck: Yes, but only because I get intrinsic enjoyment from writing. If writing is a chore, then I can pretty much guarantee it won’t be worth it.
@VisualDragon: Intrinsically enjoying writing, be it blog or otherwise, isn’t really the same as writing a technical publication that can’t be edited once it “goes to press”. When you sign off on the galley proofs, that’s pretty much it. Get it wrong and it’s out there with your name on it forever - or at least until the first revision.
@MichaelSahota: YES! My goal was to help people in my community and improve it. I succeeded.
How much time did you have to spend on it?
@LACanuck: At the pace that I write, it takes about one hour per page, all in. By ‘all in’, I mean the initial planning, research and writing, along with editing, creating examples, responding to the technical and copy editors’ comments and reviewing the galley proofs.
@MichaelSahota: I had a lot of the book written as blog posts. I thought I would cat it together and be done, but it took a lot more effort than I thought to clean it up to make it coherent.
I RECOMMEND you start publishing your book via blog posts - then you get feedback on your ideas and content. If you want to have a valuable book, I highly recommend it.
Is it hard to get publishers to publish your book?
@LACanuck: I can’t answer this question directly. To be honest, I have publishers asking me to write for them because I stick to the deadlines that I commit to and apparently that is not very common in the industry. But if you do have an idea for a book, I’d be happy to introduce you to my editor, who is actually the Wiley acquisitions editor.
@MichaelSahota: My understanding is that most publishers are not that helpful. There is a really cool new approach to consider called "Happy Melly". But you have a lot of content to write before even worrying about publishing.
Other feedback on your experience
@LACanuck: It’s harder and more onerous than you think. I’ll repeat what I said earlier: if you don’t *love* to write, you are going to find it hard slogging. If writing comes relatively easily, then it’s a fun process, albeit a grueling one. And the money is not worth it. I get an advance for all of the books that I write (about half make the advance back), but it works out to be around $15-20 an hour for my efforts.
@loriblalonde: I think it’s great that you’re interested in writing a book! If you’re passionate about a particular technology, then by all means go for it. I also think it’s a good approach to partner up with others rather than take on your first book solo. I was approached by Apress to write a Windows Phone 8 book, but I do know you can approach publishers to submit your book proposal:
@VisualDragon: The research isn’t always as much fun as you might think. Depending on the technology, there may be very little documentation available to you and some of it is almost certainly going to be wrong. No surprise there. You will likely have to piece together information from many sources to get the whole picture. I used MSDN, blogs, and Stack Overflow. Some things in the Azure portal I pretty much had to document by doing. I also bought the preview copy of Windows Phone 8 Internals which is being written by members of the actual Windows Phone team and was wrong on several points. I picked up Bruce’s book on Windows Azure Mobile Services which, at least in all important aspects, was correct, however some of the information in his book was already out of date and it was only published a couple of months before I bought it due to the current state of flux that WAMS is in.
I seemed to draw the short straw with my particular chapters and ended up having to do a lot of experimentation to actually figure out how it worked for real. Lori had to pick up my slack and wrote 9 chapters of 14 instead of 7.
For example, the documentation for the new Map control in WP8 says that many of the properties now support data binding. I wire it up, and nothing happens. I check other sources to confirm that it does, confirm that those properties are indeed DependencyProperties, and think maybe it’s a timing issue with data binding due to the different object lifetime management of WP8, etc. Finally I conclude that it’s broken. My guess is that while the properties are now DependencyProperties, the code to do what it’s supposed to do when those properties change isn’t implemented.
I also managed to uncover a bug in the actual framework that’s likely been there since very early on, probably since WP7.
No. This is not “fun”. It sounds like fun. As a hobby or for professional growth this might be an interesting exercise, but against a deadline, not so much. And at least for me personally, I feel a deep responsibility to the material and to the reader. I am not content to just write how it’s supposed to work. I need to share how it actually works.
I think the best piece of practical advice I could give you is to create the sample application or code first. Make sure it actually does work the way the documentation says it does. I had to do a couple of not insignificant rewrites because what I wrote wasn’t how it was in reality.
@danielcrenna: Listen to Bruce. I have written two books and it nearly killed me both times. Stick to small, risk free endeavours like licensed games.
1. Low effort to sustainable value ratio (writing a tech book for the bargain bin vs. a perennial topic)
2. Overwork (might be unique but I had to write 400 pages in two months for one of my books)
3. Personal impedance mismatch (in hindsight I wrote for the wrong reasons and I had low passion for the subject matter, and I mean I didn't have an abiding, irrational love for the subject matter to sustain me through the doldrums).
4. Wrox has ancient tools for writing a book, or did at the time. Having good diffs (git) and typesetting (LaTeX and Markdown) to keep your writing friction free is a lot more valuable than you might realize.
5. By the time I was asked to write a book I didn't need it for resume padding (which is the only reason you'd write a book if passion is low, it's an ego thing since it's certainly not a money thing in our space).
All that said I have had good responsiveness with publishers. I find just asking is the best strategy. I often review proposals for publishers and many people are first time authors. Best of luck, I feel it's important to be honest about the experience because everyone I talked to was honest with me. And I know if you really want to write a book nobody will be able to stop you. That in a way is the only real prerequisite you need.
@MichaelSahota: Follow your dreams. If this is what turns your crank then don't let anyone stop you. On the other hand, I also recommend understanding why you want to do this. Most of us including me are running scripts from parents/society/early-childhood experiences. Following scripts is different from following dreams.
Are you interested in co-authoring books?
@LACanuck: Not now. I have written four books in the last 26 months, so it’s time for me to take a break.
What is the Book Authoring Process?
The process typically goes as follows, at least as far as Microsoft Press and Wiley go.
1. You create a proposal for the book. This includes a description of the contents of the book, who the audience is, what other books on a similar topic might be, a list of the chapters (including a paragraph on what is covered and an estimate of the number of pages), and other details and competitive analysis on the market. Consider this to be a summary of what the publisher is ‘buying’, so its goal is to convince them that the book is likely to make them money.
2. If the proposal is accepted, then you will work with your editor to create a schedule for the chapters that you laid out in the proposal. Sometimes the schedule is tight, sometimes it’s not. The one that I’m on the verge of finishing is actually quite tight. The previous couple were much looser. In general, you should expect the schedule to produce a chapter about every 10-14 days. This number is a combination of comfort for you and comfort for the publisher (I have picked up chapters and entire books from authors who went AWOL).
3. As you start to turn in chapters, they will be reviewed by a copy editor and a technical editor. First of all, don’t be shocked by the number of corrections that are picked up by the copy editor. They know grammar and how to write good English. I don’t claim to :). The purpose of the technical editor is to make sure that the samples you provide actually compile and that the stuff upon which you speak is technically accurate. When this review process is finished, you will get the chapter back with the various edits suggested by these two people. You need to address, correct, or disagree with each of the points that are raised. This is typically about an hour per chapter and normally needs to be turned around within about a week of receiving the edit. And this happens while you are continuing to write other chapters.
4. You need to review the galley proofs. This happens late in the process and consists of reviewing the actual PDF files that have been laid out, complete with the figures. Another copy editor has gone through the PDF and made additional notes (amazed me that even after three sets of eyes have seen it, there are still tense and pluralization errors that occur), so you will need to confirm that the changes don’t affect the meaning.
5. Again, towards the end of the schedule, you will be asked to write your bio and the introduction to the book, review the cover art and marketing collateral, etc.
So when you figure out your schedule for the book, you need to take all of this into consideration. I can write about 20 pages of new material in my free time for a week. This is why a chapter every 10-14 days works for me. But ultimately you need to make sure that the output required by the schedule fits into the time you have allotted for writing.
If you have not embraced Teams in TFS you should take a look at them. This is a wonderful feature that makes grooming backlogs by team so easy.
Teams allow you to divide a TFS project up into products. From the TFS Control Panel in your web interface you can create a team:
I recommend you select Team Area. This will make the product backlog easier to use.
Once you have the Teams, you will want to assign Areas and Iterations to each team. This will give you different backlogs and sprints for each team. Select the new team in the control panel.
Create an Iteration for that team and set up the sprints/releases as children of the iteration you just created, then assign them to the team by selecting them. Notice from the toolbar that you must be in the control panel for this team.
Add Areas for the product under the team’s area and select them for this team.
Now when you open the Web interface and select that team, the backlog is filtered to show only work items for that team. It will show only this team’s sprints and backlog items.
Change the view to the whole project and you will see everything for all teams again.
To switch between Teams, use the dropdown in the title bar of the TFS Web interface, which shows the most frequently selected teams.
Select Browse all and you can switch to another team’s view.
In my sample project I have many user stories, in various states.
When I switch to the Fabrikam Fibers backlog, everything is filtered for that team.
On December 4th, Microsoft Canada will be celebrating the launch of Visual Studio 2013 with an evening networking event for IT Development and Operations Leads. This is being held at E11even in downtown Toronto between 5 and 8pm.
Software change management is a costly and complex challenge that every customer faces. Over the last few years, our customers have increasingly shared with us that this challenge has started to become a key blocker in their business.
With the launch of the Visual Studio 2013 wave of ALM tools, we are excited to share with you all that is new, including our Software Release Management solution. Instead of software releases being a problem to be dealt with, you’ll see real gains via consistent hand-offs and better integration between development and production. We are looking forward to hearing from you and to learning more about your ALM stories.
Claude Remillard, co-founder of Montreal-based InCycle Software, will be leading this event. He’ll be talking about how a modern and automated release process can positively impact your organization, and how it can ensure a quality release process with reduced risk and quick roll back capabilities, all adding up to shorter release cycles – and fewer headaches for IT overall.
I look forward to you joining us for this evening, and please bring a colleague along, ideally someone who cares as much about the smooth release of software as you do!
Private Dining Room
15 York St., Toronto, ON. M5J 0A3
I recently created a cool Sprint Report that is accurate to the hour.
We have two scrum teams working simultaneously on separate sprint backlogs. Using the Teams feature in TFS 2012 we have created two Sprint teams and assigned each one their own iterations.
We also added a field to the User Story called Current Sprint, and changed the workflow so that when the User Story is set to Active the workflow automatically sets Current Sprint to yes.
Then we wrote a query against the TFS Warehouse that grabs each team’s backlog and sums the Story Points by State.
Another query gets the current sprint based on the date for that team and calculates the total number of days in the sprint and the number of days remaining.
As you can see the result is a very nice concise report showing exactly where the team is to the nearest hour.
Also as you can see one team is mid sprint and the other is between sprints. The report reflects that also.
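The two calculations behind the report can be sketched in memory as follows. This is only an illustration of the logic, not the warehouse queries themselves: the Story and Sprint types, their fields, and the method names are all hypothetical, and it assumes C# 9 or later for the record syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory shapes; these are not the TFS warehouse tables.
public record Story(string Team, string State, int StoryPoints);
public record Sprint(string Team, string Name, DateTime Start, DateTime End);

public static class SprintReport
{
    // Sum the story points per state for one team's backlog.
    public static Dictionary<string, int> PointsByState(IEnumerable<Story> stories, string team)
    {
        return stories
            .Where(s => s.Team == team)
            .GroupBy(s => s.State)
            .ToDictionary(g => g.Key, g => g.Sum(s => s.StoryPoints));
    }

    // Find the sprint that contains 'now'; null means the team is between sprints.
    public static Sprint CurrentSprint(IEnumerable<Sprint> sprints, string team, DateTime now)
    {
        return sprints.FirstOrDefault(s => s.Team == team && s.Start <= now && now < s.End);
    }

    // Time left in the sprint, expressed in hours.
    public static double HoursRemaining(Sprint sprint, DateTime now)
    {
        return (sprint.End - now).TotalHours;
    }
}
```

A null result from CurrentSprint corresponds to the team that is between sprints in the report.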
The following is excerpted from my just released book Windows Azure Data Storage (Wiley Press, Oct 2013). And, since the format is eBook only, there will be updates to the content as new features are added to the Azure Data Storage world.
Business craves data.
As a developer, this is not news to you. The people running businesses have wanted it for years. They demand data about how many widgets have been ordered, how much inventory is available to be used in manufacturing, how many accounts are more than 45 days past due. More recently, the corporate appetite for data has spread way past these snacks. They want to store information about how individual consumers navigate through their website. They want to keep track of how different metrics about the machines are used in the manufacturing process. They have hundreds of MB of documents, spreadsheets, pictures, audio, and video files that need to be stored and managed. And the volume of data that is collected grows by an obscene amount every single day.
What businesses plan on doing with this information depends greatly on the industry, as well as the type and quality of the data. Inevitably, the data needs to be stored. Fortunately (or it would be an incredibly short book) Windows Azure has a number of different data storage technologies that are targeted at some of the most common business scenarios. Whether you have transient storage requirements or the need for a more permanent resting place for your data, Windows Azure is likely to have you covered.
Business Scenarios for Storage
A feature without a problem to solve is like a lighthouse on a sunny day—no one really notices and it’s not really helping anyone. To ensure that the features covered in this book don’t meet the same fate, the rest of this chapter maps the Windows Azure Data Storage components and functionality onto problems that you are likely already familiar with. If you haven’t faced them in your own workplace, then you probably know people or companies that have. At a minimum, your own toolkit will be enriched by knowing how you can address common problems that may come up in the future.
A style of data storage that has recently received a lot of attention in the development community is NoSQL. While the immediate impression, given the name, is that the style considers SQL to be anathema, this is not the case. The name actually means Not Only SQL.
To a certain extent, the easiest way to define NoSQL is to look at what it’s not, as well as the niche it tries to fill. There is no question that the amount of data stored throughout the world is vast. And the volume is increasing at an accelerating rate. Studies indicate that over the course of four years (2008-2012), the total amount of digital data has increased by 500 percent. While this is not quite exponential growth, it is very steep linear growth. What is also readily apparent is that this growth is not likely to plateau in the near future.
Now think for a moment about how you might model this structure using a relational database. For relational databases, you would need tables and columns with foreign key relationships. For instance, start with a page table that has a URL column in it. A second table containing the links from that page to other pages would also be created. Each record in the second table would contain the key to the first page and the key to the linked-to page. In the relational database world, this is commonly how many-to-many relationships are created. While feasible, querying against this structure would be time consuming, as every single link in the network would be stored in that one, single table. And to this point, the contents of the page have not yet been considered.
NoSQL is designed to address these issues. To start, it is not a relational data store. Instead, there is no fixed schema and querying does not require any joins to be performed. At least, not in the traditional sense. Instead, NoSQL is a variation (depending on the implementation) of the key-value paradigm. In the Windows Azure world, different forms of NoSQL-style storage are provided through Tables and Blobs.
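To make the key-value idea concrete, here is a minimal in-memory sketch. It is not the Azure SDK: the SchemalessTable class and its methods are hypothetical, loosely modelled on the partition key plus row key addressing that Azure Table storage uses. Each entity is just a bag of named properties, so two entities in the same table need not share a schema, and a lookup is a single key access with no join.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a key-value store; not the Azure SDK.
public class SchemalessTable
{
    // The key is (partition key, row key); the value is a bag of properties.
    private readonly Dictionary<(string, string), Dictionary<string, object>> _entities =
        new Dictionary<(string, string), Dictionary<string, object>>();

    // Insert the entity, or replace whatever is already stored under the key.
    public void Upsert(string partitionKey, string rowKey, Dictionary<string, object> properties)
    {
        _entities[(partitionKey, rowKey)] = properties;
    }

    // Look up an entity by key; returns null when nothing is stored there.
    public Dictionary<string, object> Get(string partitionKey, string rowKey)
    {
        Dictionary<string, object> entity;
        return _entities.TryGetValue((partitionKey, rowKey), out entity) ? entity : null;
    }
}
```

One entity might carry Url and Title properties while its neighbour carries Url and WordCount, and the table accepts both without any schema change.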
Any discussion of NoSQL tends to lead into the topic of Big Data. As a concept, Big Data has been generating a lot of buzz over the last 12-18 months. Yet, like the cloud before it, people find it challenging to define Big Data specifically. Sure, they know it’s “Big,” and they know that it’s “Data,” but beyond that, there is not a high level of agreement or understanding of the purpose and process of collecting and evaluating Big Data.
Most frequently, you read about Big Data in the context of Business Intelligence (BI). The goal of BI is to provide decision makers with the important information they need to make the choices that are inevitable in any organization. In order to achieve this goal, BI needs to gain access to data from a variety of sources within an organization, rationalize the definitions (i.e., make sure that the definitions for common terms are the same across the different data sources), and present visualizations of the information to the user.
Based on the previous section, you might see why Big Data and NoSQL are frequently covered together. NoSQL supports large volumes of semi-structured data, and Big Data produces large volumes of semi-structured information. It seems like they are made for one another. Under the covers, they are. However, to go beyond Table and Blob Storage, the front for Big Data in Windows Azure is Apache Hadoop. Or, more accurately, the Azure HDInsight Services.
For the vast majority of developers, relational data is what immediately springs to mind when the term Data is mentioned. But since relational data has been intertwined with computers since early in the history of computer programming, this shouldn’t be surprising.
With Windows Azure, there are two areas where relational data can live. First there are Windows Azure Virtual Machines (Azure VMs), which are easy to create and can contain almost any database that you can imagine. Second, there are Windows SQL Azure databases. How you can configure, access and synchronize data with both of these modes is covered in detail in the book.
Messaging, message queues, and service bus have a long and occasionally maligned history. The concepts behind messages and message queues are quite old (in technology terms) and, when used appropriately, are incredibly useful for implementing certain application patterns. In fact, many developers take advantage of the message pattern when they use seemingly non-messaging related technologies such as Windows Communication Foundation (WCF). If you look under the covers of guaranteed, in-order delivery using protocols that don’t support such functionality (cough…HTTP…cough), you will see a messaging structure being used extensively.
In Windows Azure, basic queuing functionality is offered through Queue Storage. It feels a little odd to think of a message queue as a storage medium, yet ultimately that’s what it is. An application creates a message and posts it to the appropriate queue. That message sits there (that is to say, is stored) until a second application decides to remove it from the queue. So, unlike the data in a relational database, which is stored for long periods of time, Queue Storage is much more transient. But it still fits into the category of storage.
Windows Azure Service Bus is conceptually just an extension of Queue Storage. Messages are posted to and popped from the Service Bus. However, it also provides the ability for messages to pass between different networks, through firewalls, and even across corporate boundaries. Additionally, there is no requirement to open up an endpoint on either side of the communications channel that would expose the participant to external attacks.
It should be apparent even from just these sections that the level of integration between Azure and the various tools (both for developers and administrators) is quite high. This may not seem like a big deal, but anything that can improve your productivity is important. And deep integration definitely fits into that category. In addition, the features in Azure are priced to let you play with them at low or no cost. Most features have a long-enough trial period so that you can feel comfortable with the capabilities. Even after the trial, Azure bills based on usage, which means you only pay for what you use.
The goal of the book is to provide you with more details about the technologies introduced in this chapter. While the smallest detail of every technology is not covered, there is more than enough information for you to get started on the projects that you need to determine Azure’s viability in your environment.
Sometimes it amazes me how much of a polyglot developers need to be to solve problems. Not really a polyglot, as that actually relates to learning multiple languages, but maybe a poly-tech.
Allow me to set the scenario. A client of ours is using Windows Azure Queue Storage to collect messages from a large number of different sources. Applications of varying types push messages into the queue. On the receiving side, they have a number of worker roles whose job it is to pull messages from the queue and process them. To give you a sense of the scope, there are around 50,000 messages per hour being pushed through the queues, and between 50 and 200 worker roles processing the messages on the other end.
For the most part, this system had been working fine. Messages come in, messages go out. Sun goes up, sun goes down. Clients are happy and worker roles are happy.
Then a new release was rolled out. And as part of that release, the number of messages that passed through the queues increased. By greater than a factor of two. Still, Azure prides itself on scalability and even at more than 100,000 messages per hour, there shouldn’t be any issues. Right?
Well, there were some issues as it turned out. The first manifested itself as an HTTP status 503. This occurred while attempting to retrieve a message from the queue. The status code 503 is used to indicate that a service is unavailable. Which seemed a little odd since not every single attempt to retrieve messages returned that status. Most requests actually succeeded.
Identifying the source of this problem required looking into the logs that are provided automatically by Azure. Well, automatically once you have turned logging on. A very detailed description of what is stored in these logs can be found here. The logs themselves can be found at http://<accountname>.blob.core.windows.net/$logs and what they showed was that the failing requests had a transaction status of ThrottlingError.
Azure Queue Throttling
A single Windows Azure Queue can process up to 2,000 transactions per second. The definition of a transaction is either a Put, a Get or a Delete operation. That last one might catch people by surprise. If you are evaluating the number of operations that you are performing, make sure to include the Delete in your count. This means that a fully processed message actually requires three transactions (because the Get is usually followed by a Delete in a successful dequeue function).
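To put the earlier numbers against this limit, here is a rough sketch of the arithmetic. It assumes three transactions per fully processed message, as described above, and the class and method names are my own. At 100,000 messages per hour it works out to an average of roughly 83 transactions per second. That is only an average, though. The limit applies second by second, so it is the bursts that matter, not the hourly total.

```csharp
using System;

public static class QueueLoad
{
    // Put + Get + Delete for each fully processed message (from the text above).
    public const int TransactionsPerMessage = 3;

    // Average transactions per second generated by a given hourly message volume.
    // This is an average only; it says nothing about how bursty the traffic is.
    public static double AverageTransactionsPerSecond(int messagesPerHour)
    {
        return messagesPerHour * (double)TransactionsPerMessage / 3600.0;
    }
}
```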
If you crack the 2,000 transactions per second limit, you start to get HTTP 503 status codes. The expectation is that your application will back off on processing when these 503 codes are received. Now the question of how an application backs off is an interesting one. And it’s going to depend a great deal on what your application is doing.
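One common way to back off is to wait a little longer after each consecutive 503. The sketch below is not what our client ran; it is a minimal illustration of exponential backoff under a few assumptions. The attemptOnce delegate is a hypothetical stand-in for a queue call that reports its HTTP status code, and the delays, cap and attempt count are arbitrary. The storage client library also has its own retry policies, which may well be a better fit than hand-rolled code.

```csharp
using System;
using System.Threading;

public static class Backoff
{
    // Delay before retry number `attempt` (0-based): baseDelay * 2^attempt, capped at maxDelay.
    public static TimeSpan DelayFor(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        double ms = baseDelay.TotalMilliseconds * Math.Pow(2, attempt);
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }

    // Calls attemptOnce until it returns something other than 503 or
    // maxAttempts is reached, and returns the last status code seen.
    // attemptOnce is a hypothetical stand-in for the real queue call.
    public static int RunWithBackoff(Func<int> attemptOnce, int maxAttempts,
        TimeSpan baseDelay, TimeSpan maxDelay, Action<TimeSpan> sleep = null)
    {
        if (sleep == null)
        {
            sleep = delay => Thread.Sleep(delay);
        }

        int status = 0;
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            status = attemptOnce();
            if (status != 503)
            {
                return status;
            }
            // Only wait if another attempt will follow.
            if (attempt < maxAttempts - 1)
            {
                sleep(DelayFor(attempt, baseDelay, maxDelay));
            }
        }
        return status;
    }
}
```

The optional sleep parameter is there so the waiting can be swapped out, for example when testing the retry logic without real delays.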
From my perspective, one of the most effective ways to handle this type of throttling is to redesign how the application uses queues. Not a complete redesign, but a shift in the queues being used. The key is found in the idea that the transactions per second limit is on a single queue. So by creating more queues, you can increase the number of transactions per second that your application can handle.
How you want to split your queues up will depend on your application. While there is no ‘right’ way, I have seen a couple of different approaches. The first involves creating queues of different priorities. The sender then pushes each message into the queue that matches its relative priority.
A second way would be to create a queue for each type of message. This has the possibility of greatly increasing the number of queues, but there are a number of benefits. The sender of the message does not have to be aware of the priority assigned to a message. They just submit a message to the queue with no concerns. That makes for a cleaner, simpler client. Control over priority lies with the worker. The worker can pick and choose which queues to focus on based on whatever priority logic the application requires. This approach does presume that it’s easier to update the receiving workers than the clients, but you get the idea.
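Here is a small sketch of the queue-per-message-type idea. It uses in-memory queues as stand-ins for separate Azure queues, and the class name, the "msg-" naming convention and the method names are all made up for illustration. The sender routes purely by type, while the worker decides the order in which the queues get checked.

```csharp
using System;
using System.Collections.Generic;

public class PartitionedQueues
{
    // In-memory stand-ins for separate Azure queues, keyed by queue name.
    private readonly Dictionary<string, Queue<string>> _queues =
        new Dictionary<string, Queue<string>>();

    // Hypothetical naming convention: one queue per message type.
    public static string QueueNameFor(string messageType)
    {
        return "msg-" + messageType.ToLowerInvariant();
    }

    // Sender side: no priority logic, just route by message type.
    public void Send(string messageType, string body)
    {
        string name = QueueNameFor(messageType);
        if (!_queues.TryGetValue(name, out Queue<string> queue))
        {
            queue = new Queue<string>();
            _queues[name] = queue;
        }
        queue.Enqueue(body);
    }

    // Worker side: check the queues in the worker's own priority order and
    // return the first message found, or null if every queue is empty.
    public string ReceiveNext(IEnumerable<string> typesInPriorityOrder)
    {
        foreach (string type in typesInPriorityOrder)
        {
            if (_queues.TryGetValue(QueueNameFor(type), out Queue<string> queue)
                && queue.Count > 0)
            {
                return queue.Dequeue();
            }
        }
        return null;
    }
}
```

Because each message type lands in its own queue, the per-queue transaction limit applies to each type separately, and changing priorities only means changing the order the worker passes to ReceiveNext.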
Now that the 503 responses were dealt with, we had to focus on what we perceived to be poor performance when retrieving messages from the queue. Specifically, we found (when we put a stopwatch around the GetMessage call) that it was occasionally taking over 1000 milliseconds to retrieve the message. And the median seemed to be someplace in the 400-500 millisecond range. This is an order of magnitude over the 50 milliseconds we were expecting.
The source of this particular problem was identified in conversation with a Microsoft support person. And when it was mentioned, our collective response was ‘of course’. The requests were Nagling.
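For anyone wanting to take the same kind of measurement, here is a minimal sketch. It is not the code we used; the class and method names are hypothetical, and the Action passed to TimeMs stands in for whatever call you want to time, such as the GetMessage call.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class LatencyProbe
{
    // Times a single invocation of `call` and returns the elapsed milliseconds.
    public static double TimeMs(Action call)
    {
        Stopwatch watch = Stopwatch.StartNew();
        call();
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }

    // Median of the collected samples (the average of the two middle
    // values when the number of samples is even).
    public static double Median(IEnumerable<double> samples)
    {
        List<double> sorted = samples.OrderBy(s => s).ToList();
        if (sorted.Count == 0)
        {
            throw new ArgumentException("No samples.", nameof(samples));
        }
        int mid = sorted.Count / 2;
        return sorted.Count % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

Looking at the median rather than the mean keeps the occasional very slow call from dominating the picture.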
Some background might be required. Unless you are a serious poly-tech.
Nagle’s Algorithm is a mechanism by which the efficiency of TCP/IP communication can be improved. The problem Nagle addresses arises when the data in the packets being sent is small. In that case, the packet headers might actually account for a very large percentage of the data being transmitted. The TCP and IP headers on each packet add up to 40 bytes (20 bytes each, without options). If the payload is 5 or 10 bytes, that is a lot of overhead.
Nagle's algorithm combines these small outgoing messages into a single, larger message. The algorithm actually prescribes that as long as there is a sent packet for which the sender has received no acknowledgment from the recipient, the sender should keep combining payloads until a full packet’s worth is ready to be sent.
All of this is well and good. Until a sender using Nagle interacts with a recipient using TCP Delayed Acknowledgements. With delayed acknowledgements, the recipient may delay the ACK for up to 500ms to give itself a chance to include the response with the ACK packet. Again, the idea is to increase the efficiency of TCP by reducing the number of ‘suboptimal’ packets.
Now consider how these two protocols work in conjunction (actually, opposition) with one another. Let’s say Fred is sending data to Barney. At the very end of the transmission, Fred has less than a complete packet’s worth of data to send. As specified in Nagle’s Algorithm, Fred will wait until it receives an ACK from Barney before it sends the last packet of data. After all, Fred might discover more information that needs to be sent. At the same time, Barney has implemented delayed acknowledgements. So Barney waits up to 500ms before sending an ACK in case the response can be sent back along with the ACK.
Both sides of the transmission end up waiting for the other. It is only the delayed acknowledgement timeout that breaks this impasse. And the result is the potential for occasionally waiting up to 500ms for a response to a GetMessage call. Sound familiar? That’s because it was pretty much exactly the problem we were facing.
There are two solutions to this problem. The first, which is completely unrealistic, is to turn off TCP delayed acknowledgments in Azure. Yeah, right. The second is much, much easier. Disable Nagle’s Algorithm in the call to GetMessage. In Azure, Nagle is enabled by default. To turn it off, you need to use the ServicePointManager .NET class.
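// ServicePoint and ServicePointManager live in System.Net.
// Run this before the first request is made to the queue endpoint.
CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
ServicePoint queueServicePoint =
    ServicePointManager.FindServicePoint(account.QueueEndpoint);
queueServicePoint.UseNagleAlgorithm = false;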
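If you would rather not look up each endpoint individually, ServicePointManager also exposes a static UseNagleAlgorithm property that sets the default for service points created afterwards. The wrapper class and method name below are hypothetical, and whether a process-wide switch is appropriate depends on what else your application talks to, so treat this as a sketch rather than a recommendation.

```csharp
using System;
using System.Net;

public static class NagleSettings
{
    // Turns Nagle off by default for ServicePoint instances created after
    // this call. Service points that already exist keep their own setting.
    public static void DisableNagleByDefault()
    {
        ServicePointManager.UseNagleAlgorithm = false;
    }
}
```

This would typically be called once, early in application startup, before any storage requests are issued.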
So there you go. In order to figure out why a couple of issues arose within Azure Queue Storage, you needed to be aware of HTTP status codes, the throttling limitations of Azure, queue design, TCP and John Nagle. As I said at the start, you need to be a poly-tech. And special thanks to Keith Hassen, who discovered much of what appears in this blog post while in the crucible of an escalating production problem.