Data as a Service and the Applications that consume it

Over the past few months I have seen quite a few really cool technologies released or announced, and I believe they have a very real potential in many markets.  A lot of companies that exist outside the realm of Software Development, rarely have the opportunity to use such technologies.

Take for instance the company I work for: Woodbine Entertainment Group.  We have a few different businesses, but as a whole our market is Horse Racing.  Our business is not software development.  We don’t always get the chance to play with or use some of the new technologies released to the market.  I thought this would be a perfect opportunity to see what it will take to develop a new product using only new technologies.

Our core customer pretty much wants Race information.  We have proof of this by the mere fact that on our two websites, HorsePlayer Interactive and our main site, we have dedicated applications for viewing Races.  So lets build a third race browser.  Since we already have a way of viewing races from your computer, lets build it on the new Windows Phone 7.

The Phone – The application

This seems fairly straightforward.  We will essentially be building a Silverlight application.  Let’s take a look at what we need to do (in no particular order):

  1. Design the interface – Microsoft has loads of guidance on following with the Metro design.  In future posts I will talk about possible designs.
  2. Build the interface – XAML and C#.  Gotta love it.
  3. Build the Business Logic that drives the views – I would prefer to stay away from this, suffice to say I’m not entirely sure how proprietary this information is
  4. Build the Data Layer – Ah, the fun part.  How do you get the data from our internal servers onto the phone?  Easy, OData!

The Data

We have a massive database of all the Races on all the tracks that you can wager on through our systems.  The data updates every few seconds relative to changes from the tracks for things like cancellations or runner odds.  How do we push this data to the outside world for the phone to consume?  We create a WCF Data Service:

  1. Create an Entities Model of the Database
  2. Create Data Service
  3. Add Entity reference to Data Service (See code below)
 
    public class RaceBrowserData : DataService
{ public static void InitializeService(DataServiceConfiguration config) { if (config
== null) throw new ArgumentNullException("config"); config.UseVerboseErrors
= true; config.SetEntitySetAccessRule("*", EntitySetRights.AllRead); //config.SetEntitySetPageSize("*",
25); config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
} } 

That’s actually all there is to it for the data.

The Authentication

The what?  Chances are the business will want to limit application access to only those who have accounts with us.  Especially so if we did something like add in the ability to place a wager on that race.  There are lots of ways to lock this down, but the simplest approach in this instance is to use a Secure Token Service.  I say this because we already have a user store and STS, and duplication of effort is wasted effort.  We create a STS Relying Party (The application that connects to the STS):

  1. Go to STS and get Federation Metadata.  It’s an XML document that tells relying parties what you can do with it.  In this case, we want to authenticate and get available Roles.  This is referred to as a Claim.  The role returned is a claim as defined by the STS.  Somewhat inaccurately, we would do this:
    1. App: Hello! I want these Claims for this user: “User Roles”.  I am now going to redirect to you.
    2. STS: I see you want these claims, very well.  Give me your username and password.
    3. STS: Okay, the user passed.  Here are the claims requested.  I am going to POST them back to you.
    4. App: Okay, back to our own processes.
  2. Once we have the Metadata, we add the STS as a reference to the Application, and call a web service to pass the credentials.
  3. If the credentials are accepted, we get returned the claims we want, which in this case would be available roles.
  4. If the user has the role to view races, we go into the Race view.  (All users would have this role, but adding Roles is a good thing if we needed to distinguish between wagering and non-wagering accounts)

One thing I didn’t mention is how we lock down the Data Service.  That’s a bit more tricky, and more suited for another post on the actual Data Layer itself.

So far we have laid the ground work for the development of a Race Browser application for the Windows Phone 7 using the Entity Framework and WCF Data Services, as well as discussed the use of the Windows Identity Foundation for authentication against an STS.

With any luck (and permission), more to follow.

Visual Studio 2010 RTM!

Earlier this morning, Microsoft launched Visual Studio 2010.  Woohoo!  here’s the jist:

Watch the Keynote and Channel 9 Live here: http://www.microsoft.com/visualstudio/en-us/watch-it-live

Get the real bits here (if you have an MSDN license): http://msdn.microsoft.com/en-ca/subscriptions/default.aspx

Get the trial bits here:

Get the Express versions here: http://www.microsoft.com/express/

All the important stuff you want/need to know about Visual Studio 2010 development: http://msdn.microsoft.com/en-ca/ff625297.aspx

Enjoy!

Bad User Interfaces are Insecure

The Best of Intentions

So you’ve built this application.  It’s a brilliant application.  It’s design is spectacular, the architecture is flawless, the coding is clean and coherent, and you even followed the SDL best practices and created a secure application.

There is one minor problem though.  The interface is terrible.  It’s not intuitive, and settings are poorly described in the options window.  A lot of people wouldn’t necessarily see this as a security issue, but more of an interaction bug -- blame the UX people and get on with your day.

Consider this (highly hyperbolic) options window though:

BadSecuritySettings

How intuitive is it?  Notsomuch, eh?  You have to really think about what it’s asking.  Worst of all, there is so much extraneous information there that is supposed to help you decide.

At first glance I’m going to check it.  I see “security” and “enable” in the text, and naturally assume it’s asking me if I want to make it run securely (lets say for the sake of argument it speaks the truth), because god knows I’m not going to read it all the way through the first time.

By the second round through I’ve already assumed I know what it’s asking, read it fully, get confused, and struggle with what it has to say.

A normal end user will not even get to this point.  They’ll check it, and click save without thinking, because of just that – they don’t want to have to think about it.

Now, consider this:

GoodSecuritySettings

Isn’t this more intuitive?  Isn’t it easier to look at?  But wait, does it do the same thing?  Absolutely.  It asks the user if they want to run a secure application.

The Path to Security Hell

When I first considered what I wanted to say on this topic, I asked myself “how can this really be classified as a security bug?”  After all, it’s the user’s fault for checking it right?

Well, no.  It’s our fault.  We developed it securely, we told them they needed it to be run securely, and we gave them the option to turn off security (again, hyperbole, but you get the point).  It’s okay to let them choose if they want to run an insecure application, but if we confuse them, if we make it difficult to understand what the heck is going on, they aren’t actually doing what they want and we therefore failed at making the application they wanted secure, secure.

It is our problem.

So what?

Most developers I know at the very least will make an attempt to write a secure application.  They check for buffer overflows, SQL Injection, Cross Site Scripting, blah blah blah.  Unfortunately some, myself included, tend to forget that the end users don’t necessarily know about security, nor care about it.  We do like most developers do.  We tell them what we know: “There has been a fatal exception at 0x123FF567!!one! The index was outside the bounds of the array.  We need to destroy the application threads and process.”

That sounds fairly familiar to most error messages we display to our end users.  Frankly, they don’t care about it.  They are just pissed the work they were doing was just lost.

The funny thing is, we really don’t notice this.  When I was building the first settings window above, I kept reading the text and thinking to myself, it makes perfect sense.  The reason for this is by virtue of the fact that what I wrote is my logic.  I wrote the logic, I wrote the text, I inherently understand what I wrote.  We do this all the time.  I do this all the time, and then I get a phone call from some user saying “wtf does this mean?”, aaaaaaand then I change it to something a little more friendly.  By the 4th or so iteration of this I usually get it right (or maybe they just get tired of calling?).

So what does this say about us? Well, I’m not sure. I think it’s saying we need to work on our user interface skills, and as an extension of that, we need to work on our soft skills – our interpersonal skills. Maybe. Just a thought.

A Stab at a New Resume

While I am definitely not looking for a new job, I was bored and thought I would take a stab at a stylized resume to see if I could hone some of my (lack of) graphics skills.  It didn’t turn out too badly, but I am certainly no graphics designer.

What do you think?

mainResume

Putting the I Back into Infrastructure

Tonight at the IT Pro Toronto we did a pre-launch of the Infrastructure 2010 project.  Have you ever been in a position where you just don’t have a clear grasp of a concept or design?  It’s not fun.  As a result, CIPS Toronto, IT Pro Toronto, and TorontoSQL banded together to create a massive event to help make things a little more clear.  To give you a clearer understanding of how corporate networks work.  Perhaps to explain why some decisions are made, and why in retrospect, some are bad decisions.

Infrastructure 2010 is about teaching you everything there is to know about a state-of-the-art, best practices compliant, corporate intranet.  We will build, from the ground up, an entire infrastructure.  We will teach you how to build, from the ground up, an entire infrastructure.

Sessions are minimum 300 level, and content-rich.  Therefore:

i2010Proud[1]

Well, maybe.  (P.S. if you work for Microsoft, pretend you didn’t see that picture)

My First CodePlex Project!

A few minutes ago I just finalized my first CodePlex project.  While working on the ever-mysterious Infrastructure 2010 project, I needed to integrate the Live Meeting API into an application we are using.  So I decided to stick it into it’s own assembly for reuse.

I also figured that since it’s a relatively simple project, and because for the life of me I couldn’t find a similar wrapper, I would open source it.  Maybe there is someone out there who can benefit from it.

The code is ugly, but it works.  I suspect I will continue development, and clean it up a little.  With that being said:

  • It needs documentation (obviously).
  • All the StringBuilder stuff should really be converted to XML objects
  • It need's cleaner exception handling
  • It needs API versioning support
  • It needs to implement more API functions

Otherwise it works like a charm.  Check it out!

Six Simple Development Rules (for Writing Secure Code)

I wish I could say that I came up with this list, but alas I did not.  I came across it on the Assessment, Consulting & Engineering Team blog from Microsoft, this morning.  They are a core part of the Microsoft internal IT Security Group, and are around to provide resources for internal and external software developers.  These 6 rules are key to developing secure applications, and they should be followed at all times.

Personally, I try to follow the rules closely, and am working hard at creating an SDL for our department.  Aside from Rule 1, you could consider each step a sort of checklist for when you sign off, or preferably design, the application for production.

--

Rule #1: Implement a Secure Development Lifecycle in your organization.

This includes the following activities:

  • Train your developers, and testers in secure development and secure testing respectively
  • Establish a team of security experts to be the ‘go to’ group when people want advice on security
  • Implement Threat Modeling in your development process. If you do nothing else, do this!
  • Implement Automatic and Manual Code Reviews for your in-house written applications
  • Ensure you have ‘Right to Inspect’ clauses in your contracts with vendors and third parties that are producing software for you
  • Have your testers include basic security testing in their standard testing practices
  • Do deployment reviews and hardening exercises for your systems
  • Have an emergency response process in place and keep it updated

If you want some good information on doing this, email me and check out this link:
http://www.microsoft.com/sdl

Rule #2: Implement a centralized input validation system (CIVS) in your organization.

These CIVS systems are designed to perform common input validation on commonly accepted input values. Let’s face it, as much as we’d all like to believe that we are the only ones doing things like, registering users, or recording data from visitors it’s actually all the same thing.

When you receive data it will very likely be an integer, decimal, phone number, date, URI, email address, post code, or string. The values and formats of the first 7 of those are very predictable. The string’s are a bit harder to deal with but they can all be validated against known good values. Always remember to check for the three F’s; Form, Fit and Function.

  • Form: Is the data the right type of data that you expect? If you are expecting a quantity, is the data an integer? Always cast data to a strong type as soon as possible to help determine this.
  • Fit: Is the data the right length/size? Will the data fit in the buffer you allocated (including any trailing nulls if applicable). If you are expecting and Int32, or a Short, make sure you didn’t get an Int64 value. Did you get a positive integer for a quantity rather than a negative integer?
  • Function: Can the data you received be used for the purpose it was intended? If you receive a date, is the date value in the right range? If you received an integer to be used as an index, is it in the right range? If you received an int as a value for an Enum, does it match a legitimate Enum value?

In a vast majority of the cases, string data being sent to an application will be 0-9, a-z, A-Z. In some cases such as names or currencies you may want to allow –, $, % and ‘. You will almost never need , <> {} or [] unless you have a special use case such as http://www.regexlib.com in which case see Rule #3.

You want to build this as a centralized library so that all of the applications in your organization can use it. This means if you have to fix your phone number validator, everyone gets the fix. By the same token, you have to inspect and scrutinize the crap out of these CIVS to ensure that they are not prone to errors and vulnerabilities because everyone will be relying on it. But, applying heavy scrutiny to a centralized library is far better than having to apply that same scrutiny to every single input value of every single application.  You can be fairly confident that as long as they are using the CIVS, that they are doing the right thing.

Fortunately implementing a CIVS is easy if you start with the Enterprise Library Validation Application Block which is a free download from Microsoft that you can use in all of your applications.

Rule #3: Implement input/output encoding for all externally supplied values.

Due to the prevalence of cross site scripting vulnerabilities, you need to encode any values that came from an outside source that you may display back to the browser. (even embedded browsers in thick client applications). The encoding essentially takes potentially dangerous characters like < or > and converts them into their HTML, HTTP, or URL equivalents.

For example, if you were to HTTP encode <script>alert(‘XSS Bug’)</script> it would look like: &lt;script&gt;alert('XSS Bug')&lt;/script&gt;  A lot of this functionality is build into the .NET system. For example, the code to do the above looks like:

Server.HtmlEncode("<script>alert('XSS Bug')</script>");

However it is important to know that the Server.HTMLEncode only encodes about 4 of the nasty characters you might encounter. It’s better to use a more ‘industrial strength’ library like the Anti Cross Site Scripting library. Another free download from Microsoft. This library does a lot more encoding and will do HTTP and URI encoding based on a white list. The above encoding would look like this in AntiXSS

using Microsoft.Security.Application;
AntiXss.HtmlEncode("<script>alert('XSS Bug')</script>");

You can also run a neat test system that a friend of mine developed to test your application for XSS vulnerabilities in its outputs. It is aptly named XSS Attack Tool.

Rule #4: Abandon Dynamic SQL

There is no reason you should be using dynamic SQL in your applications anymore. If your database does not support parameterized stored procedures in one form or another, get a new database.

Dynamic SQL is when developers try to build a SQL query in code then submit it to the DB to be executed as a string rather than calling a stored procedures and feeding it the values. It usually looks something like this:

(for you VB fans)

dim sql
sql = "Select ArticleTitle, ArticleBody FROM Articles WHERE ArticleID = "
sql = sql & request.querystring("ArticleID")
set results = objConn.execute(sql)

In fact, this article from 2001 is chock full of what NOT to do. Including dynamic SQL in a stored procedure.

Here is an example of a stored procedure that is vulnerable to SQL Injection:

Create Procedure GenericTableSelect @TableName VarChar(100)
AS
Declare @SQL VarChar(1000)
SELECT @SQL = 'SELECT * FROM '
SELECT @SQL = @SQL + @TableName
Exec ( @SQL) GO

See this article for a look at using Parameterized Stored Procedures.

Rule #5: Properly architect your applications for scalability and failover

Applications can be brought down by a simple crash. Or a not so simple one. Architecting your applications so that they can scale easily, vertically or horizontally, and so that they are fault tolerant will give you a lot of breathing room.

Keep in mind that fault tolerant is not just a way to say that they restart when they crash. It means that you have a proper exception handling hierarchy built into the application.  It also means that the application needs to be able to handle situations that result in server failover. This is usually where session management comes in.

The best fault tolerant session management solution is to store session state in SQL Server.  This also helps avoid the server affinity issues some applications have.

You will also want a good load balancer up front. This will help distribute load evenly so that you won’t run into the failover scenario often hopefully.

And by all means do NOT do what they did on the site in the beginning of this article. Set up your routers and switches to properly shunt bad traffic or DOS traffic. Then let your applications handle the input filtering.

Rule #6: Always check the configuration of your production servers

Configuration mistakes are all too popular. When you consider that proper server hardening and standard out of the box deployments are probably a good secure default, there are a lot of people out there changing stuff that shouldn’t be. You may have remembered when Bing went down for about 45 minutes. That was due to configuration issues.

To help address this, we have released the Web Application Configuration Auditor (WACA). This is a free download that you can use on your servers to see if they are configured according to best practice. You can download it at this link.

You should establish a standard SOE for your web servers that is hardened and properly configured. Any variations to that SOE should be scrutinised and go through a very thorough change control process. Test them first before turning them loose on the production environment…please.

So with all that being said, you will be well on your way to stopping the majority of attacks you are likely to encounter on your web applications. Most of the attacks that occur are SQL Injection, XSS, and improper configuration issues. The above rules will knock out most of them. In fact, Input Validation is your best friend. Regardless of inspecting firewalls and things, the applications is the only link in the chain that can make an intelligent and informed decision on if the incoming data is actually legit or not. So put your effort where it will do you the most good.

Security, Security, Security is about Policy, Policy, Policy

The other day I had the opportunity to take part in an interesting meeting with Microsoft. The discussion was security, and the meeting members were 20 or so IT Pro’s, developers, and managers from various Fortune 500 companies in the GTA. It was not a sales call.

Throughout the day, Microsofties Rob Labbe and Mohammad Akif went into significant detail about the current threat landscape facing all technology vendors and departments. There was one point that was paramount. Security is not all about technology.

Security is about the policies implemented at the human level. Blinky-lighted devices look cool, but in the end, they will not likely add value to protecting your network. Here in lies the problem. Not too many people realize this -- hence the purpose of the meeting.

Towards the end of the meeting, as we were all letting the presentations sink in, I asked a relatively simple question:

What resources are out there for new/young people entering the security field?

The response was pretty much exactly what I was (unfortunately) expecting: notta.

Security it seems is mostly a self-taught topic. Yes there are some programs at schools out there, but they tend to be academic – naturally. By this I mean that there is no fluidity in discussion. It’s as if you are studying a snapshot of the IT landscape that was taken 18 months ago. Most security experts will tell you the landscape changes daily, if not multiple times a day. Therefore we need to keep up on the changes in security, and any teacher will tell you, it’s impossible in an academic situation.

Keeping up to date with security is a manual process. You follow blogs, you subscribe to newsgroups and mailing lists, your company gets hacked by a new form of attack, etc., and in the end you have a reasonable idea of what is out there yesterday. And you know what? This is just the attack vectors! You need to follow a whole new set of blogs and mailing lists to understand how to mitigate such attacks. That sucks.

Another issue is the ramp up to being able to follow daily updates. Security is tough when starting out. It involves so many different processes at so many different levels of the application interactions that eyes glaze over at the thought of learning the ins and outs of security.

So here we have two core problems with security:

  1. Security changes daily – it’s hard to keep up
  2. It’s scary when you are new at this

Let’s start by addressing the second issue. Security is a scary topic, but let’s breaks it down into its core components.

  1. Security is about keeping data away from those who shouldn’t see it
  2. Security is about keeping data available for those who need to see it

At its core, security is simple. It starts getting tricky when you jump into the semantics of how to implement the core. So let’s address this too.

A properly working system will do what you intended it to do at a systematic level: calculate numbers, view customer information, launch a missile, etc. This is a fundamental tenant of application development. Security is about understanding the unintended consequences of what a user can do with that system.

These consequences are of the like:

  • SQL Injection
  • Cross Site Scripting attacks
  • Cross Site Forgery attacks
  • Buffer overflow attacks
  • Breaking encryption schemes
  • Session hijacking
  • etc.

Once you understand that these types of attacks can exist, everything is just semantics from this point on. These semantics are along the line of figuring out best practices for system designs, and that’s really just a matter of studying.

Security is about understanding that anything is possible. Once you understand attacks can happen, you learn how they can happen. Then you learn how to prevent them from happening. To use a phrase I really hate using, security is about thinking outside the box.

Most developers do the least amount of work possible to build an application. I am terribly guilty of this. In doing so however, there is a very high likelihood that I didn’t consider what else can be done with the same code. Making this consideration is (again, lame phrase) thinking outside the box.

It is in following this consideration that I can develop a secure system.

So… policies?

At the end of the day however, I am a lazy developer.  I will still do as little work as possible to get the system working, and frankly, this is not conducive to creating a secure system.

The only way to really make this work is to implement security policies that force certain considerations to be made.  Each system is different, and each organization is different.  There is no single policy that will cover the scope of all systems for all organizations, but a policy is simple. 

A policy is a rule that must be followed, and in this case, we are talking about a development rule.  This can include requiring certain types of tests while developing, or following a specific development model like the Security Development Lifecycle.  It is with these policies that we can govern the creation of secure systems.

Policies create an organization-level standard.  Standards are the backbone of security.

These standards fall under the category of semantics, mentioned earlier.  Given that, I propose an idea for learning security.

  • Understand the core ideology of security – mentioned above
  • Understand that policies drive security
  • Jump head first into the semantics starting with security models

The downside is that you will never understand everything there is to know about security.  No one will.

Perhaps its not that flawed of an idea.

ASP.NET WebForms are NOT Being Overthrown by MVC

It’s always a fun day when the man himself, ScottGu responds to my email.  Basically it all started last week at Techdays in Toronto (pictures to follow, I promise). 

Quite a few people asked me about MVC, and whether or not it will replace Web Forms.  My response was that it wouldn’t, but I didn’t have any tangible proof.  I discussed new features in .NET 4.0, and how the development is still going strong for future releases.  Some didn’t buy it.

So, earlier today I emailed Scott and asked him for proof.  This was his response:

Hi Steve,

Web Forms is definitely not going away – we are making substantial improvements to it with ASP.NET 4.0 (I’m doing a blog series on some of the improvements now).  ASP.NET MVC provides another option people can use for their UI layer – but it is simply an option, not a replacement.

In terms of the dev team size, the number of people on the ASP.NET team working on WebForms and MVC is actually about equal.  All of the core infrastructure investments (security, caching, config, deployment, etc) also apply equally to both.

Now, MVC is new.  MVC is powerful.  MVC is pretty freakin cool in what it can do.  But it won’t replace WebForms.  Frankly, I like WebForms.  MVC does have it’s place though.  I can see a lot benefits to using it.  It alleviates a lot of boilerplate code in certain development architectures, and that is never a bad thing.

Long Live WebForms!

The RACI Model

Definition: a model used to help define who is responsible / accountable; The RACI model is built around a simple 2-dimensional matrix which shows the 'involvement' of Functional Roles in a set of Activities. 'Involvement' can be of different kinds: Responsibility, Accountability, Consultancy or Informational (hence the RACI acronym). The model is used during analysis and documentation efforts in all types of Service Management, Quality Management, Process- or Project Management. A resulting RACI chart is a simple and powerful vehicle for communication. Defining and documenting responsibility is one of the fundamental principles in all types of Governance (Corporate-, IT-Governance).

What does that mean?  All projects require management.  Simple enough.  This model is designed to define each level of management and required interaction on a project or application.  The four core levels of involvement attempt to define who should know what about the project/application/system.  Each level has more direct interaction than the previous level.

The levels are defined as:

Responsible

Those who do the work to achieve the task. There is typically one role with a participation type of Responsible, although others can be delegated to assist in the work required (see also RASCI below for separately identifying those who participate in a supporting role).

Accountable (also Approver or final Approving authority)

Those who are ultimately accountable for the correct and thorough completion of the deliverable or task, and the one to whom Responsible is accountable. In other words, an Accountable must sign off (Approve) on work that Responsible provides. There must be only one Accountable specified for each task or deliverable.

Consulted

Those whose opinions are sought; and with whom there is two-way communication.

Informed

Those who are kept up-to-date on progress, often only on completion of the task or deliverable; and with whom there is just one-way communication.

Very often the role that is Accountable for a task or deliverable may also be Responsible for completing it (indicated on the matrix by the task or deliverable having a role Accountable for it, but no role Responsible for its completion, i.e. it is implied). Outside of this exception, it is generally recommended that each role in the project or process for each task receive, at most, just one of the participation types. Where more than one participation type is shown, this generally implies that participation has not yet been fully resolved, which can impede the value of this technique in clarifying the participation of each role on each task.

Note: I stole most of that from Wikipedia.