Blind Keys & Stuff

I got into an interesting discussion with a colleague today about building data access components and passing arguments, which are often blind/surrogate keys.

I'll quickly state a point about blind/surrogate keys: they are good, I like them, and I wish every DBA used them universally on every table. I much prefer developing software against that model, and when I'm the data architect on a project, I do it that way. Having said that, though, when building frameworks and design patterns you can't assume that you'll always have them.

Blind keys are called just that because they are blind to the user. The user doesn't type in some long integer or GUID to load up data on a form. Almost by definition, though, I'd maintain that if you have a blind key, you must also have a visible key: some combination of data that a user can use to uniquely identify an entity. We call these “Alternate Keys” and they are usually implemented, in addition to blind keys, as a unique index or constraint in your database.

Why do we use blind keys? We use them so that we can be more durable in the face of changing specifications and business rules - and perhaps, originally, because of poorly understood requirements. When a primary key changes (say, an additional column is added), it typically has the effect of trickling through all the referencing tables' foreign keys. “Oh sorry, I forgot to mention that Customer Account Number isn't really unique any more since we acquired XYZ Corp. We should add Division to the primary key.” All of a sudden you have to add Division to every table in your system that referenced Customer, just to maintain the foreign key. This is fragile not only because it requires adding columns to the database, but also because it requires populating those columns and doing some data migration. It also means that queries that were optimized and understood to be singletons no longer work against our database.

But does creating a CustomerID blind key make you completely durable to this change? It gets you pretty far in your database, but it doesn't help your application much. This brings me back to my assumption: having a blind key doesn't mean you don't also have a visible key - and your application should use it.

Users may pick a customer out of a list or a search result dialog, and you might open that customer up in a details form. I would assume that this form takes some kind of key as an argument (and passes it down to the business object and data access component). If you are starting with a grid list of customers, possibly generated from a search, it's pretty tempting to just lift the primary key (the blind key column) out of the grid and pass it to the other form.

Is this kosher? What if you decide to reuse this Customer Details form and call it from an Order form (i.e. click here to see customer details)? If the Order is stored in the same database, it likely carries the same CustomerID blind key - so that would be handy there too. But what if you are integrating this with other systems? What if you are allowing a partner to call a web service to request customer details? Should the partner pass the blind key? I would think the key wouldn't be so blind anymore if you did that. That gets us back to the alternate key, or the user key. This is about a service boundary, and it's similar to the service boundary that exists when the user selects a customer from a grid. What on that grid let them know they were opening the correct customer? If the business rules are really poorly defined, it may in fact be every column on the table - but let's not go there.

A Web Service should in fact act as a service interface. You may have written your Customer business object's Load method to accept a CustomerID, but that doesn't mean your Web Service API should do the same. It should probably take Account Number and Division, and translate them somehow into a CustomerID before turning around and calling the Customer.Load(custId) method.
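As a very rough sketch of that translation at the service boundary - every name here (CustomerService, LookupCustomerId, and the dictionary standing in for the alternate-key unique index) is illustrative, not from a real API - the service resolves the partner's visible key to the internal blind key before making the internal call:

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: the dictionary stands in for a database lookup
// against the alternate-key unique index (AccountNumber + Division).
class CustomerService
{
    static readonly Dictionary<string, int> alternateKeyIndex =
        new Dictionary<string, int> { { "ACME|West", 42 } };

    // The partner-facing operation takes the visible key...
    public static int LookupCustomerId(string accountNumber, string division)
    {
        // ...and translates it to the blind key, which never crosses the boundary.
        return alternateKeyIndex[accountNumber + "|" + division];
    }

    static void Main()
    {
        int custId = LookupCustomerId("ACME", "West");
        // In the real service you would now call Customer.Load(custId).
        Console.WriteLine(custId);
    }
}
```

The point is only the shape of the boundary: callers supply Account Number and Division, and the CustomerID stays internal.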

DataSet Performance Tips with Indices

Each of these tips relates to speeding up the searching of data in a DataSet. Although indexes are created inside a DataSet, they are not exposed. Knowing how to make ADO.NET generate them gives you the power to enhance the performance of your database application.

Tip #1: ADO.NET generates indexes on the primary key(s).

If you are searching on primary key fields, use DataTable.Rows.Find() instead of DataTable.Select(). The Find method will use the index to find the row. Knowing this may also make you think differently when defining primary keys.
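A minimal sketch of the tip (table and data invented for illustration): setting PrimaryKey is what builds the index, and Rows.Find() goes through it, where Select() would scan:

```csharp
using System;
using System.Data;

class FindDemo
{
    static void Main()
    {
        DataTable customers = new DataTable("Customers");
        DataColumn id = customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("Name", typeof(string));

        // Setting the primary key is what causes the index to be built.
        customers.PrimaryKey = new DataColumn[] { id };

        customers.Rows.Add(1, "Alice");
        customers.Rows.Add(2, "Bob");

        // Rows.Find uses the primary-key index;
        // Select("CustomerID = 2") would not.
        DataRow row = customers.Rows.Find(2);
        Console.WriteLine(row["Name"]);
    }
}
```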

Tip #2: Use a DataView.

A DataView will generate an index for its sort column. If you find yourself searching for data by a particular column, it's a good idea to create a DataView sorted by the column(s) you need to search, and then use DataView.FindRows() to get at your data. The DataView will have created an index for the sort column, and FindRows() will use that index.
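Here is a small sketch of the pattern (table and values invented): sort the DataView by the search column, then call FindRows() with the sort-key value:

```csharp
using System;
using System.Data;

class FindRowsDemo
{
    static void Main()
    {
        DataTable orders = new DataTable("Orders");
        orders.Columns.Add("OrderID", typeof(int));
        orders.Columns.Add("Country", typeof(string));
        orders.Rows.Add(1, "CA");
        orders.Rows.Add(2, "US");
        orders.Rows.Add(3, "CA");

        // Sorting by Country causes the DataView to build an index on that column.
        DataView dv = new DataView(orders, "", "Country", DataViewRowState.CurrentRows);

        // FindRows searches through the sort-column index.
        DataRowView[] hits = dv.FindRows("CA");
        Console.WriteLine(hits.Length);
    }
}
```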

Tip #3: Be leery of Tip #2.

If you are trying to take advantage of DataView indexes, it's important to know when the indexes are created. ADO.NET generates an index each time you set the RowFilter, Sort, or RowStateFilter property of the DataView. So if you create a DataView and set these properties individually, the indexes will be created several times. To avoid this, be sure to use the DataView constructor that takes the table, filter, sort, and row state as arguments (see the example below). This way the index is created correctly the first time.

private void MakeDataView(DataSet ds)
{
   DataView dv = new DataView(ds.Tables["Suppliers"], "Country = 'CA'", "CompanyName", DataViewRowState.CurrentRows);
   dv.AllowEdit = true;
   dv.AllowNew = true;
   dv.AllowDelete = true;
}

Using a DataAdapter to create transaction history

In some mission critical applications, it's important to create a log or audit trail of updates made against a specific table. Sometimes this can be done with triggers, if you have all the information available, or perhaps your updates are encapsulated inside stored procedures so you can add central code there. What you may not have known is that if you are using a DataAdapter to read and write to a table that needs to be audited, you can accomplish the task with the DataAdapter itself.

The secret to this tip is that a SqlCommand object, like the InsertCommand, UpdateCommand and DeleteCommand members of the DataAdapter, can execute more than one SQL statement. In fact, DataAdapters use this technique by default to refresh a DataSet after an insert or update. This is important for retrieving identity column values, other column defaults, or values that have been modified by triggers.

To include additional SQL statements, separate them with a semicolon. You have full access to the parameters collection as well. Below is an example update command text that has been modified to insert into a log table containing the columns Action, OldCategory and NewCategory. This text can be assigned to the UpdateCommand's CommandText property of the DataAdapter.

UPDATE    Categories
   SET    CategoryName = @CategoryName,
          Description = @Description,
          Picture = @Picture
WHERE     (CategoryID = @Original_CategoryID)
      AND (CategoryName = @Original_CategoryName);
SELECT    CategoryID, CategoryName, Description, Picture
FROM      Categories
WHERE     (CategoryID = @CategoryID);
INSERT INTO CategoryLog  -- log table name assumed for this example
          (Action, OldCategory, NewCategory)
          VALUES ('Updated', @Original_CategoryName, @CategoryName)

Creating Computed Columns in a DataSet

DataSets are very powerful - and one of the features that makes them so powerful is computed columns. You can create a DataColumn that does not store data but rather computes its value dynamically.

Computations can be done by SQL on retrieval; however, once the data is in a DataSet, those computations become stale. As the data changes in your DataSet, it is desirable to have the latest results of the calculation.

One of the properties of a DataColumn is "Expression". This is a string that stores a computational expression. The data type of the column should match the data type of the computed value. These expressions can calculate simple values, or you can use aggregate functions across an entire set of data. You can even use computed columns to lookup a value in a parent table.

The source code below contains three snippets of code that add computed columns to tables. Listing 1 is a simple expression that calculates the extended price of an order item by multiplying the quantity by the unit price. The column is first created and then added to the table.

The first listing assumed there was not only a Quantity column on the OrderItem table, but also a Price column. In all likelihood, you would need to look the price up in a related Product table, assuming the OrderItem table has a foreign key to a Product table. If you have set your DataSet up with a DataRelation for this foreign key, you can perform the lookup within the DataSet. In Listing 2, we create a computed Price column on the OrderItem table that looks up the Price on the Product table. The syntax for referencing the rows in a parent table is:

Parent(DataRelationName).ColumnName

...where DataRelationName is the name of the DataRelation between the parent and child tables. To perform parent table referencing, a DataRelation must be established between the parent and child tables.

The final example, in Listing 3, shows a computed column that totals all the order item amounts for a given OrderHeader. Assuming we have created a DataRelation for the foreign key between OrderHeader and OrderItem, we can perform this child table reference. The syntax is similar to the parent reference, but uses the keyword Child:

Child(DataRelationName).ColumnName
You should note that when referencing a child table, you are normally referring to many rows. Therefore any expression on a child table should be wrapped in an aggregate function (e.g. Sum, Min, Max, Avg, etc.).

The benefit of performing your calculations within the DataSet instead of the database is that changing values in the base columns of the DataSet has an instant impact on your calculations. For example, users may want to know the total of an Order before they actually save it to the database. In the second example, if you change the product being ordered, the unit price and extended price will automatically change to reflect the price of the new product.
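A small self-contained sketch of that instant recalculation (the column names and values are invented for illustration; the expression mirrors Listing 1's Price * Quantity):

```csharp
using System;
using System.Data;

class ExpressionDemo
{
    static void Main()
    {
        DataTable items = new DataTable("OrderItem");
        items.Columns.Add("Quantity", typeof(decimal));
        items.Columns.Add("Price", typeof(decimal));
        // The third argument is the Expression; this column stores no data itself.
        items.Columns.Add("ExtPrice", typeof(decimal), "Price * Quantity");

        DataRow row = items.Rows.Add(2m, 10m);
        Console.WriteLine(row["ExtPrice"]);   // 2 * 10

        // Change a base column and the computed column updates immediately.
        row["Quantity"] = 3m;
        Console.WriteLine(row["ExtPrice"]);   // 3 * 10
    }
}
```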

Finally, if you place this code to modify your DataSet within your business logic layer, you can keep the critical business logic of calculations in your business objects along with the rest of your business logic, not hidden away in your data access layer or database stored procedures.

You should note that in the current version of the Visual Studio .NET IDE, there is a bug with computed columns and typed DataSets. You can add the computed column to your DataSet in the schema editor, but if you specify the expression there, you will get compile time errors. You should specify the expression at runtime.

[Listing 1]
System.Type stDecimal = System.Type.GetType("System.Decimal");
DataColumn dcExtPrice = new DataColumn("ExtPrice", stDecimal, "Price * Quantity", MappingType.Attribute);
dcExtPrice.AutoIncrement = false;
dcExtPrice.ReadOnly = true;
[Listing 2]
System.Type stDecimal = System.Type.GetType("System.Decimal");
DataColumn dcPrice = new DataColumn("Price", stDecimal, "Parent(drProduct_OrderItem).Price", MappingType.Attribute);
dcPrice.AutoIncrement = false;
dcPrice.ReadOnly = true;

[Listing 3]
System.Type stDecimal = System.Type.GetType("System.Decimal");
DataColumn dcTotal = new DataColumn("Total", stDecimal, "Sum(Child(drOrderHeader_OrderItem).Price)", MappingType.Attribute);
dcTotal.AutoIncrement = false;
dcTotal.ReadOnly = true;

When RFID goes bad.

You never know how your technology is going to be used.

Golf should be 12 holes.

I'm looking for investors!

This idea will change golf forever. I think I can at least double this industry's revenue with one small inexpensive change.

I want to open a golf course where a round of golf is 12 holes. Don't go, hear me out.

I golf occasionally. I would golf more but it takes too damn long. I don't have 4-5 hours in a day to spend golfing. Not to mention the time spent at the 19th hole.

Inevitably every time I golf I get to about the 14th hole and I spend the rest of the game wishing it was over.

12 holes just makes sense. Here are some reasons why.

  1. You could play 6 holes in the morning before work without having to go to an executive course.
  2. However over lunch you could easily play a 6 hole executive course.
  3. Existing courses don't have to change anything but the numbers at each hole. Every 18-hole golf course simply has three 6s.
  4. They can get more people golfing by starting three groups at a time, as they do on 27-hole courses now.
  5. If a round takes just over 2 hours instead of 4, more people can golf in a day.
  6. It will be easier to get a tee off time.
  7. More people would golf because they wouldn't be saying “I wish this was over” on the 14th hole.
  8. More people would golf because it would be less expensive.
  9. Yet the courses would make more money due to the increase in volume and staggered tee-offs.
  10. The avid golfer who wants to spend more time on the course can still do 18 by golfing the 12 and then the front 6 again.
  11. Some people even say they want to golf two full rounds, or 36 holes. No problem: with 12-hole rounds you would get in 3 rounds that day.

Now there are people who don't like change. These people will say golf has always been 18 holes and that it's a tradition. This is not true; please read the article below. (I could have just typed this in, but it's much more effective this way, don't you think?)

OK, so who wants to be in on the ground floor of this?

Maybe you don't want in? But if you have more reasons why golf should be 12 holes please share them with me.

A New ObjectSharpee

As of tomorrow, ObjectSharp welcomes another member to our team.  For those of you who already know her, the color of this post is a giveaway clue.  If you're not familiar with her work, you can read more about it here.

So you think you know Pointers?

This might be a bit of fun for the old C programmers out there like me.

A long time ago I used to teach C programming. This was before C++ and Windows, when we had to write whole applications that fit into 250K of RAM because the rest of the 640K was used up by DOS and various hardware drivers, and you had to use tools like Quarterdeck QEMM to move drivers into memory locations above 640K. Those were the days when hardware was expensive and developers really had to worry about memory usage and pointers and stacks... Oh sorry, I got a little carried away there; that is not what this blog is about.

Where was I? Oh yeah, I used to teach C, and at the end of every test I would include bonus questions for my students. I found one the other day and thought I would put it on my blog for fun. This particular bonus question was meant to test their understanding of the pointer operators ( * and & ), among other things.

Now, to use pointers in C#, I had to make the code unsafe. With today's compilers and editors, it would be easy to see what this little method does. So see if you can figure it out just by looking at the code right here in my blog.

Can you tell me what i will equal, and therefore what will be written to the console?

unsafe static void Foo()
{
      int i, k=5;     /* Declare some Integers */
      int *j;         /* Declare a pointer */

      j = &k;         /* Assign a Value */
      i = 10/*j;      /* Do the Math */;

      Console.WriteLine( i.ToString() );
}

If you are thinking this is too easy and that the answer is 2, try again.

Regular Expressions

Regular expressions are a very powerful tool for validating or parsing a string. I don't claim to be an expert in their use by any means, but I have used them with great success on a number of occasions - for example, to validate a special URL used to create hyperlinks to forms within a smart client application, and to parse the syntax of a PowerBuilder DataWindow, to name just two.

In ASP.NET there is a RegularExpressionValidator control. This control lets you validate input based on regular expressions. The nice thing about this control is that the hard work has been done for you for a variety of different strings. For example, there is a sample expression for Internet email addresses, \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*, and one for an Internet URL, http://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?

If you want to use Regex in .NET (C# or VB.NET), getting the expression right is the tough part, as you can see from the expressions above; the rest is pretty easy. To help you out, there is a tool written by Eric Gunnerson called the Regular Expression Workbench that I have found very useful on many occasions. You can download it from GotDotNet. This wonderful tool will help you format your expression, interpret or execute your expression on a sample string, and even tell you how efficient the operation is.

For those of you who don't really know how to use regular expressions, here is a simple example to get you started. Using this example you can test a string for swear words.

First: to use Regular Expressions you must use the System.Text.RegularExpressions namespace.

Then create a regular expression and call the match method to search your string.

 using System.Text.RegularExpressions;

 Regex r;                                   //Declare Regular expression
 r = new Regex("hell");                     //Create expression
 Match m = r.Match(this.textBox1.Text);     //Call match method

 if (m.Success)                             //Check to see if a match was made
      MessageBox.Show("You can't say that.");
 else
      MessageBox.Show("nothing found.");

The above example works to find the word hell, but it will also find it inside the word hello. You can add the escape sequence \b to denote word boundaries. For example, if your expression is \bhell\b, it will only match the whole word hell. \b has two uses in a regular expression: normally it is a word boundary, but when used within a [ ] character class it refers to the backspace character.
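A quick console sketch of the difference (the test strings are invented), comparing the bare pattern with the word-bounded one:

```csharp
using System;
using System.Text.RegularExpressions;

class BoundaryDemo
{
    static void Main()
    {
        Regex loose = new Regex("hell");
        Regex strict = new Regex(@"\bhell\b");   // word boundaries on both sides

        Console.WriteLine(loose.IsMatch("hello world"));    // matches inside "hello"
        Console.WriteLine(strict.IsMatch("hello world"));   // boundary blocks the match
        Console.WriteLine(strict.IsMatch("what the hell")); // whole word still matches
    }
}
```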

Fine, so now you can validate a string to keep users from entering a swear word. "But Dave, there are many words I need to check for," you say. Not to worry - this can be done in the same expression using a pipe. If you separate each word you want to find with a pipe, Match() will look for all of the words listed. See the example below.

using System.Text.RegularExpressions;

Regex r;                                   //Declare Regular expression
r = new Regex("\\bhell\\b|\\bfrig\\b");    //Create expression
Match m = r.Match(this.textBox1.Text);     //Call match

if (m.Success)                             //Check to see if a match was made
     MessageBox.Show("You can't say that.");
else
     MessageBox.Show("nothing found.");

Sun's take on the smart client

While Sun is very complimentary about Microsoft's strengths on the smart client, Sun's answer seems to be... put HTML on the client. One of their reasons is that “The skill-set requirements and the complexity of HTML are much lower than for .NET”. Are you kidding me? The last time I built a Windows Forms application, I didn't have to test it on 5 different browsers. Don't even get me started on CSS and JavaScript. The metrics I see suggest that a Web Forms application takes no less than twice the development/testing effort of a Windows Forms application - and that's conservative; it sometimes runs 3-4 times.

Sun also fails to address the need for occasionally connected/offline applications.