When is a database oriented as a service?

Do you consider your database as a service? It's worthwhile to review the tenents of a service oriented architecture. The first two tenents above are probably the most relevant to my question.

If you do all of your data access through stored procedures, then you might say your database boundary is explicit.

If your database doesn't depend on other services or applications to exist properly, then you could say that your database is autonomous. That's a little tricky. Although we may use stored procedures to access functionality in our database, we may have well known  practices that we have to call the ap_decrease_inventory  proc after we call the ap_ship_order proc to make sure our that our database values are all in check. I wouldn't call our database autonomous if it has to rely on these external rules being inforced.

I'm going to avoid the discussion of the last two tenents because I think the are a bit to pure for my question. I'm really just trying to differentiate between two types of databases that I see out there. For my purposes, I refer to these as Databases as Services, and Databases as File Systems.

Databases as Services typically are well encapsulated and contain business rules. These databases might be supporting several client applications. You probably take great care in these databases, designing them carefully, perhaps with modeling tools, and encapsulating the persistence function with stored procedures, functions, triggers, etc. You may or may not have a well defined data access layer in your client applications. You might consider all the stored procs to be your data access layer, so you might call you procs directly from UI and/or business layers of your application, but that really depends on how well your client application is written. From you database, you don't really care so much since it's well protected service that operates autonomously.

Databases as File Systems are much less strategic. They serve one purpose only - to save stuff from your application. You probably/hopefully have a well defined data access layer in your application. That may even be an Object Relational Mapping tool (ORM). You probably designed the database to support the persistence of the objects in your application, and to generalize, you probably only have one application using this database. The most important thing though is that all of your business rules should be in your application(s). This type of database doesn't mean you don't have db side logic such as stored procedures or triggers. You may decide for optimization reasons that some code needs to live closer to the tables and that's okay. It's okay, so long as you realize it's harder to reuse some of that logic in higher layers of your application and you are comfortable in having your logic live in multiple platforms.

Stored Procedures are increasingly being used to add encapsulation to our database. No longer is performance the rationale for stored procedures. And increasingly, we are seeing advanced services in our databases - 4GL code such as Java and .NET managed code are making their ways into our databases. User Defined Types, Objects, and with the next version of SQL Server, we're seeing a full fledged message queue mechanism with Service Broker. You can even host web services directly in SQL Server 2005.

Is your database a service? Which camp do you fall into? Unfortunately, I think many people live somewhere in between, and that isn't by design. Most of the architectural decisions here should be motivated by where you decide to draw your boundary for strategic reasons, not for what is handy at the moment. I'd like to see people more consciously make this decision and remain committed to it. What are your thoughts?

15% off of MCSD/MCAD Certification Exams @ Pearson/VUE

Use the following voucher number MSAU113E1020. Good until August 31, 2005.

SQL Server 2005 "Yukon" Virtual "Hands On" Online Labs

SQL Server 2005 “Yukon” is in beta, but if you'd rather not go through the hassle of installing it (and trying to uninstall it) you can experience it through the magic of online virtual labs.

http://msdn.demoservers.com/login.aspx?group=sql2005

Not only is it a hosted environment, but guided hands on labs are provided for the following topics:

  • SQL Server 2005 Integration Services
  • SQL Server 2005 Introduction to SQL Server Management Studio
  • SQL Server 2005 Reporting Services
  • SQL Server 2005 Server Management Objects
  • SQL Server 2005 SQL CLR Integration
  • SQL Server 2005 SQL Query Tuning
  • SQL Server 2005 SQL Server and ADO.NET 
  • SQL Server 2005 T-SQL Enhancements
  • SQL Server 2005 Web Services
  • SQL Server 2005 XML Capabilites
  • Whidbey, ObjectSpaces and Mexico, Oh My!

    I'm just heading back home from Vancouver tonight on a red eye. When I get home, I have time for a load of laundry or two and then I'm flying to Mexico. Not just for some R&R, but I've also arranged to speak at the Mayan Riviera .NET User Group. (MR. Nug). I'm not sure what is better - going on vacation, or being able to write it off as a business expense. Unfortunately the only Spanish I know is “Dos cervasa por favor“ therefore this talk will be in English. My apologies.

    I'm going to be speaking about the new DataSet that is coming in Whidbey. By the time I get there, beta 2 should be available for download and it should be public knowledge that ObjectSpaces will finally be shipping as part of the whidbey release. I just got a preliminary build last night, so I'm going to try to do a demo of using a dataset with O/R mapping via ObjectSpaces.

    As you may have also heard, bundled into 2.0 Typed Datasets are Typed DataAdapters, also known as TableAdapters. In addition to TableAdapters, beta 2 will introduce typed forms, known as TypedForms. These will be precanned forms for both Web and Windows access that couples an easily painted form right on top of a dataset...with zero code. The cool thing about this is that you can simply add a column to your database table, and this will automatically change your data access layer, your middle tier entity and data entry forms. Likewise, you'll be able to simply drag a text box onto your TypedForms and this will automatically modify your database schema to add a new column to the table.

    There is also some discussion going on about also adding a TypedReport into the dataset that couples Sql Reporting Services Reports directly into your typed dataset class. This stuff is going to be so easy to use, that it is likely going to make it's way into InfoPath as part of the Office product so that end users can create their own data entry forms and reports in InfoPath. This new edition of InfoPath is going to be named “InfoMaker“. More information on this is available here. This new collection of classes in the dataset are now going to be collectively known as a "DataWindow". Again, more details here. No, it's not a snowy day hell. Hard to believe isn't it. This is surely a day to mark on your calendar.

    VSTS Architect's Boot Camp Next Week (Mar 29/31) in Ottawa & Vancouver

    Next week I'm travelling to Ottawa (Tuesday) and then Vancouver (Thursday) to do some boot camp training on Visual Studio Team System. This 1 day hands on, gives folks a chance to play with the new modelling and testing features. I'll also be demoing the project management, process guidance, and integrated source control management features. If you are interested, there are still seats left. Click here for details and here for registration.

    Say goodbye to Whidbey's System.Data.SqlServer namespace

    Pablo Castro writes in the new Data Access Team Blog that they are unifying System.Data.SqlServer classes into System.Data.SqlClient.

    If you are asking “What is System.Data.SqlServer?” the answer is that it's a ADO.NET managed provider for accessing SQL Server 2005 - from within SQL Server 2005 (managed procs, triggers, functions, etc.). Why a different provider? Well SqlClient uses a TDS (tabular data stream) external access protocol to connect to SQL Server. Obviously if you are already in SQL Server you don't want to go outside and back into your database. The SqlServer namespace provides similar classes (SqlConnection, SqlCommand, etc. etc. that work within the current context of your connection to the database.

    In the existing betas, MS was so nice to give them the same class names so that you might have an easy chance of porting data access code from client code to server code by changing the using/Imports <namespace> at the top of your class. You have a little bit of work to tweak your connection (if you really need one), but otherwise this porting process works pretty well. If you want the most efficient code, you also need to break your SqlCommands into static SqlDefinitions and SqlExecutionContexts for the actual executions. You also may need to output differently using the SqlContext's Pipe object.

    So the product team has decided to unify, removing the SqlServer namespace and making the SqlClient components serve dual duty. This magic switch happens simply by using a connection string of “context connection=true”.

    I have mixed emotions about this move. On one hand, it can make the porting process a bit smoother. You don't have to changing your using/Imports and you don't have to remove your connection object. It also means you'll instantiate SqlCommands the same way.

    On the other hand, it's not obvious to me by reading somebody's code if they are talking in-context to the  database, or out of context externally. In fact, this could very well switch based on a configuration setting for the connection string. The differences are not just related to performance, but also affect the transactional semantics. It would be very easy to switch code that is acting in-context as part of a transaction, to an external command that is not participating in a transaction by a simple configuration change. This is a big deal. I don't think transactional semantics should be a “configurable” thing. My opinion extends to the COM+ component catalog, but that is a whole other blog entry.

    This move ends up simplifying an API so much that a programmer's intentions are lost, and that's not a good thing. Pablo says this is coming in SQL Server Beta 3, but he wants your opinions.

    End of VB6 Support, it could be worst.

    As you may have heard, VB6 is coming to it's end of free support shortly. Apparently a bunch of VB6 developers who haven't made the move to .NET are a bit upset. I've declined to comment on the issue as I don't really have much to say one way or the other. But it is interesting to question how MS compares to the competition in supporting their developer tools?

    MS has supported VB6 for 7 years. That is also the current plan with SQL Server 2000.

    IBM you ask? Web Sphere Application Server 3.5 - only support for slightly more than 3 years. They got a bit better with 4.0 supporting it for about 6 months longer than 3.5 - not quite 4 years.

    I'm not a WebSphere expert, but I suppose another argument could be that there was a more direct migration path between those version and the current version 6 as compared to VB6 and VB.NET.

    Back Home exploring SQL Genetics

    Over the past month I've been doing some Whidbey/Yukon training. First in Ottawa, then in Toronto (home) and most recently last week in Calgary - a city I haven't visited in 10 years, and has grown up a lot since then. I really enjoyed Calgary.

    I'm spending the day getting re-acquainted with my wife and our daughters. My 6 year old used a word this morning I hadn't heard her use before. She was showing me a fancy painting she did at school and while comparing her technique to her peers, said that she “used the minimum amount of colors that the teacher suggested”. You know where I'm going with this don't you?

    I asked her what “minimum” meant and she said “the least amount of work”. Ok, she's my daughter, no DNA testing required. But to further test her understanding I asked her what the minimum of 1, 4 and 7 was and she correctly chose 1. She's no math genius, so I was happy she got it right. She even knew that the maximum was 7. I figured I blow her mind (and her mother's who was listening) by asking her what the sum was. I nearly fell off my chair when she said 12.

    So I'm happy to report that SQL skills are indeed part of the human genome. Next week we'll try her on correlated subqueries.

    Getting Schema information from a Database

    Yesterday in my class I was going over the new GetSchema API in Whidbey and I learned from one of my students about another technique that I wasn't aware of - SQL ANSI 92 Information Schema Views. They are also supported in Oracle.

    There are almost too many ways of getting database schema information but I'll try to summarize here to see how they compare. Depending on what you want to do, one of these will be more appropriate for you.

    • A common technique is to execute a SQL statement and describe the result set. Execute a “Select * FROM known_table_name”. Some API's would have you add a “WHERE 1=2“ if you can't just “describe“ the result set without incurring row retrieval payload. You can take advantage of this technique using the DataReader's GetSchemaTable method. In this example, you are really getting the schema of a query and not per se the underlying table structure. As such, it doesn't have to be constrained to a “*“ or 1 table query either.
    • Query the SQL Server System tables - this is close to the metal - which can burn you, but if you don't mind built in fragility. This technique is going to likely become obsolete in SQL Server Yukon with the new “Sys.*“ views.
    • Slightly better than using the system tables would be using system stored procedures such as sp_tables
    • Use the SQL-92 ANSI Standard Information Schema views such as “SELECT * FROM Northwind.INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = N'Customers'. Much nicer, and generic across anything that implements this standard (SQL Server 7.0 and up, and Oracle 9i? and up)
    • In .NET 1.0, you could also use the GetOleDbSchema method of an OleDbConnection. This is somewhat hard coded to a RDBMS and requires an OleDb driver.
    • In .NET 2.0, you can use new GetSchema method on the connection object. In theory, the idea is you get schematic information specific to the provider - so things that are specific to the provider can be made available/discoverable. But I'll warn you, it's a pretty abstract API. It's also remotely similar at the root level to the MSXML 4.0 “GetSchema“ method in that a) the methods have the same name, and b) they accept an argument of “Schema“ to return. In theory, you'll get more accurate schema information for the thing you are connecting too. Collections can be defined by the provider which differs from GetOleDbSchema which normalizes all db objects into a known set of objects. That may not work so well for new providers, Object Relational databases, or who knows what next.

    Temperature User Defined Type for SQL Server Yukon

    In SQL Server 2005/Yukon you can create user defined types using managed code. User defined types can be used in declaring variable and table columns. It is generally considered to only make sense for scalar types - things that can be sorted, compared, etc. not unlike the native numeric types and datetime. This is not however a fixed requirement, but its definitely the primary scenario being addressed by UDTs.

    Here is an example of a temperature type, which is in fact a scalar. All UDT's have to be inserted as strings (i.e. in quotes). Not surprisingly, as part of the contract to implement a user defined type you should have a ToString and Parse method. This example allows the user to insert values in either Kelvin, Fahrenheit, or Centigrade by suffixing the entry with K, F, or C respectively. The default is Kelvin should you not include a suffix for the scheme, so 273, 273K, 0C, and 32F are all equivalent entries. All entries are converted and stored into Kelvin. This makes it easy to have the type byte orderable for easy sorting/comparisons, however this implementation also keeps track of the input format. So if you enter in 0C, you'll get OC back by default. You can however call x.Centigrade, x.Fahrenheit, or x.Kelvin properties to get the measurement scheme you desire. And since the type is byte orderable, you can sort by columns of this type and you'll get something like....

    0K, -272C, 32F, 0C, 273K, 274K, 100C, all ordered by their underlying storage which in Kelvin is.
    0, 1, 273, 273, 273, 274, 373.

    Finally I have also declared as an example a TempAdder user defined aggregate to round out the example. I'm no scientist, but I'm not sure that temperatures are exactly additive, so this is really just for example purposes. The UDA is relatively self explanatory. You are responsible for keeping a variable around to accumulate values into. That leads into the accumulate method which gets called for every item in the result set to be aggregated. There is an init method for initializing your variables and finally a terminate method which is what is called when the result set is finished and you can return the result. The odd method is the merge method - which takes another aggregator. Because of parallel query processing, it is possible for two threads to individually deal with different parts of the entire result set. For this reason, when one thread finishes it's part of the aggregation, it will pass the other aggregator it's aggregator so they can be merged. For this reason I have my interim totals as a public property on the aggregator so i have access to that in the merge method. Furthermore, if you are calculating averages for example you may need the count as well as a public property.

    Bottom line - is that creating user defined types and aggregates is a pretty straightforward. Any complexity will be a result of the actual complexity of your types. Consider a lineage type that must work with multiple measurement schemes (metric and imperial) and also multiple units of measurement (cm, m, km) but also multiple units of measurement at one time (i.e. 4' 10”). Don't even get me started on 2”x4”s which aren't actually 2” by 4”. The nice thing about UDT's is that it allows you to encapsulate a lot of this code deep down in your storage layer, allowing your business applications to deal with things more abstractly.

    using System;
    using System.Data.Sql;
    using System.Data.SqlTypes;

    [Serializable]
    [SqlUserDefinedType(Format.SerializedDataWithMetadata, IsByteOrdered = true, MaxByteSize = 512)]
    public class Temperature : INullable
    {
     
     private double _kelvin;
     private bool n = true;
     private string inputFormat;

     public string InputFormat
     {
      get { return inputFormat; }
      set { inputFormat = value; }
     }

     public double Kelvin
     {
      get { return _kelvin; }
      set
      {
       n = false;
       _kelvin = value;
      }
     }

     public double Centigrade
     {
      get { return (_kelvin -273); }
      set
      {
       n = false;
       _kelvin = value + 273;
      }
     }

     public double Fahrenheit
     {
      get { return (this.Centigrade*1.8 + 32); }
      set
      {
       n = false;
       this.Centigrade = (value - 32)/1.8;
      }
     }
     public override string ToString()
     {
      string s = this.Kelvin + "K";

      if (this.InputFormat == "C")
      {
       s = this.Centigrade + "C";
      }
      else if (this.InputFormat == "F")
      {
       s = this.Fahrenheit + "F";
      }

      return s;
     }

     public bool IsNull
     {
      get
      {
       // Put your code here
       return n;
      }
     }

     public static Temperature Null
     {
      get
      {
       Temperature h = new Temperature();
       return h;
      }
     }

     public static Temperature Parse(SqlString s)
     {
      if (s.IsNull || s.Value.ToLower().Equals("null"))
       return Null;
      Temperature u = new Temperature();
      string ss = s.ToString();

      if (ss.EndsWith("C"))
      {
       u.Centigrade = double.Parse(ss.Substring(0, ss.Length - 1));
       u.InputFormat = "C";
      }
      else if (ss.EndsWith("F"))
      {
       u.Fahrenheit = double.Parse(ss.Substring(0, ss.Length - 1));
       u.InputFormat = "F";
      }
      else if (ss.EndsWith("K"))
      {
       u.Kelvin = double.Parse(ss.Substring(0, ss.Length - 1));
       u.InputFormat = "K";
      }
      else
      {
       u.Kelvin = double.Parse(ss);
       u.InputFormat = "K";
      }

      return u;
     }

    }

    using System;
    using System.Data.Sql;
    using System.Data.SqlTypes;
    using System.Data.SqlServer;


    [Serializable]
    [SqlUserDefinedAggregate(Format.SerializedDataWithMetadata, MaxByteSize = 512)]
    public class TempAdder
    {
     public void Init()
     {
      total = new Temperature();
      total.Kelvin = 0;
     }

     private Temperature total;

     public Temperature Total
     {
      get { return total; }
      set { total = value; }
     }

     public void Accumulate(Temperature Value)
     {
      this.Total.Kelvin  = this.Total.Kelvin + Value.Kelvin;
     }

     public void Merge(TempAdder Group)
     {
      this.Total.Kelvin = this.Total.Kelvin + Group.Total.Kelvin;
     }

     public Temperature Terminate()
     {
      return this.Total;
     }
    }