Excel automatically defines column types

So to set up this problem, an application that I'm currently working on needs to process data that is stored in an Excel spreadsheet. The creation of the spreadsheet is actually performed by a scientific instrument, so my ability to control the format of the output is limited. The instrument samples liquids and determines a quantity. That quantity is placed into the spreadsheet. If the quantity in the liquid is not successfully read, then the literal "Too low" is placed into the sheet. The application opens up the spreadsheet and loads up a DataSet using the OleDb classes (OleDbConnection and OleDbDataAdapter).

This seemed like a fine setup. At least, until some of the numeric values in the spreadsheet were not being read. Or, more accurately, the values in the spreadsheet were not making it into the DataSet. Head-scratching, to say the least.

After some examination, the problem became apparent. When the values were being successfully read, the column in the DataSet had a data type of Double. When the values were not being successfully read, the column in the DataSet had a data type of String. Now the difference between success and no-success was nothing more than the contents of the spreadsheet.

Now the obvious path to follow is how the data type of the column is determine. Some research brought me to what I believe is the correct answer. The Excel driver looks at the contents of the first 8 cells. If the majority is numeric, then the column's data type is set to Double/Integer. If the majority is alphabetic, then the column becomes a string.

Of course, this knowledge didn't really help me. As I said at the outset, my ability to control the format of the spreadsheet was limited. So I needed to be able to read the numeric data even if the column was a string. And, at present, the cells containing numbers in a column marked as a string were returned as String.Empty.

The ultimate solution is to add an IMEX=1 attribute to the connection string. This attribute causes all of the data to be treated as a string, avoiding all of the cell scanning process. And, for reasons which I'm still not certain of, it also allowed the numeric data to be read in and processed. A long and tortuous route yes, but the problem was eventually solved.

Visual Studio and SQL Server 2008 Conflicts

I'm just passing along some information that has been making the rounds (I found it on Guy Burstein's blog).

If you attempt to install SQL Server 2008 on a machine that has Visual Studio 2008 installed, it will fail. The requirement is to have VS 2008 SP1 installed, an update that is still about a week away from release. And you need the 'real' SP1. The same problem exists with the beta for SP1.

So, for you developer/database people out there, it looks like at least a week of waiting to get the combination on a single machine.

SQL CLR Configuration - A Head Slapping Moment

This post is basically a reminder to a future me. The client I'm working with right now (basically an ISV) is using SQL Express as part of their project. And included in their database are a number of CLR stored procedures. In other words, the stored procedures are written in C#. While a little unusual, this is not particularly extraordinary.

The problem arose as I deployed my application from the build server to a freshly installed machine by hand. Moving the binaries was simple (a file copy from a shared directory). Moving the database was nothing more than a backup and restore. But when I ran the application, things didn't work.

It turns out that I had forgotten an important point, that being that the database setting that enabled the CLR would be 'backedup and restored' along with the database. Given that CLR functionality is turned on using sp_configure (sp_configure 'clr Enabled' to be precise), there was no reason for me to make such an assumption. But I did and the result was a moderate debugging session spent trying to figure out why I was stupid...er...forgetful.

Anyway, the point is that not every setting that a database depends on is backed up with the database. Some are associated with SQL Server itself and are therefore not backed up at all. I knew that, but sometimes I forget. :)

ORA-01008 Not All Variables Bound error

I have recently had the opportunity to work (once again) with Oracle. Specifically, I had to create a mechanism that would, based on configurable settings, update either a SQL Server or an Oracle database. In and of itself, this is not particularly challenging. Not since ADO.NET implemented a provider model using the Db... classes that are part of System.Data. The provider name can be used to generate the appropriate concrete instance of the DbConnection class and away you go.

While testing out this capability, I ran into this error when the target data source was Oracle. One would think (and I certainly did) was that I had missed out assigning one of the in-line parameters. The text associated with the error certainly gave that impression. And I was, after all, building the SQL statement on the fly. A bug in my logic could have placed a parameter into the SQL and not created a corresponding DbParameter.

But that was not the case.

Instead, it was that the value of one of my parameters (a string, as it turned out) was null. Not String.Empty, but null. And when you assign a null value to the parameter, it's as if you didn't bind anything to the parameter, the result being that when executing the query, a nice ORA-01008 exception is thrown. The correct way to do the assignment is to set the parameter value to System.DbNull value instead. It would appear that the SQL Server data provider doesn't have this issue, in that the problem only appeared against an Oracle data source. Not a particularly vexing problem, but still it's something to be aware of.

And a couple of years had passed since my last Oracle post ;)

Decrypting Stored Procedures

As part of work that I was doing recently, I had to generate the creation script for a SQL Server database. While running through the Generate Script task in Management Studio, I received an error that indicated that the stored procedure was encrypted. Never having running into encrypted stored procedures before, I was a little surprised. It turns out that encrypted stored procedures can’t be scripted. Could be that they’re encrypted. Duh!

Anyway, the documentation says that, for encrypted stored procedures, it is important to generate the creation script prior to encryption. So I started asking the client if they had the creation scripts. It turns out that a previous DBA had invoked the annoying ‘job security’ clause when creating the stored procedures, so the creation scripts were nowhere to be found.

Some additional research on the problem and I find a third party component that can decrypt the stored procedures. It’s called SQL Decrypter and it’s from Orbital Technologies (http://www.orbitaltechnologies.org/decryptor/). I have to say, it’s a nice tool. Does exactly what it’s supposed to do, quickly, easily and painlessly. So the client gets the generation scription and Orbital gets another (I’m sure) satisfied customer.

Identifying New Records

While the use of AutoNumber and Identity fields in tables is greatly applauded by database designers the world over, they can be annoying for developers to use. The problem being that, when we add records to the database, there are many cases where we need to know the key of the record that has just been added. Fortunately, there are a number of techniques that are available to assist us in our task. In this article we look at the methods for both SQL Server and Access, with an eye to the pros and cons of each approach. As always, our goal is to give you enough information to make up your own mind

SQL Server

SQL Server 2000 offers three, count 'em, three options for retrieving the Id of a recently added record. The 'best' method depends on what you're doing and what else might be going on in your database. In all cases, the Id is a field marked as the primary key of a table and that has the Identity property set to True.

@@IDENTITY This is an intrinsic variable in SQL Server that contains the Id of the record that was last created on this connection, regardless of the table.
IDENT_CURRENT('table') This function returns the ID of the record that was last created in the specified table.
SCOPE_IDENTITY This variable contains the Id of the last record that was created within the current scope, regardless of the table.

So what are the ramifications of using each of these options? With @@IDENTITY, you get the most recent identity on the current connection. That means that if, as part of the INSERT statement, triggers get executed that in turn INSERT records into other tables, @@IDENTITY contains the last Id that was generated. Regardless of the table into which the record was inserted. Now if you created all of the stored procedures, triggers and INSERT statements yourself, that is fine. You can control when the @@IDENTITY value is being checked. But what happens if, sometime in the future, another developer, who is unaware that @@IDENTITY is being used, adds a trigger that creates an Audit log record. All of a sudden, @@IDENTITY returns a different id.  And not the one for the just added record.  As a result, your application breaks even though 'nothing' has changed. And that is the kind of bug that we all just love to track down.

The IDENT_CURRENT function is best used when you can control who has access to the database and when. By specifying the table as part of the function, you can eliminate the issues associated with @@IDENTITY. At least as far as the addition of records in triggers goes,   However IDENT_CURRENT works at the table level, not the connection level.  It returns the most recently created id, regardless of who created it. Unfortunately, in a busy environment, developers can't be sure between the execution of the INSERT command and the retrieval of IDENT_CURRENT, a different record isn't inserted by another user.

The SCOPE_IDENTITY instrinsic variable addresses some of the issues raised with the other two methods. Its value is the last Id created within the current scope. Athough technically the scope is the current connection, practically, the scope is more typically the currently executing stored procedure. Now you don't have to worry about the possibility of future enhancements 'breaking' your code, nor do you have to deal with other activity impacting the Id that is returned. If you perform more that one INSERT in a stored procedure, you do need to use the SCOPE_IDENTITY between each statement in order to retreive all of the created Id's. But again, that is within your sphere of control.

Unfortunately, if you are using SQL Server 7.0 or earlier, then the @@IDENTITY method is the only choice available to you. Both IDENT_CURRENT and SCOPE_IDENTITY were introduced with SQL Server 2000.

Microsoft Access

With Access, you are limited to basically a single technique. On a positive note, the same technique works all the way back to Access 97

First of all, I assume that we are inserting a record into a table where the primary key has an AutoNumber type. The addition of the record must be accomplished by using the AddNew and Update methods of the ADO Recordset object. Then, once the record has been added, store the absolute position of the new record and perform a Requery. Finally, set the cursor back to the bookmarked record and read the Id. Seem like a lot of work? And inflexible to boot? Such is the joy of Access

Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")

cn.Open "DSN=MyDSN;"

rs.CursorLocation = adUseClient
rs.Open "SELECT CustNo, CustomerName, Contact", cn, adOpenStatic, adLockOptimistic


' CustNo is the AutoNumber field
rs.Fields("CustomerName").Value = "ObjectSharp"
rs.Fields("Contact").Value = "Bruce Johnson"

' The record has been inserted, but rs.Fields("CustNo").Value is zero
bookmark = rs.absolutePosition
rs.absolutePosition = bookmark

' Voila
MsgBox rs.Fields("CustNo").Value