Excel automatically defines column types

So to set up this problem, an application that I'm currently working on needs to process data that is stored in an Excel spreadsheet. The creation of the spreadsheet is actually performed by a scientific instrument, so my ability to control the format of the output is limited. The instrument samples liquids and determines a quantity. That quantity is placed into the spreadsheet. If the quantity in the liquid is not successfully read, then the literal "Too low" is placed into the sheet. The application opens up the spreadsheet and loads up a DataSet using the OleDb classes (OleDbConnection and OleDbDataAdapter).

This seemed like a fine setup. At least, until some of the numeric values in the spreadsheet were not being read. Or, more accurately, the values in the spreadsheet were not making it into the DataSet. Head-scratching, to say the least.

After some examination, the problem became apparent. When the values were being successfully read, the column in the DataSet had a data type of Double. When the values were not being successfully read, the column in the DataSet had a data type of String. Now the difference between success and no-success was nothing more than the contents of the spreadsheet.

Now the obvious path to follow is how the data type of the column is determine. Some research brought me to what I believe is the correct answer. The Excel driver looks at the contents of the first 8 cells. If the majority is numeric, then the column's data type is set to Double/Integer. If the majority is alphabetic, then the column becomes a string.

Of course, this knowledge didn't really help me. As I said at the outset, my ability to control the format of the spreadsheet was limited. So I needed to be able to read the numeric data even if the column was a string. And, at present, the cells containing numbers in a column marked as a string were returned as String.Empty.

The ultimate solution is to add an IMEX=1 attribute to the connection string. This attribute causes all of the data to be treated as a string, avoiding all of the cell scanning process. And, for reasons which I'm still not certain of, it also allowed the numeric data to be read in and processed. A long and tortuous route yes, but the problem was eventually solved.