End of VB6 Support, it could be worse.

As you may have heard, VB6 is coming to its end of free support shortly. Apparently a bunch of VB6 developers who haven't made the move to .NET are a bit upset. I've declined to comment on the issue as I don't really have much to say one way or the other. But it is interesting to ask how MS compares to the competition in supporting their developer tools.

MS has supported VB6 for 7 years. That is also the current plan with SQL Server 2000.

IBM, you ask? WebSphere Application Server 3.5 was supported for only slightly more than 3 years. They got a bit better with 4.0, supporting it for about 6 months longer than 3.5 - still not quite 4 years.

I'm not a WebSphere expert, but I suppose another argument could be that there was a more direct migration path between those versions and the current version 6 than between VB6 and VB.NET.

Getting Schema information from a Database

Yesterday in my class I was going over the new GetSchema API in Whidbey, and I learned from one of my students about another technique I wasn't aware of - the ANSI SQL-92 INFORMATION_SCHEMA views. They are also supported in Oracle.

There are almost too many ways of getting database schema information, but I'll try to summarize them here to see how they compare. Depending on what you want to do, one of these will be more appropriate for you.

  • A common technique is to execute a SQL statement and describe the result set. Execute a “SELECT * FROM known_table_name”. Some APIs would have you add a “WHERE 1=2” if you can't just “describe” the result set without incurring the row retrieval payload. You can take advantage of this technique using the DataReader's GetSchemaTable method (see the sketch after this list). In this case, you are really getting the schema of a query and not per se the underlying table structure. As such, it doesn't have to be constrained to a “*” or a single-table query either.
  • Query the SQL Server system tables - this is close to the metal, which can burn you, but it works if you don't mind the built-in fragility. This technique is likely to become obsolete in SQL Server Yukon with the new “sys.*” views.
  • Slightly better than using the system tables is using system stored procedures such as sp_tables.
  • Use the ANSI SQL-92 standard INFORMATION_SCHEMA views, such as “SELECT * FROM Northwind.INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = N'Customers'”. Much nicer, and generic across anything that implements this standard (SQL Server 7.0 and up, and Oracle 9i? and up).
  • In .NET 1.0, you could also use the GetOleDbSchemaTable method of an OleDbConnection. This is somewhat hard-coded to an RDBMS and requires an OLE DB provider.
  • In .NET 2.0, you can use the new GetSchema method on the connection object (also shown in the sketch below). In theory, the idea is that you get schema information specific to the provider - so things that are specific to the provider can be made available/discoverable. But I'll warn you, it's a pretty abstract API. It's also remotely similar at the root level to the MSXML 4.0 “GetSchema” method in that a) the methods have the same name, and b) they accept an argument naming the schema to return. In theory, you'll get more accurate schema information for the thing you are connecting to. Collections can be defined by the provider, which differs from GetOleDbSchemaTable, which normalizes all db objects into a known set of objects. That may not work so well for new providers, object-relational databases, or who knows what comes next.
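
To make the first and last of those concrete, here is a minimal sketch assuming a local SQL Server with the Northwind sample database (the connection string and table name are just placeholders):

using System;
using System.Data;
using System.Data.SqlClient;

class SchemaDemo
{
 static void Main()
 {
  string connStr = "Data Source=.;Initial Catalog=Northwind;Integrated Security=SSPI";

  using (SqlConnection conn = new SqlConnection(connStr))
  {
   conn.Open();

   // 1. Describe the result set of a query via the DataReader.
   //    SchemaOnly avoids pulling back any rows; KeyInfo adds key information.
   SqlCommand cmd = new SqlCommand("SELECT * FROM Customers", conn);
   using (SqlDataReader rdr = cmd.ExecuteReader(CommandBehavior.SchemaOnly | CommandBehavior.KeyInfo))
   {
    DataTable querySchema = rdr.GetSchemaTable();
    foreach (DataRow col in querySchema.Rows)
     Console.WriteLine("{0} ({1})", col["ColumnName"], col["DataType"]);
   }

   // 2. The new .NET 2.0 GetSchema method: ask the provider for its "Columns"
   //    collection, restricted to one table (the restriction order is provider-specific).
   DataTable columns = conn.GetSchema("Columns", new string[] { null, null, "Customers", null });
   foreach (DataRow col in columns.Rows)
    Console.WriteLine("{0}.{1}", col["TABLE_NAME"], col["COLUMN_NAME"]);
  }
 }
}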

Not your father's NGEN

Mr. Reid “NGEN” Wilkes has a good article in MSDN Magazine about NGEN 2.0 coming in Whidbey. NGEN is the command-line tool that pre-compiles MSIL into native PE images, and it has been around since .NET 1.0. It can be a good performance kick on large applications to NGEN your assemblies as part of your deployment.

So what are the highlights?

  • When a dependent shared assembly is updated, your executable's NGEN cached image is invalidated. The new NGEN service can queue up requests to do “across the board” re-NGENs to update your cached image. To accomplish this, NGEN keeps track of all of these dependencies, so when an update is deployed - bam - things are queued up and recompiled when your machine has idle time. Way cool. (There's a rough command-line sketch after this list.)
  • NGEN 2.0 blows out your generic parameterized classes into pre-compiled code (almost 100% of the time). Therefore any performance drag from expanding generics at runtime is gone if you NGEN.
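
To put that in command-line terms, here is a rough sketch of the deployment step (MyApp.exe is just a placeholder, and the exact switches differ between the 1.x and 2.0 versions of the tool):

rem Pre-compile the assembly (and, with NGEN 2.0, its dependencies) into the native image cache
ngen install MyApp.exe

rem NGEN 2.0 only: queue regeneration of any native images invalidated by updated dependencies
ngen update /queue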

Reid makes some good arguments for taking the time to NGEN your code by describing the downside of the JITer and the overhead that it imposes on memory compared to loading native images. What scares me even more is all the dynamic compilation coming in ASP.NET applications. Isn't this going to preclude using NGEN on ASP.NET applications in Whidbey?

But alas, these performance improvements are not always a silver bullet, and in some cases JITing is more performant. The bottom line: test to make sure your assumption that NGEN improves your performance is correct.

VB6 Support Coming to an End

Courtesy of Julia Lerman, I am reminded that mainstream support for VB6 comes to an end on March 31.  Now contrast that fact with the results of a survey that found over 50% of shops are still using VB6, a fact I blogged about here.  Does anyone else see a problem with this picture?

The problem of telling people who are currently running VB6 why they might want to move their applications has been at the front of my brain for a little while.  The biggest roadblock is that there really is no direct, compelling reason.  Let me put it to you this way. You have an application currently working in VB6.  It took a couple of man-years to build, but everyone is comfortable with it.  Both your developers and users can use and extend it as they need to.

Along comes someone saying that the company needs to convert to VB.NET.  This is a working application, not a broken one.  There is a cost to doing the conversion, not just in the effort, but in the efficiency of the development team.  As someone who made the move from VB6 to VB.NET about 4 years ago, I can tell you I was quite frustrated by how much my productivity dropped for the first few weeks (or months).  And what does the company get on the other side?  An application that (hopefully) works. Not really the ROI that one would hope for.

The problem is that the benefit of moving to the .NET platform comes in the future.  A well-designed .NET application is much easier to extend.  Widely accepted software design practices (like unit testing and FxCop) are more easily adopted in .NET, meaning that the quality of the software produced increases. So over the life of an application, the company will definitely make back its investment.  Or, more accurately, is quite likely to make it back.  But the argument is a difficult one to make to an executive with their eye on the financials of a company.  Invest potentially tens or hundreds of thousands of dollars for what return?

I'm still thinking about how to crack this nut.  Any suggestions that you have or stories that you've run into would be helpful, so feel free to send them on.  And I'm sure I'll come back to this in a later post.

VSTS Work Item: Percentage Completed

During one of my demos this past week on VSTS, somebody commented that developers typically want to give project managers more information on a task than just “completed” or not. What they really want is a percentage complete. John Lawrence is on the team at MS developing this stuff. The good news is that they have reworked the work items a bit, and now there is support for fields indicating “Completed Work” and “Remaining Work”. These numbers will synchronize with “percentage complete” in MS Project.

I'm happy to see this as it also ties in with better metric tracking. We don't just want to know that an item was completed, but how much work it required - which may be different from the estimate. All too often, item “estimates” turn into “budgets”. This is one small way that projects often take longer. Developers often don't report that a task took less than the estimate - only more than or the same as the estimate. This is one of the reasons why I like estimates that aren't time-based but effort-based. In other words, we estimate an item in terms of some arbitrary scale - 1-5, let's say, where 1 is easy and 5 is hard. Project managers can figure out what those numbers mean later on and do things like calculate team velocity.

More Class Designer Productivity Potential: Batch Editing.

Daniel Moth says that he's not excited about the properties box in the class designer and would prefer to use the code editor to make those kinds of changes. It may not be obvious, but one of the things you can do with that properties pane that you can't do in the code editor is make multiple changes across several classes or several members at the same time.

Select all of the items that you want to make a mass change to, and any common properties are shown in the properties dialog. I find this useful for decorating properties of my own components with custom attributes. Perhaps I want to change a bunch of methods to Static.

Daniel mentions another limitation. There is no full signature support on the model surface in the class designer. This makes it impossible to see the differences between your overloads. In fact, overloads are all grouped together and a count is shown.

Another mass-editing scenario would be to change the XML comments on a bunch of methods - for example, several overloads. You can't see the individual overloaded methods - just one of them with a “+1 overloads” label next to it. Furthermore, when you change the comment for a method that is overloaded (and shown as “+2 overloads”), one would hope the comment would be applied to all of the overloads; however, the comment is only applied to the first one. Hopefully this is a bug and will be fixed. I've logged it with MS in the Product Feedback Center.

Installing Visual Studio Team System Dec CTP

I'm getting a fair number of questions about this topic lately, so it's worth a blog entry.

The best way to install any beta (and even more so a CTP) is to use Virtual PC. This will save you from having to reformat your entire machine a few times. I don't think I've ever known a VS.NET beta release that uninstalled properly.

So if you are going to use Virtual PC, the best way to get started is to find a friend who has already done the install successfully and get them to give you a copy of their virtual machines.

In general with VPCs, you get better performance if the VHD files are located on a drive other than the one your host OS is installed on. If you have a second internal drive, great; otherwise, a good 7200 RPM external USB 2.0 drive will give a good performance boost. You'll also get the best performance if you don't use a differencing drive or an undo disk. To save memory and CPU cycles, turn off any nonessential services and running programs in both the host and guest operating systems.

Yes, you need 2 machines for VSTS - one for the server/data tier and a second for the client. You can't currently install everything on one box - that is not a supported scenario, at least for now. Your server should also be a domain controller. Unless you have 2 GB of RAM, you'll likely want to host each of those VPCs on a separate box. I've had good results hosting the server/data tier in Virtual Server. You also have no real need to log in or have a UI open on the server box once it's all installed and configured.

There is a good document here with more detailed installation instructions: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvs05/html/vstsinstallguide.asp

Having said all of that, the next beta is due out this month or early next month, so you might want to wait to get a much better experience. As always, keep in mind that betas are flaky and CTPs are worse than that.

Temperature User Defined Type for SQL Server Yukon

In SQL Server 2005/Yukon you can create user defined types using managed code. User defined types can be used when declaring variables and table columns. They are generally considered to make sense only for scalar types - things that can be sorted, compared, etc., not unlike the native numeric types and datetime. This is not a fixed requirement, but it's definitely the primary scenario being addressed by UDTs.

Here is an example of a temperature type, which is in fact a scalar. All UDT values have to be inserted as strings (i.e. in quotes). Not surprisingly, as part of the contract to implement a user defined type you should have a ToString and a Parse method. This example allows the user to insert values in Kelvin, Fahrenheit, or Centigrade by suffixing the entry with K, F, or C respectively. The default is Kelvin should you not include a suffix for the scheme, so 273, 273K, 0C, and 32F are all equivalent entries. All entries are converted to and stored in Kelvin. This makes it easy to have the type byte-orderable for easy sorting/comparisons; however, this implementation also keeps track of the input format. So if you enter 0C, you'll get 0C back by default. You can, however, call the x.Centigrade, x.Fahrenheit, or x.Kelvin properties to get the measurement scheme you desire. And since the type is byte-orderable, you can sort by columns of this type and you'll get something like:

0K, -272C, 32F, 0C, 273K, 274K, 100C - all ordered by their underlying storage, which in Kelvin is 0, 1, 273, 273, 273, 274, 373.

Finally, to round things out, I have also declared a TempAdder user defined aggregate as an example. I'm no scientist, and I'm not sure that temperatures are exactly additive, so this is really just for illustration. The UDA is relatively self-explanatory. You are responsible for keeping a variable around to accumulate values into. That leads into the Accumulate method, which gets called for every item in the result set to be aggregated. There is an Init method for initializing your variables and finally a Terminate method, which is called when the result set is finished and you can return the result. The odd one out is the Merge method, which takes another aggregator. Because of parallel query processing, it is possible for two threads to individually deal with different parts of the entire result set. For this reason, when one thread finishes its part of the aggregation, its aggregator is passed to the other so the two can be merged. That is why I expose my interim total as a public property on the aggregator - so I have access to it in the Merge method. Furthermore, if you are calculating averages, for example, you may need to expose the count as a public property as well.

Bottom line: creating user defined types and aggregates is pretty straightforward. Any complexity will be a result of the actual complexity of your types. Consider a length type that must work with multiple measurement schemes (metric and imperial), multiple units of measurement (cm, m, km), and even multiple units of measurement at one time (i.e. 4' 10”). Don't even get me started on 2”x4”s, which aren't actually 2” by 4”. The nice thing about UDTs is that they allow you to encapsulate a lot of this code deep down in your storage layer, allowing your business applications to deal with things more abstractly.

using System;
using System.Data.Sql;
using System.Data.SqlTypes;

[Serializable]
[SqlUserDefinedType(Format.SerializedDataWithMetadata, IsByteOrdered = true, MaxByteSize = 512)]
public class Temperature : INullable
{
 
 private double _kelvin;     // canonical storage - always in Kelvin
 private bool n = true;      // null flag - true until a value is assigned
 private string inputFormat; // "K", "C" or "F" - the scheme the value was entered in

 public string InputFormat
 {
  get { return inputFormat; }
  set { inputFormat = value; }
 }

 public double Kelvin
 {
  get { return _kelvin; }
  set
  {
   n = false;
   _kelvin = value;
  }
 }

 public double Centigrade
 {
  get { return (_kelvin -273); }
  set
  {
   n = false;
   _kelvin = value + 273;
  }
 }

 public double Fahrenheit
 {
  get { return (this.Centigrade*1.8 + 32); }
  set
  {
   n = false;
   this.Centigrade = (value - 32)/1.8;
  }
 }
 public override string ToString()
 {
  string s = this.Kelvin + "K";

  if (this.InputFormat == "C")
  {
   s = this.Centigrade + "C";
  }
  else if (this.InputFormat == "F")
  {
   s = this.Fahrenheit + "F";
  }

  return s;
 }

 public bool IsNull
 {
  get
  {
   // A Temperature is null until a value has been assigned
   return n;
  }
 }

 public static Temperature Null
 {
  get
  {
   Temperature h = new Temperature();
   return h;
  }
 }

 public static Temperature Parse(SqlString s)
 {
  if (s.IsNull || s.Value.ToLower().Equals("null"))
   return Null;
  Temperature u = new Temperature();
  string ss = s.ToString();

  if (ss.EndsWith("C"))
  {
   u.Centigrade = double.Parse(ss.Substring(0, ss.Length - 1));
   u.InputFormat = "C";
  }
  else if (ss.EndsWith("F"))
  {
   u.Fahrenheit = double.Parse(ss.Substring(0, ss.Length - 1));
   u.InputFormat = "F";
  }
  else if (ss.EndsWith("K"))
  {
   u.Kelvin = double.Parse(ss.Substring(0, ss.Length - 1));
   u.InputFormat = "K";
  }
  else
  {
   u.Kelvin = double.Parse(ss);
   u.InputFormat = "K";
  }

  return u;
 }

}

using System;
using System.Data.Sql;
using System.Data.SqlTypes;
using System.Data.SqlServer;


[Serializable]
[SqlUserDefinedAggregate(Format.SerializedDataWithMetadata, MaxByteSize = 512)]
public class TempAdder
{
 // Called once before aggregation starts - reset the running total.
 public void Init()
 {
  total = new Temperature();
  total.Kelvin = 0;
 }

 private Temperature total;

 // Exposed publicly so another TempAdder can read it during Merge.
 public Temperature Total
 {
  get { return total; }
  set { total = value; }
 }

 // Called once per row in the group being aggregated.
 public void Accumulate(Temperature Value)
 {
  this.Total.Kelvin = this.Total.Kelvin + Value.Kelvin;
 }

 // Called when another thread has aggregated part of the result set in parallel.
 public void Merge(TempAdder Group)
 {
  this.Total.Kelvin = this.Total.Kelvin + Group.Total.Kelvin;
 }

 // Called at the end - return the final aggregate value.
 public Temperature Terminate()
 {
  return this.Total;
 }
}
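
For reference, here is roughly how the type and aggregate would be used from T-SQL once the assembly is cataloged. This is just a sketch - the exact CREATE TYPE/CREATE AGGREGATE syntax has shifted between Yukon builds, and TemperatureDemo is a made-up assembly name:

-- Register the assembly, then the managed type and aggregate
CREATE ASSEMBLY TemperatureDemo FROM 'C:\TemperatureDemo.dll'
CREATE TYPE Temperature EXTERNAL NAME TemperatureDemo.[Temperature]
CREATE AGGREGATE TempAdder (@input Temperature) RETURNS Temperature
EXTERNAL NAME TemperatureDemo.[TempAdder]

-- UDT values are inserted as strings; all are stored internally in Kelvin
CREATE TABLE Readings (Id int, Temp Temperature)
INSERT INTO Readings VALUES (1, '0C')
INSERT INTO Readings VALUES (2, '32F')
INSERT INTO Readings VALUES (3, '274K')

-- Byte ordering makes ORDER BY work; ToString and the properties are callable with dot syntax
SELECT Id, Temp.ToString() AS Temp FROM Readings ORDER BY Temp
SELECT dbo.TempAdder(Temp) AS TotalTemp FROM Readings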


 

Modeling as a Productivity Enhancer in Visual Studio Team System

It's been a theme for me over the past couple of weeks that people have mentioned they can't afford the time to do modeling. If you've done a lot of UML modeling, you know what I'm talking about. But it doesn't have to be that way. Now, just to ward off the UML flames: even when UML modeling seems like a waste of time, it may actually pay for itself if you uncover a requirement or a design flaw that you wouldn't have otherwise. Certainly catching that kind of thing earlier rather than later pays for itself. But that's not what I'm talking about.

In my days as an ERwin/ERX user, data modeling was done in that tool not only because it was good to visualize something quickly before committing to code. Using ERwin/ERX was just plain faster than cutting DDL manually - or heck, even using the database diagramming in SQL Server. One simple feature was foreign key migration: you drew a relationship from parent to child and bam - it copied down the primary key from the parent table and set it up as a foreign key on the child table.

For the VSTS class designer (or any of the designers, for that matter) to be successful, MS needs to make using them just plain faster than stubbing out the code manually. Visual Studio has a great editor, so they have their work cut out for them. It gets even better with things like code snippets. Why not enable some of the code snippets in the class designer? I can still create a single field wrapped up by a public property faster in the code editor using a snippet (in C#, “prop”) than in the class designer - the snippet expands to something like the sketch below - but I don't see why they couldn't add support for that in the class designer too.
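
For anyone who hasn't tried it, this is roughly what the C# “prop” snippet stubs out in Whidbey - the field and property names are placeholder defaults you tab through and replace:

private int myVar;

public int MyProperty
{
 get { return myVar; }
 set { myVar = value; }
}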

A feature I stumbled upon last week was the ability to override a member. Simply right-click on the derived class, select “Override Member”, and you'll see a list of members from the ancestor that are overridable. Select the member to override and bam - you have the code stub. This reminds me a bit of the Inherited Class Skeleton Generator. These are the kinds of productivity features that can make models/designers more usable, even if just for productivity's sake.

There are obviously some types of edits that are better performed in the code editor, such as actually editing code fragments. Other types of edits can be performed more appropriately in the model such as stubbing out the api/members of a class, overriding members, etc. Let's not forget the other types of edits which are much better done in a WYSIWYG designer such as the windows or web forms designer.

One thing I'd like to see come to the Class Designer is a flattened view of a class that collapses all inherited members into one layer. I'll call this the consumer, or IntelliSense, view of a class. It's helpful for viewing the entire interface to a class so I can see how the whole thing makes sense to a user. I would propose a greyed-out notation, or perhaps a lock icon, or something similar to the online help view of a class.

Copy & Paste Support in VSTS Class Designer/Whitehorse

You may have noticed that the refactoring menus that you see in the code editor are also available in the Class Designer. Furthermore, you can also copy & paste things from one class to another. So if you copy a property from one class to another, not only do you get the property added to the class, but also all the code in your getters and setters. OK, so this is a nice touch that can help with a refactoring effort that requires moving stuff around - but don't consider this code reuse :)