What Makes us Want to Program? Part 4

In my previous post, I started talking about using Microsoft technologies over PHP and open source technologies.  There were a couple of reasons why I chose to make the move.  First, from a development perspective, everything was object oriented.  PHP was just getting started with OOP at the time, and it wasn’t all that friendly.  Second, development time was generally cut at least in half because of the built-in controls of ASP.NET.  Third, the end result was a richer application experience for the same reason.  The final reason comes down to the data aspect.

Pulling data from a database in PHP wasn’t easy to do.  The built-in support was for MySQL, with very little to next to nothing for SQL Server.  That isn’t always a bad thing.  MySQL is free.  You can’t argue with that.  However, MySQL wasn’t what you would call ACID compliant.  That is, MySQL did not guarantee transactions that were Atomic, Consistent, Isolated, and Durable.  Essentially, when data goes missing, there is nothing you can do about it.  SQL Server, on the other hand, is fully ACID compliant.  This is something you want.  Period.

Once .NET 2.0 was released, a whole new paradigm came into play for data in a web application.  It was easy to access!  Little to no boilerplate coding was necessary for data access anymore.  Talk about a selling point.  Especially when the developer in question is 16 going on 17.

Now that I didn’t need to worry about data access code, I could start working on figuring out SQL.  At the time T-SQL scared the crap out of me.  My brain just couldn’t work around datasets.  The idea of working with multiple pieces of data at once was foreign.  I understood single-valued iterations.  A for loop made sense to me.  SELECTs and JOINs confused me.  Mind you, I didn’t start statistics in math until the following year.  Did SQL help with statistics, or did statistics help me finally figure out SQL?  It’s a chicken-and-egg paradox.
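Looking back, the mental leap was from row-at-a-time loops to set-based queries.  As a rough illustration, using Northwind-style tables (names are illustrative, not from any code I wrote at the time):

```sql
-- Iterative thinking: "for each order, go look up the customer's name."
-- Set-based thinking: describe the whole result at once and let the
-- engine figure out how to produce it.
SELECT c.CompanyName, o.OrderDate
FROM Customers AS c
JOIN Orders AS o
    ON o.CustomerID = c.CustomerID
WHERE o.OrderDate >= '1997-01-01';
```

One statement replaces the entire for loop, which is exactly the part that didn't click for me back then.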

So here I am, 17 years old, understanding multiple languages, building dozens of applications, and attending developer conferences, all the while managing my education in high school.  Sweet.  I had three years until the next release of Visual Studio.  It was here that I figured I should probably start paying more attention in school.  It’s not so much that I wasn’t paying attention; it’s just that I didn’t care enough.  I put in just enough effort to skate through classes with a passing mark.  It was also at this point in time that I made an interesting supposition.

Experts tend to agree that people who are programming geniuses are also good at math and critical thinking or reasoning.  Not one or the other, but both.  Now, I’m not saying I’m a programming genius, but I suck at math.  It was just never in the cards.  But according to all those high school exams and the psychological profiling gathered from them, my critical thinking and reasoning skills are excellent.  Top 10% in Canada, according to the exam results.  My math skills sit around the top 20–30%, depending on the type.

Neurologists place this type of thinking in the left hemisphere of the brain.  The left brain is associated with verbal, logical, and analytical thinking. It excels in naming and categorizing things, symbolic abstraction, speech, reading, writing, arithmetic.  Those who live in the left brain are very linear.  Perfect for a software developer.

The supposition I made had more to do with the Pre-Frontal Cortex of the brain.  It does a lot of work, some of which is planning complex cognitive behaviors.  Behaviors like making a list, calculating numbers, abstracting thoughts, etc.  It plans out the processes our brains use to get things done.  This is true for both sides of the brain.  So, suppose you are left brain-oriented.  You are predisposed to be good at development.  Now, suppose your Pre-Frontal Cortex is very well developed, more so than in the average person.  It could be reasoned that part of being a programming genius is having a well-developed Pre-Frontal Cortex.

So why does this make us want to program?  Find out in Part 5.

Open Source Windows

Some days you just have to shake your head and wonder. As it turns out, I'm a little late to hear about this, but nonetheless, I'm still shaking my head.

It turns out that Windows has gone open source. And (!!) it's not being made by Microsoft anymore. Well, Windows™ is still made by Microsoft. "Windows" is now made by a group under the guise of ReactOS.
"ReactOS® is a free, modern operating system based on the design of Windows® XP/2003. Written completely from scratch, it aims to follow the Windows® architecture designed by Microsoft from the hardware level right through to the application level. This is not a Linux-based system, and shares none of the Unix architecture."
So essentially, these people are taking the Windows architecture (based on XP/2003) and re-coding it from scratch. Not redesigning it, because redesigning would imply making something different. Sounds vaguely similar to, oh, something called Vista. Except uglier.



Now, that nagging feeling we are all getting right about now should be visualized as a pack of rabid lawyers. Isn't this considered copyright infringement? They outright define the product as a copy.

And what about the end users? Are all programs designed to run on Windows supposed to run on this ReactOS? Why bother with testing? The XP architecture is almost 8 years old now. That means anything designed to run on Vista, or soon to be designed to run on Windows 7, wouldn't stand a snowball's chance in hell of running on ReactOS.

I would love to see how a .NET application runs on it.

ADO.NET Entity Framework and SQL Server 2008

Do you remember the SubSonic project? The Entity Framework is kind of like that. You can create an extensible and customizable data model from any type of source. It takes the boilerplate coding away from developing Data Access Layers.

Entity is designed to separate how data is stored from how data is used. It's called an Object-Relational Mapping (ORM) framework. You point the framework at the source, tell it what kind of business objects you want, and poof: you have an object model. Entity is also designed to play nicely with LINQ. You can use it as a data source when querying with LINQ. In my previous post, the query used NorthwindModEntities as a data source. It is an Entity object.
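As a sketch of what that looks like in code, assuming a model generated from Northwind that exposes a Customers entity set (the context name comes from my earlier post; entity and property names are illustrative):

```csharp
// NorthwindModEntities is the context generated by the Entity Framework
// designer. The Customers entity set and its properties assume the
// Northwind schema.
using (var context = new NorthwindModEntities())
{
    var londonCustomers = from c in context.Customers
                          where c.City == "London"
                          orderby c.CompanyName
                          select c;

    foreach (var customer in londonCustomers)
    {
        Console.WriteLine(customer.CompanyName);
    }
}
```

Notice there's no connection handling, no command objects, no readers: the boilerplate is gone.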

[Diagram: Entity Framework architecture, courtesy of Wikipedia]

The Architecture, as defined in the picture:

  • Data source specific providers, which abstract the ADO.NET interfaces to connect to the database when programming against the conceptual schema.
  • Map provider, a database-specific provider that translates the Entity SQL command tree into a query in the native SQL flavor of the database. It includes the Store specific bridge, which is the component that is responsible for translating the generic command tree into a store-specific command tree.
  • EDM parser and view mapping, which takes the SDL specification of the data model and how it maps onto the underlying relational model, and enables programming against the conceptual model. From the relational schema, it creates views of the data corresponding to the conceptual model. It combines information from multiple tables in order to aggregate them into an entity, and splits an update to an entity into multiple updates to whichever tables contributed to that entity.
  • Query and update pipeline, processes queries, filters and update-requests to convert them into canonical command trees which are then converted into store-specific queries by the map provider.
  • Metadata services, which handle all metadata related to entities, relationships and mappings.
  • Transactions, to integrate with transactional capabilities of the underlying store. If the underlying store does not support transactions, support for it needs to be implemented at this layer.
  • Conceptual layer API, the runtime that exposes the programming model for coding against the conceptual schema. It follows the ADO.NET pattern of using Connection objects to refer to the map provider, using Command objects to send the query, and returning EntityResultSets or EntitySets containing the result.
  • Disconnected components, which locally caches datasets and entity sets for using the ADO.NET Entity Framework in an occasionally connected environment.
    • Embedded database: ADO.NET Entity Framework includes a lightweight embedded database for client-side caching and querying of relational data.
  • Design tools, such as the Mapping Designer, are also included with ADO.NET Entity Framework. These simplify the job of mapping a conceptual schema to the relational schema and specifying which properties of an entity type correspond to which table in the database.
  • Programming layers, which exposes the EDM as programming constructs which can be consumed by programming languages.
  • Object services, automatically generate code for CLR classes that expose the same properties as an entity, thus enabling instantiation of entities as .NET objects.
  • Web services, which expose entities as web services.
  • High level services, such as reporting services which work on entities rather than relational data.

LINQ and SQL Server 2008

No, Zelda is not back.  LINQ stands for Language Integrated Query. It's a set of query operators that can be called in any .NET language to query, project, and filter data from any type of data source. Sources include arrays, databases, IEnumerables, Lists, and so on, including third-party data sources. It's pretty neat.

Essentially LINQ pulls the data into data objects, which can then be used as you would use a Business Object. The data object is predefined by a LINQ Provider. Out of the box you have LINQ to SQL, LINQ to XML, and LINQ to Objects providers. Once you define the data object based on a provider, you can start querying data:

[Screenshot: LINQ query example]
Within the foreach loop the 'Customers' class is a data class that was defined based on a LINQ to SQL Provider. In this case, the database was Northwind.
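Since the screenshot doesn't reproduce well in text, here's a rough sketch of that sort of query, assuming a DataContext generated from Northwind by the LINQ to SQL designer (class and property names are illustrative):

```csharp
// NorthwindDataContext and the Customer class are generated by the
// LINQ to SQL designer; the query isn't sent to the database until
// the foreach starts enumerating the results.
var db = new NorthwindDataContext();

var customers = from c in db.Customers
                where c.Country == "Canada"
                select c;

foreach (var customer in customers)
{
    Console.WriteLine(customer.ContactName);
}
```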

Syntactically LINQ is very much like the SQL language, mainly because they both work on the same principle: query (possibly) large amounts of data and act on it appropriately. SQL is designed to work with large datasets. Most other languages work iteratively. So SQL was a good language to mimic.

However, there is a small problem that I see with LINQ. If I'm doing all the querying in the DAL instead of using things like Stored Procedures within the database, and I need to modify a query for performance reasons, the DAL has to be recompiled and redistributed to each application out in the field. That could be 10,000 different instances. Wouldn't it make more sense to keep the query within a Stored Procedure? Just a thought...

Protecting Data in Transit between applications and SQL Server

Alright, so you've just implemented Transparent Data Encryption on your database.  Your database is extremely secure.  The data, not so much.  You see, the problem is this: the data travels unencrypted between SQL Server and your application.  Whoops.
 
To enable SSL Encryption on the server side, there are a couple of fairly simple steps involved:
  1. In SQL Server Configuration Manager, expand SQL Server Network Configuration, right-click Protocols for <server instance>, and then select Properties.

  2. In the Protocols for <instance name> Properties dialog box, on the Certificate tab, select the desired certificate from the drop-down for the Certificate box, and then click OK.

  3. On the Flags tab, in the ForceEncryption box, select Yes, and then click OK to close the dialog box.

  4. Restart the SQL Server service.

To enable SSL Encryption on the client side:

  1. Copy either the original certificate or the exported certificate file to the client computer.

  2. On the client computer, use the Certificates snap-in to install either the root certificate or the exported certificate file.

  3. In the console pane, right-click SQL Server Native Client Configuration, and then click Properties.

  4. On the Flags page, in the Force protocol encryption box, click Yes.

Finally, set your connection string within the application to 'Use Encryption for Data=True':

Driver={SQL Native Client};
Server=myServerAddress;Database=myDataBase;Trusted_Connection=yes;Encrypt=yes;
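On the ADO.NET side, the same setting can be expressed through SqlConnectionStringBuilder; a minimal sketch (server and database names are placeholders, just like in the string above):

```csharp
using System.Data.SqlClient;

// Encrypt = true forces SSL encryption on the wire. By default the
// client also validates the server's certificate, which is why the
// certificate had to be installed on the client machine first.
var builder = new SqlConnectionStringBuilder
{
    DataSource = "myServerAddress",   // placeholder
    InitialCatalog = "myDataBase",    // placeholder
    IntegratedSecurity = true,
    Encrypt = true
};

using (var connection = new SqlConnection(builder.ConnectionString))
{
    connection.Open();
    // ... run commands as usual; the traffic is now encrypted.
}
```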
That's really not all that difficult. One more reason to have a more secure infrastructure!

Microsoft SQL Server 2008 Internals

 
It's almost released!  Only three and a half months to go!  Pre-order at Amazon.

TorontoSql.com, TorontoSql.net, TorontoSql.org just registered

Boredom is a bad thing!  Especially when you are putting off work.  So what do I do to waste my time?  Check out local user groups.  The websites, at least.  A few days ago I posted a few links to some promising groups.  To my disappointment, there really aren't that many Microsoft-oriented user groups in Toronto.  I wouldn't call it a bad thing.  More of an opportunity.
 
I discovered that TorontoSql.com, TorontoSql.net, and TorontoSql.org were not registered.  So for $30 I registered all three of them.  Now I have to put them to good use.  Currently they point to www.syfuhs.net, until I find them a proper home.
 
More to come on that front! 

On-Disk Data Structures in SQL Server

Just some links to read.  Carefully.  There's a lot of information.  These posts were made by Paul S. Randal on www.sqlskills.com.  I'll do a more thorough job of weeding out information when I'm not strapped for time.

Toronto and Area User Groups of Interest

Since moving to Toronto I have been looking for user groups that I think I could benefit from.  So far I have found a couple of interest:
 
TSQL.CA - Toronto SQL Server User Group - http://www.tsql.ca/Default.aspx?base
Toronto .NET User Group - http://www.torontoug.net/
ISSA Toronto (Information System Security Association) - http://www.issa-toronto.org/ 
 
I'm still looking, but these look promising.  
 

Floating down the FILESTREAM...

The old CCR lyrics popped into my head ("Rollin' down the river...") and then of course I realized a river isn't a stream, nor is it in any way related to data streams, and I lost my concentration for 20 minutes-ish.  What does this have to do with FILESTREAM?  Nothing, really.
 
Let's talk a little about filestreams in SQL Server 2008.  So, what is FILESTREAM?  According to Microsoft, FILESTREAM is a technology "which allows storage of and efficient access to BLOB data using a combination of SQL Server 2008 and the NTFS file system".  That's all fine and dandy, but what does it really mean?  FILESTREAM, very eloquently put, stores large BLOBs of data on the file system.
 
In previous versions of SQL Server, you didn't have all that much choice when it came to working with large BLOB data.  Your options were three-fold.
 
1. Store it in the database.
2. Store it on the file system.
3. Store it in a BLOB store.
 
Storing the data in the database was tricky.  If the data was small in size, it worked out fairly well.  But in instances where you had to store large pieces of data, such as videos or audio files, keeping the data within the database was a major blow to performance.
 
So what is the next best option?  Store the data on the file system, and leave a pointer in the database.  You didn't have performance issues, and data was streamed directly from the file system.  However, you did have problems with data consistency.  You didn't benefit from the transactional properties of SQL Server.  Backups were a pain, and managing the streaming was tricky from a programmatic perspective.
 
Next up, and the final option: a BLOB store.  Basically, it worked much like the file system approach, except it was more manageable and had transactional characteristics due to a tie-in with the database.  An example is the EMC Centera.  Drawback: fricken EXPENSIVE.  Worked great though...
 
Now onto SQL Server 2008.  FILESTREAM works very much like option number 2, has transactional benefits like option number 1, and is cheaper than option number 3.  How does it do it?  It's actually pretty simple.  A column of VARBINARY(MAX) is created in a table, with the FILESTREAM attribute.  Once data is stored within the column, SQL Server writes the file out to disk and keeps a pointer in the column.
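A minimal sketch of such a table, assuming FILESTREAM has already been enabled on the instance and the database has a FILESTREAM filegroup (all names here are illustrative):

```sql
-- A FILESTREAM column must be VARBINARY(MAX) with the FILESTREAM
-- attribute, and the table needs a ROWGUIDCOL column with a unique
-- constraint so SQL Server can tie rows to files on disk.
CREATE TABLE dbo.Documents
(
    DocumentID   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL
                 UNIQUE DEFAULT NEWID(),
    DocumentName NVARCHAR(255) NOT NULL,
    Content      VARBINARY(MAX) FILESTREAM NULL
);
```

From the application's point of view it still looks like an ordinary VARBINARY(MAX) column; the on-disk storage is SQL Server's problem.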
 
So far that takes care of replacing option number 2.  So how is transactional consistency dealt with?  Easy.  NTFS as a file system is transactional by nature, so FILESTREAM takes full advantage.  Backing up is dealt with the same way as all databases are, and is done transparently.  That takes care of option number 1.
 
Now comes the issue of price.  Well, it turns out FILESTREAM is available for free.  It is built in, starting with SQL Express.  That's all I have to say about that, which takes care of option number 3.
 
That, in a nutshell, is SQL Server FILESTREAM.  My next post will be on how to actually implement FILESTREAM on a 2008 database.