BizTalk, Another Custom XSLT mapping sample.

The mapping example below could most likely be accomplished using a series of BizTalk maps, with the Table Looping, Table Extractor, and Looping functoids (among others) performing the transformation. Sometimes, though, it makes sense to craft your own custom XSLT to execute the complete transformation. The sample input and output messages for the transformation are below, along with the custom XSLT. You can also download the source code (at the end of this blog entry).

INPUT Message:
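(The original sample input message is included in the source code download. As a minimal illustration, an input using the element names from the mapping rules below, with an assumed placeholder root element, could look like this:)

<peticion>
  <contenedor>...</contenedor>
  <contenedor>...</contenedor>
  <prueba>
    <parametro>...</parametro>
    <parametro>...</parametro>
  </prueba>
  <prueba>
    <parametro>...</parametro>
    <parametro>...</parametro>
    <parametro>...</parametro>
  </prueba>
</peticion>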

--> For each "contenedor" node, the output message must repeat the sequence of segments ORC + RQD (according to the HL7 v2.3.1 standard)
--> For each "prueba" node, the output message must repeat the sequence of segments ORC + OBR
--> For each "parametro" node, the output message must repeat the segment NTE

For the output below, the order of the nodes is also important; the order relates back to the input message.

OUTPUT Message should be:

MSH||||.|
ORC||||.|
RQD||||.|
ORC||||.|
RQD||||.|
ORC||||.|
OBR||||.|
NTE||||.|
NTE||||.|
ORC||||.|
OBR||||.|
NTE||||.|
NTE||||.|
NTE||||.|


The custom XSLT for the mapping/transformation looks like below:
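(The complete XSLT is included in the download. Below is a hedged skeleton of its likely shape, assuming the illustrative input structure above and emitting one output element per HL7 segment, with the field-level mapping elided:)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- The MSH header segment is emitted once -->
    <MSH/>
    <!-- One ORC + RQD pair per contenedor node, in input document order -->
    <xsl:for-each select="//contenedor">
      <ORC/>
      <RQD/>
    </xsl:for-each>
    <!-- One ORC + OBR pair per prueba node, followed by one NTE per child parametro -->
    <xsl:for-each select="//prueba">
      <ORC/>
      <OBR/>
      <xsl:for-each select="parametro">
        <NTE/>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>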

The above XSLT can be placed in a Scripting functoid in the map, as discussed HERE.
Alternatively, the custom XSLT can be placed in a separate file, with the map's Custom XSLT Path property pointing to the file (see below).

Download the solution HERE

Conclusion: In some cases it is much simpler to write your own custom XSLT for a BizTalk map.
Note: You can also perform a transformation in a BizTalk map using a combination of custom XSLT and links/functoids (see the links below).
Note: Functoids, alone or in combination, are sometimes not enough to perform a transformation or a portion of one. Using custom XSLT for the transformation, or for a portion of it, may be the answer.

Below is more information on using custom XSLT in BizTalk maps:

Extending Mapper (BizTalk Server SDK Samples)

Creating Nodes from Flat Data in BizTalk 2004 Maps

Logical Message Splitting in BizTalk Server 2004

Implementation of Message Transformation Normaliser Pattern in the BizTalk 2004

Sorting XML Nodes with the BizTalk Mapper

Debugging XSLT

New BizTalk 2006 5 day Course -> Building BizTalk 2006 Solutions Best Practices

I am in the process of writing a new BizTalk 2006 course, outlined below:

Preamble

BizTalk is all about messaging: how messages are processed and routed between a company's internal applications (for example: SAP, mainframe, PeopleSoft, database, JD Edwards, Web Services, etc.) and how messages are exchanged with trading partners (B2B). BizTalk provides an excellent framework for implementing many common messaging patterns, such as:
a) Scatter/Gather pattern
b) Splitter pattern
c) Aggregator pattern
d) Message Broker pattern
e) etc.
The objective of this course is to provide students with all the necessary tools to implement messaging patterns using BizTalk 2006.

Below are some of the topics with labs for the course:

a) Various methods to create schemas for XML and flat file messages.
b) Mapping using the BizTalk Mapper, custom XSLT and other alternatives for performing complex transformations.
c) An introduction into building Custom Adapters
d) An introduction into building Custom Pipeline Components
e) Interacting with the BizTalk Messaging Engine
f) Creating and using Custom Pipelines
g) Orchestrations.
h) Using XPath in Orchestrations.
i) Web Services and BizTalk
j) Error Handling with BizTalk
k) Debugging BizTalk Solutions
l) Deploying BizTalk Solutions
m) Business Activity Monitoring and BizTalk
n) Business Rules Engine
o) Implementing Common Enterprise Integration Patterns with BizTalk.
Some of the patterns discussed and put into practice are:
Message Broker
Content Based Routing
Splitter Pattern
Normalizer Pattern
Aggregator Pattern
 
Additionally the new features of BizTalk 2006 will be highlighted and used in Labs.

It is quite a bit for a five-day course (I am not sure all of it will make it in). From my experience using the product and teaching a BTS 2004 course for the past year and a half, the above is what I would like to cover in the course.

If you have a suggestion for additional topics for the course, please reply to this entry. I would love to hear them.

Update: This course is available now. Building BizTalk Server 2006 Solutions Best Practices

 

BizTalk, MIB, Acord and 5 minutes of Marketing

My blog entries are usually technical in nature. This entry is more of a marketing entry discussing how BizTalk 2004/2006 can be used to implement the following specific business scenario:

1) An insurance underwriter requires information (for example -> medical history) about an insurance applicant.

2) An internal application captures applicant information provided by the underwriter such as:
a) Applicant's First Name/Last Name/Middle Name.
b) Birth date.
c) Birth place.

3) The internal application then sends this information to a company called MIB. The underwriter's query parameters (Last Name/First Name, etc.) are then used to query a data repository at MIB, and a result set is returned from MIB to the internal application. This result set contains information such as the medical history of the applicant.

4) The results from MIB are then displayed to the insurance underwriter to aid in the insurance application process.

Presently, many North American insurance companies send and receive information to and from MIB over a dial-up connection, with flat files sent to and received back from MIB.

By December of 2006, all users of the current dial-up process are to switch over to a new process provided by MIB, as below:

All insurance underwriter requests are to be transmitted to MIB using a secure HTTP connection. The requests are to be sent in XML messages that conform to the Acord schemas. A synchronous HTTP response (XML) message will then be sent back over the same secure HTTP channel as the request. This response message contains the results of the query.

Below is a description of how the above new process can be implemented by BizTalk 2004 or BizTalk 2006.

1) Create a new BizTalk project that contains the Acord XSD schemas as below:


2) Typically, a mainframe application dumps the underwriter requests to disk, in a flat file or another format. A BizTalk FTP or File Receive Port is then configured to pick up the files.
 
Note: The original request may be delivered by a different transport or stored to an intermediate location, such as:


HTTP
MSMQ
MQSeries
Sql Server
Oracle
DB2
Host Applications  -> IBM, mainframe zSeries (CICS and IMS) and/or midrange iSeries (RPG)
Host Files -> Data adapter for host file systems on IBM mainframe zSeries VSAM datasets and midrange iSeries physical files
etc.


BizTalk contains many out of the box adapters that can be used to receive messages from all the above transports/locations and others not listed. If the needed adapter is not provided out of the box, then the BizTalk Adapter framework can be used to author your own custom adapter. Third party adapters can also be purchased.

3) Create internal XSD schemas to represent the internal messages.
For example, an XSD schema using flat file extensions would be created  to represent flat files produced by the mainframe application.

4) Create BizTalk map(s) to transform the internal flat files to Acord XML messages.

5) Create the necessary BizTalk orchestrations for the business process. 

6) Install a certificate on the BizTalk machine for the MIB secure HTTP requests using the Certificate Snap-in.

7) Create a BizTalk Solicit-Response Send port with a configured HTTP adapter for the MIB secure HTTP requests. The Send port is configured to use the installed certificate from step 6).

8) Create BizTalk map(s) to transform the Acord XML response messages to internal messages.

The above pretty well covers all the required BizTalk objects.

So why use BizTalk for the above MIB request/response loop or a similar business process?

1) This is what BizTalk Server does: it integrates a company's internal applications and also provides B2B capabilities.

2) A number of turnkey solutions can be purchased that implement the same communication loop with MIB. These turnkey solutions are probably more expensive and more proprietary than a BizTalk solution. There is little coding involved in building the above BizTalk solution, and there is complete control over the process, because your own developers build and maintain it. For changes or modifications to the BizTalk process, a configuration change is sometimes all it takes.

For BizTalk 2006 pricing go to HERE
For BizTalk 2004 pricing go to HERE

I have heard claims that it took more than 1000 man-hours to complete the above process in BizTalk. This is completely untrue, as it took me a fraction of 1000 hours to implement the above MIB communication loop using BizTalk.

3) The BizTalk server that hosts the above process can be used to host other Application Integration or B2B business processes.

4) BizTalk Server is growing rapidly in popularity, and more and more developers have BizTalk experience.

5) BizTalk provides many useful out of the box features such as:

a) Business Activity Monitoring
b) Rules Engine
c) Fault Tolerance/Load Balancing
d) Tracking
e) etc. etc.


Note: I do not mean to sugar-coat the process. The BizTalk maps for the MIB request and response messages are somewhat complex. The complexity of the maps can be attributed directly to the large size of the Acord XSD schemas. There are larger XSD schemas out there, though; in my opinion, the HL7 XSD schemas are more complex than the Acord XSD schemas.

Vancouver Launch of Visual Studio and SQL Server 2005

I'm in British Columbia for a few days for the Vancouver stop of the Canadian Launch of Visual Studio and SQL Server 2005. They are expecting a great turnout - should be one of the largest MS events in town in recent memory. Last night we had a User Group reception and I got a chance to meet some of the local community leaders and technorati.

  • Rob Chartier is a smart guy who is going to be working at the Ask the Experts Cabana area as well. He is also working on a Code Camp out here in Vancouver on March 18th. Registration is now open.
  • Shaun Walker of DotNetNuke open source portal fame was also there. We had some interesting conversations around the challenges of managing an open source project. I was happy to hear how commercially successful they were as well.
  • My friend Mike Flasko was also up from Redmond. He is now the Program Manager for the System.Net team and is doing well in his new role at Microsoft. They have some pretty exciting stuff in the works for Orcas and beyond. He also has an open call on his blog for feedback on what you want in Orcas for System.NET.
  • I also had a chance to meet Graham Jones who runs Vantug out here.

All in all it was a fun evening. Ilya Bukshteyn is up from Redmond to do the keynote presentation, which I'm looking forward to. John Bristowe and Ilya are sure to have some lively banter during the demos.

Upgrading to WSE 3.0

If you weren't aware, the version of WSE 3.0 built against the release version of Visual Studio was released on Nov 7. I'm sure that it would have received launch parties and wide mention in blogs were it not for the conflicting launch of Visual Studio 2005, SQL Server 2005 and BizTalk 2006. :)

If you did any work with the CTP versions of WSE, or if you are upgrading from WSE 2.0, there is one thing you should definitely be aware of. Specifically, I had created a WSE policy that uses Anonymous over Certificate security. It was working in the CTP version. Then I upgraded to the release version, and all of a sudden I received the following error message on the client side:

WSE910: An error happened during the processing of a response message, and you can find the error in the inner exception. You can also find the response message in the Response property.

The inner exception was an InvalidOperationException with a Message of:

Security requirements are not satisfied because the security header is not present in the incoming message.

After a fair bit of digging, I determined that this message is actually a "can't validate the user's credentials" message. But the real cause (because, after all, I was still sending the same credentials that I had prior to upgrading) was a little more subtle. In order for the WSE functionality to work in 3.0, a new element needs to be added to the web.config of the service.

<system.web>
   <webServices>
      <soapServerProtocolFactory type="Microsoft.Web.Services3.WseProtocolFactory, Microsoft.Web.Services3, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />
   </webServices>
</system.web>

When this tag was added, the message went away. To be fair, this tag is documented as one of the changes between WSE 2.0 and 3.0. What surprised me is that it also appears to be one of the changes since the earlier CTP versions of WSE 3.0. Hopefully this post will help others avoid my not-so-brief struggle.

Migrating ASP.NET 1.x sites to ASP.NET 2.0

One of the questions we got during the Q&A of the Ottawa VS Launch yesterday was around problems in migrating ASP.NET applications from 2002/2003 to 2005.

The Web Platform Team has put together a nice step-by-step guide that covers some best practices to ensure a successful migration effort which should take you “the better part of a day” according to them. http://msdn.microsoft.com/asp.net/default.aspx?pull=/library/en-us/dnaspp/html/webprojectsvs05.asp

Hello 2.0

This past week we saw the final bits of SQL Server 2005 and Visual Studio 2005 get shipped up to MSDN Subscriber Downloads. Next week we'll see the official launch of these same products to the rest of the world, ushered in with a rolling thunder of launch events and parties stretching into the rest of the month and beyond. Microsoft does a great job of fostering community with events like this.

Technically there is a lot to like about the updates to the platform, and I share most of Joel's top picks. I've been building applications, consulting and teaching developers on this platform for 4 years now, and it feels almost legacy, if not commonplace, to me now. However, in many people's eyes, this is a critical moment in time: .NET is no longer a 1.0 product. Of course I'm speaking about groups who are not developing anything significant in .NET today; this maturity milestone allows them into this "new" world.

We've been watching the adoption and market maturity of .NET closely for the past few years, and a bit to my surprise I'm starting to see a lot of groups come to .NET for the very first time with 2.0.

This coming Tuesday I have the great pleasure of being involved in ushering in this new era at the Toronto launch, where we are expecting between 3,000 and 4,000 developers and IT professionals to come together. Early statistics show that somewhere between 35-50% of these folks are new to .NET. Similar events are taking place all over the world during this week, stretching out into December and beyond. For Canada, Toronto is just the first stop in a long list of cities from coast to coast. Personally, I'll be presenting at Toronto, Ottawa, Vancouver, Montreal, Quebec City and Halifax.

The overwhelming registration statistics in all cities tell me two things: firstly, that .NET 2.0 is going to be adopted very quickly; secondly, and more importantly, that the software development industry in Canada is vibrantly growing and that indeed... Software Matters!

Consider two things:

  • Software costs a lot of money to design, build, test and deploy. Much more than it should.
  • Software projects fail at an alarming rate. Failure can be defined as any of the following: Late, Over Budget, Under Functionality, Buggy, Doesn't meet requirements.

Yet despite these two glaring issues, the business value of software is so compelling that people are willing to keep investing in building software at increasing rates.

And then there is Visual Studio and SQL Server 2005:

  • One of ASP.NET 2.0's design goals was to reduce the number of lines of code in a typical application by over 50%.
  • SQL Server 2005 has been enhanced to be more reliable and secure, while at the same time bringing the 4GL productivity associated with C#, VB.NET and the .NET Framework into the database engine itself.
  • Visual Studio Team System 2005 was built from the ground up to help projects stay on track by integrating developers, architects, testers, project managers and other stakeholders into a common extensible repository known as Team Foundation Server.

Coincidence? I hope not ;)

Processing a Large Flat File Message with BizTalk and the SqlBulkInsert Adapter

Below is a discussion of the second demo from a presentation that I did at the Business Process Integration & Workflow Conference in Redmond during the week of Oct 4.

The presentation was entitled:
Handling Large Messages in BizTalk

The first demo that I did for the presentation can be found here:
Processing a large message and a faster, less CPU intensive splitter pattern.

This entry compares two different methods for processing large flat files in BizTalk.
The flat files being tested ranged in size from 8 MB to 201 MB.

At the end of this entry is the download for the SqlBulkInsert adapter code and the BizTalk test project used to perform the tests.

A section of the flat file message being processed is shown below (about half of the columns are cut off):

The above is the flat file representation of the XML production orders message that was used in the first demo.
Each line of the flat file message contains a production order. The production order data (columns) are separated by commas, and each production order line (record) is terminated by a carriage return/line feed.

To process the flat file message, two different implementations were used. In both implementations, the flat file is pre-processed so that it is in a format suitable for further processing -> for example, mapping or splitting.


Implementation One: Processing with a Flat File Schema and Custom Receive Pipeline

This implementation uses the standard method of using an XSD schema with flat file extensions to represent the above flat file message. A custom receive pipeline utilizing a flat file disassembler transforms the flat file into an internal XML message, which is then published into the MessageBox database. If you have not processed a flat file in BizTalk before, try this example in the SDK.

This is just a short recap of how this implementation works, really for the benefit of those not familiar with BizTalk or who have never processed a flat file in BizTalk.

a) Flat File is picked up by a Receive Location.
b) Receive Location is configured to use a Custom Receive Pipeline
c) The Custom Receive Pipeline will convert (disassemble) the flat file into an internal XML message, as below:


d) This XML Message is then published into the MessageBox database
e) Once the XML message is published into the MessageBox database, any number of Orchestrations or Send Ports can subscribe to it and process it. The message could then be mapped into another message or split into separate messages, as in the first demo.


Implementation Two: Processing with the SqlBulkInsert Adapter

From a performance perspective, there are three areas where the above implementation can be improved. This is especially true if the incoming flat file messages are large (greater than 20 MB).
The areas of improvement are listed below:

1) The pipeline processing time.
2) CPU utilization is at 100% for pipeline processing.
3) Flat files, when converted into XML, can double or triple in size. For example, when the 201 MB flat file production order message is converted into an internal XML message, the size of the internal XML message expands to 767 MB, and this large 767 MB message is then published into the MessageBox database. The size of the XML message is greater because of the added tags, elements and attributes.

Therefore, one approach would be to skip the pipeline processing for the large flat file message. Pipeline processing is a necessary and great feature in BizTalk, but it will be skipped in this implementation to increase performance.

A Custom Adapter (SqlBulkInsert Adapter) will be used to pre-process the flat file.
This adapter can only be used with a One-Way Receive Port. The property pages of the Receive Location are as below:


How this implementation works:

a) This adapter behaves similarly to the out of the box File Adapter used in a Receive Location. It will poll a directory on a hard drive for files to be picked up and processed.
 
b) The FileReceiveFolder Property is set to a directory on a hard drive.
For example -> C:\BTSBulkLoad\Files\ReceiveForSqlBulkLoad
Flat Files dropped into this folder will be picked up to be processed.

c) The FileMask Property is set so the adapter will pick up only files matching the mask.
For example -> *.txt

d) The FileErrorFolder Property is set to the folder that will be populated with error files.
For example -> C:\BTSBulkLoad\Files\FileErrorFolder.
If some of the rows (data) in the flat file message are malformed, they will not be processed, but will be placed in a file for viewing. This is analogous to a recoverable interchange: rows in the flat file that are correct in structure will still be processed.

For example, if two of the rows in the flat file cannot be processed (missing columns, missing delimiters, etc.), a file is produced with the rows that could not be processed:

11506,10020,TST GF 01 GRD 01,7045936818561,20.25,20.25,58.25,58.250359010995972,3 Inch Core,,,
11665,10020,TST GF 01 GRD 01,36,23584,23584,1432.7045936818561,1432.7045936818561,20.25,20.25,5

Additionally, another file is produced with a more detailed explanation of why the rows could not be processed, as below:

Row 16 File Offset 3822 ErrorFile Offset 0 - HRESULT 0x80004005
Row 25 File Offset 6422 ErrorFile Offset 493 - HRESULT 0x80004005

The maximum number of rows with errors can be configured (explained below). If this number is exceeded, the operation will fail as a whole.

Once a file has been picked up, the SqlBulkInsert adapter will use the below properties to delegate the processing of the flat file to a Sql Server stored procedure. The large file will not be submitted to the BizTalk engine to be processed; therefore, BizTalk pipeline processing is skipped for the large file.


e) The SqlConnectionString Property is set to the connection string of the Sql database that will process the flat file.
For example -> packet size=4096;integrated security=SSPI;data source="(local)";persist security info=False;initial catalog=BTSSqlBulkLoad 

f) The SqlStoredProcedureName Property is set to the name of the stored procedure that will process the flat file.
For example -> LoadInsertProductionOrders

The declaration of the stored procedure is as below:

CREATE Procedure [dbo].[LoadInsertProductionOrders]

@pathAndFileName  varchar(500),
@pathAndErrorFileName varchar(500)


g) The SqlStoredProcFileNameParameter Property is set to the name of the stored procedure parameter that accepts the path of the file to be processed.
For example -> @pathAndFileName

At run time, when the adapter picks up the file, it will generate the following value to populate the @pathAndFileName stored procedure parameter:
For example -> C:\BTSBulkLoad\Files\ReceiveForSqlBulkLoad\FlatFileToProcess.txt

Note: Only a pointer to the flat file to be processed is passed to the stored procedure.
Note: UNC paths should also work, but I have not actually tried them.


h) The SqlStoredProcFileErrorParameter Property is set to the name of the stored procedure parameter that indicates where the error files should be placed. See part d).
For example -> @pathAndErrorFileName

At run time, when the adapter picks up the file, it will set the @pathAndErrorFileName stored procedure parameter to a value.
For example -> C:\BTSBulkLoad\Files\FileErrorFolder


At run time, when the File Receive portion of the adapter has picked up a file to process, the following code is called in the adapter to invoke the stored procedure:


internal static string callBulkInsertStoredProcedure
   (string connectionString,
   int commandTimeOut,
   string storedProcedureName,
   string fileName,
   string fileErrorName,
   string fileNameParameterName,
   string fileErrorNameParameterName)
{
 System.Text.StringBuilder sbForReturnedXML = new System.Text.StringBuilder();

 // The using blocks ensure the connection and command are disposed, even on failure.
 using (SqlConnection sqlConnect = new SqlConnection(connectionString))
 using (SqlCommand sqlCommand = new SqlCommand(storedProcedureName, sqlConnect))
 {
  sqlCommand.CommandType = System.Data.CommandType.StoredProcedure;
  sqlCommand.CommandTimeout = commandTimeOut;

  // Dynamically set the names of the parameters passed to the custom stored procedure,
  // because each stored procedure may declare its parameters with different names.
  sqlCommand.Parameters.Add(new System.Data.SqlClient.SqlParameter(fileNameParameterName, System.Data.SqlDbType.VarChar, 500));
  sqlCommand.Parameters.Add(new System.Data.SqlClient.SqlParameter(fileErrorNameParameterName, System.Data.SqlDbType.VarChar, 500));
  sqlCommand.Parameters[fileNameParameterName].Value = fileName;
  sqlCommand.Parameters[fileErrorNameParameterName].Value = fileErrorName;

  sqlConnect.Open();

  try
  {
   // The stored procedure returns a small XML message; read it node by node
   // and accumulate it into a string.
   using (System.Xml.XmlReader xmlReader = sqlCommand.ExecuteXmlReader())
   {
    xmlReader.MoveToContent();
    while (!xmlReader.EOF)
    {
     sbForReturnedXML.Append(xmlReader.ReadOuterXml());
    }
   }
  }
  catch (System.Exception ex)
  {
   System.Diagnostics.Debug.WriteLine("Error occurred in adapter: " + ex.Message);
   throw; // rethrow without resetting the stack trace
  }
 }

 return sbForReturnedXML.ToString();
}

The above is just basic ADO.NET code that calls the stored procedure, which returns a small XML message generated by the stored procedure. This XML message can contain any information, and it is this small XML message that will be published into the MessageBox database (more about this below).

The stored procedure configured in the receive location is then executed. This is where the real processing of the flat file takes place.
Note: This stored procedure could be altered in any way to process the flat file; the below is just an example.

CREATE Procedure [dbo].[LoadInsertProductionOrders]

@pathAndFileName  varchar(500),
@pathAndErrorFileName varchar(500)

-- This stored procedure takes the path of the flat file (passed in by the adapter) to process.
-- For example c:\directory\FlatFile.txt.
-- The code in this procedure will then Bulk Insert this flat file into a Sql Server table.
-- This stored procedure was originally created in Sql 2000, but was moved to Sql 2005
-- for the following reasons:

-- a) Sql 2005 has introduced a new argument for Bulk Insert Clause [ [ , ] ERRORFILE = 'file_name' ] 
-- This will place malformed rows into an error file.

-- b) Sql 2005 has introduced Try Catch Blocks in T-Sql. In Sql 2000, if there were any malformed
-- rows in the flat file, an error would be raised to the .Net Helper Component and the operation would cease.
-- With the Try Catch Block in Sql 2005, no error is raised to the .Net Helper Component and processing continues.


As

-- Flat File rows will be inserted into this temp table.
-- Note: A format file could have been used, but this temp table matches the structure of the flat file.
-- Eventually, the rows in this temp table will be inserted into a permanent table.

 

CREATE TABLE #tempInsertNewProductionOrders (
 [trk_unit_id] [int] NOT NULL ,
 [pro_product_id] [int] NOT NULL ,
 [actual_grade] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [actual_basis_weight] [float] NULL ,
 [actual_length_at_turnup] [float] NULL ,
 [actual_length] [float] NULL ,
 [actual_weight_at_turnup] [float] NULL ,
 [actual_weight] [float] NULL ,
 [required_width] [float] NULL ,
 [actual_width] [float] NULL ,
 [required_diameter] [float] NULL ,
 [actual_diameter] [float] NULL ,
 [actual_core] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [actual_property_1] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [actual_property_2] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [actual_property_3] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [update_timechain] [smallint] NULL ,
 [update_time] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [update_user_id] [int] NULL ,
 [position_index] [int] NOT NULL ,
 [comment] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [required_length] [float] NULL ,
 [required_weight] [float] NULL ,
 [actual_mfg_grade] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [actual_moisture] [float] NULL ,
 [actual_caliper] [float] NULL ,
 [actual_colour] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [actual_finish] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
 [set_number] [int] NULL ,
 [position_percent] [int] NOT NULL ,
 [tare_weight] [float] NULL ,
 [user_scale_weight] [float] NULL ,
 [wire_side] [char] (3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
 [trkc_length_adjust_type_id] [int] NULL ,
 [actual_compression] [float] NULL ,
 [actual_hardness] [float] NULL ,
 [sch_prod_order_id] [int] NULL ,
 [trk_set_item_id] [int] NULL ,
 [trk_unit_id_package] [int] NULL )


declare @return int
declare @rowcount int, @error int

declare @sqlStatement varchar(8000)


-- Create the dynamic sql that contains the Bulk Insert Statement to Bulk Load the flat file into the Temp Table
-- An argument could have been used to control the max number of error rows, as below:
-- MAXERRORS [ = max_errors ]
-- MAXERRORS, Specifies the maximum number of errors that can occur before the bulk copy operation is canceled.
-- Each row that cannot be imported by the bulk copy operation is ignored and counted as one error. If max_errors is not specified, the default is 10.


set @sqlStatement = 'BULK INSERT #tempInsertNewProductionOrders'
set @sqlStatement = @sqlStatement +    ' FROM ' +  ''''+ ltrim(rtrim(@pathAndFileName))  +''''
set @sqlStatement = @sqlStatement +  ' WITH '
set @sqlStatement = @sqlStatement +     '  ( '
set @sqlStatement = @sqlStatement +  ' FIELDTERMINATOR = '  +   '''' + ',' + '''' + ','
set @sqlStatement = @sqlStatement  + ' ROWTERMINATOR = ' +  ''''  +  '\n'   + '''' + ','
set @sqlStatement = @sqlStatement  + ' ERRORFILE = ' +  ''''  +  ltrim(rtrim(@pathAndErrorFileName ))   + ''''
set @sqlStatement = @sqlStatement  +   ')'
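
-- For the sample paths shown earlier (parts b and g), the statement generated
-- above would be (illustrative):
--
--   BULK INSERT #tempInsertNewProductionOrders
--   FROM 'C:\BTSBulkLoad\Files\ReceiveForSqlBulkLoad\FlatFileToProcess.txt'
--   WITH ( FIELDTERMINATOR = ',',
--          ROWTERMINATOR = '\n',
--          ERRORFILE = 'C:\BTSBulkLoad\Files\FileErrorFolder' )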


-- Temp Table for Sending back results

create table #results (rowsprocessed int,
           ErrorCode int,
           BatchIdentifier varchar(50))

-- Create a unique GUID, that will be stored in each row.
-- This is to differentiate the rows from other batches.
-- This GUID, will be returned back to the Adapter in a small XML message:

declare @newid uniqueidentifier
set @newid = newid()
declare @UniqueIdentifier varchar(50)
set @UniqueIdentifier = replace(convert(varchar(50), @newid),'-','')

 

Begin Try
 -- Execute the Bulk Insert Statement
 exec (@sqlStatement)

 -- Insert the rows from the temp table into the permanent table.
 -- For each row, also set the Batch Guid.
 Insert into InsertNewProductionOrders
 Select  #tempInsertNewProductionOrders.*,@UniqueIdentifier from #tempInsertNewProductionOrders

 Select @rowcount = @@rowcount,@error = @@error 
 Insert into #results values(@rowcount,@error,@UniqueIdentifier)
  
End Try
Begin Catch

    -- Catch any errors and re-raise 

    DECLARE @ErrorMessage NVARCHAR(400);
    DECLARE @ErrorSeverity INT;
    DECLARE @ErrorState INT;


    SELECT @ErrorMessage = ERROR_MESSAGE();
    SELECT @ErrorSeverity = ERROR_SEVERITY();
    SELECT @ErrorState = ERROR_STATE();

    RAISERROR (@ErrorMessage, -- Message text.
               @ErrorSeverity, -- Severity.
               @ErrorState -- State.
               );
End Catch;

-- Send back a small informational XML message to the Adapter.
-- This XML message will be published in the Messagebox Database:

WITH XMLNAMESPACES ( DEFAULT 'http://SqlBulkInsert')
Select rowsprocessed,  ErrorCode,BatchIdentifier,
  (Select Distinct sch_prod_order_id  ProductionOrderId
   from #tempInsertNewProductionOrders  ProductionOrder
   For XML Auto ,Type )
from #results  Results
For xml auto, ROOT('BulkInsertResults')

-- Note: with Sql 2005 you can:
-- a) Place a Root node around the returned XML
-- b) Specify a Target Namespace.
-- c) Do nesting with an XML Auto Statement.

GO
SET ANSI_NULLS OFF
GO
SET QUOTED_IDENTIFIER OFF

A sample of the XML returned from this stored procedure is as below:
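
(Illustrative only; the values below are made up. Given the FOR XML query above, the returned message has this shape:)

<BulkInsertResults xmlns="http://SqlBulkInsert">
  <Results rowsprocessed="29123" ErrorCode="0" BatchIdentifier="6F9619FF8B86D011B42D00C04FC964FF">
    <ProductionOrder ProductionOrderId="11506" />
    <ProductionOrder ProductionOrderId="11665" />
  </Results>
</BulkInsertResults>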

The above XML message is then passed back to the adapter, and this small message is published into the MessageBox database.
Note: When authoring this stored procedure, any information can be returned in this XML message.


Once this small XML message is published into the MessageBox database, any number of orchestrations can subscribe to it. In this particular case, the main message (now residing in a table in a Sql Server database) is to be split into separate messages by using the distinct production orders returned in the above XML message. A discussion of this pattern can be found here. The production order records in the Sql Server table could also be transformed (mapped) by using Select statements with XML Auto and XML Explicit clauses, as sketched below.
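
A hedged sketch, not from the original download: assuming the batch GUID appended during the load is stored in a column named BatchIdentifier (the parameter names here are also mine), one production order's rows could be pulled back as XML like this:

-- Fetch one production order's rows from the bulk-loaded table as XML,
-- keyed by the batch GUID and order id returned in the small message.
SELECT trk_unit_id, pro_product_id, actual_grade, actual_weight
FROM InsertNewProductionOrders AS ProductionOrder
WHERE BatchIdentifier = @BatchIdentifier
  AND sch_prod_order_id = @ProductionOrderId
FOR XML AUTO, ROOT('ProductionOrders')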

Results of Tests

The following hardware was used: a laptop with a 2.0 GHz processor, running Windows XP and hosting a Windows 2003 VPC image with 1.3 GB of memory allocated.
This VPC image hosted the BizTalk 2006 server and a Sql Server 2005 instance hosting the BizTalk databases.
The local Sql Server instance also hosted the database where the Bulk Insert operation occurred.

Test for processing one Message

Only one flat file message was processed at a time, and only one implementation was tested at a time:
either the pipeline (Implementation One) or the SqlBulkInsert adapter (Implementation Two).


Results for Pipeline Processing

Size of Flat File   Time for Pipeline Processing   XML File Size (Published into the MessageBox)   Number of Rows (Production Orders) in Message
201 MB              25 Minutes                     767 MB                                          807428
108 MB              13 Minutes                     413 MB                                          435076
56 MB               7 Minutes                      213 MB                                          224130
30 MB               4 Minutes                      114 MB                                          120234
15 MB               3 Minutes                      56 MB                                           62234
8 MB                2 Minutes                      31 MB                                           29123


Results for SqlBulkInsert Adapter

Size of Flat File   Time for SqlBulkInsert Adapter Load   Number of Rows (Production Orders) in Message
201 MB              3 Minutes                             807428
108 MB              1 Minute 15 Seconds                   435076
56 MB               50 Seconds                            224130
30 MB               15 Seconds                            120234
15 MB               8 Seconds                             62234
8 MB                5 Seconds                             29123


Discussion of Results

Implementation One: Processing with a Flat File Schema and Custom Receive Pipeline

This is the usual method to process a flat file in BizTalk. Besides disassembling (converting) the flat file into an XML format, receive pipelines perform many other important operations such as:
a) Decoding
b) Disassembling
c) Validation
d) Custom operations using Custom pipeline Components. One example would be Unzipping a file.
etc.

The Good

a) It works!
b) From a design point of view, this is the preferred method, as the pipeline can perform a number of operations on the original message before the final message is published into the MessageBox database.
c) Much more complicated flat files can be disassembled in a pipeline. The flat file used in the demo is simple in structure. An example of a more complicated delimited and positional flat file, with header, detail and footer records, is below:

PO1999-10-20
US        Alice Smith         123 Maple Street    Mill Valley    CA 90952
US        Robert Smith        8 Oak Avenue        Old Town       PA 95819
Hurry, my lawn is going wild!
ITEMS,ITEM872-AA|Lawnmower|1|148.95


The Bad

For large flat file messages:
a) The operation is somewhat slow (see the times above).
b) The CPU is pinned the entire time the flat file is being processed in the pipeline.
c) The original flat file message, when converted to XML, more than triples in size; for example, the 201 MB flat file transforms to 767 MB of XML. Depending on the operation being performed, a 767 MB message may be unmanageable (for example, as the source message in a map).


Implementation Two : Processing with the SqlBulkInsert Adapter

This implementation was discussed in detail above.

The Good

a) Much faster at pre-processing the large messages (see the times above).
b) It will not pin the CPU on the BizTalk machine (pipeline processing is skipped).
c) Large XML messages are not published into the MessageBox database.
d) There really is not much to it -> a custom adapter and a stored procedure.


The Bad

a) A more complex flat file (see the complex flat file message above) cannot be processed with this implementation. This is a limitation of the Bulk Insert statement. Other techniques could be investigated to load structurally more complex flat files into Sql Server, including:
i) DTS (Data Transformation Services) -> Sql 2000 and Sql 2005
ii) Sql Server Integration Services -> Sql 2005
b) For the initial message, receive pipeline processing is skipped. If the incoming flat file were encoded or zipped up, it could first be processed by another receive location/pipeline to decode or unzip it, and then routed to a folder where a receive location is configured to use the SqlBulkInsert adapter.
c) The process (orchestration) that subscribes to the small XML message published into the MessageBox has to be configured to connect to the database where the main message is stored in a relational Sql table. This might mean just configuring Solicit-Response send ports using the Sql adapter.
d) If using a Bulk Insert statement to load in the flat file, Sql 2005 might have to be used (see the comments in the stored procedure).


Final Thoughts

I have only spent about two hours on the adapter, and it is not ready for production (it is a prototype). I have also tested it by processing two large messages simultaneously. Therefore, if you are interested in using it, download the code and modify it, or just start from scratch. The code in the adapter is relatively straightforward.

The adapter was originally created with the BizTalk Server 2004 Adapter Wizard and uses the common adapter base classes and interfaces.

This technique (Adapter) could also be used to Bulk load an XML message into Sql Server.
Download the Code HERE. Read the ReadMe.Txt before installing and running.


The end of Begin/End

While on my way back from the MVP Summit, I was playing around with some of the things I had discovered about VS 2005. I mean, if I was going to drip geekiness for the entire conference, why not extend it to the absolute limit? And what I found was something mildly disturbing.

A pet peeve of mine for a while has been how easy it is to make an asynchronous Web service call. Whether the caller knows it or not, that call actually makes your application multi-threaded. For the simplest case, that's fine. But when you're making the call from a Windows Forms application and need to update the user interface with the results of the Web method call, you are getting into an area of complexity that's deeper than you might realize. Ask any decent developer what the hardest type of bug to crack is, and invariably you'll get 'race conditions', 'deadlocks' or just the generic 'multiple threads' back as the answer. And you have entered that realm unknowingly.

For VS 2005, Microsoft took my complaint and said 'hey... if updating the UI is soooooo hard, we should make it easier'. Not what I had in mind, trust me; ask me about a /nomultithread default compiler switch and you'll be closer. So now, instead of calling BeginHelloWorld and EndHelloWorld, it is possible to call HelloWorldAsync and create an event handler for HelloWorldCompleted. Yes, it's a familiar model. And, as it turns out, HelloWorldCompleted actually fires on the main UI thread, so that ugly MethodInvoker stuff no longer needs to be understood.
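
To make the new model concrete, here is a minimal sketch (the proxy class and control names are my own, assuming a proxy generated for a HelloWorld web method):

  HelloWorldService ws = new HelloWorldService();

  // The Completed event is raised on the UI thread, so the handler can
  // touch controls directly, with no Control.Invoke/MethodInvoker gymnastics.
  ws.HelloWorldCompleted += delegate(object sender, HelloWorldCompletedEventArgs e)
  {
      resultLabel.Text = e.Result;
  };

  ws.HelloWorldAsync();  // returns immediately; the call completes in the background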

But along with changing this asynchronous model, Microsoft also eliminated the Begin/End versions of the methods, which means that if you need to do something like a fork/join pattern, you can't. Just so that we're clear, a fork/join would previously have consisted of:

  IAsyncResult ar = ws.BeginHelloWorld(null, null);

  // Do some stuff in parallel with the web service call

  string result = ws.EndHelloWorld(ar);

Because of how EndHelloWorld works, the main thread will block until the background thread has completed; if the background thread finishes first, processing will continue normally. This is a very common approach if your web page were to fire off an asynchronous call to a Web service and wanted to include the results on the page being returned to the client.

As I was saying, since the Begin/End methods are no longer included in the proxy class, this pattern (at least as I have outlined it) is no longer possible, and replicating it using the event handler model is non-trivial. So my solution was to take the class generated by WSDL.exe (or by adding a Web reference) and put back in the Begin/End methods that I need. The basic structure is:

  public IAsyncResult BeginHelloWorld(AsyncCallback callback, object asyncState)
  {
   return this.BeginInvoke("HelloWorld", new object[0], callback, asyncState);
  }

  public string EndHelloWorld(IAsyncResult ar)
  {
   object[] results = this.EndInvoke(ar);
   return ((string)(results[0]));
  }

The second parameter on the BeginInvoke call is an array of objects representing the parameters to HelloWorld, so if you're doing this for a method that takes parameters, you'll need to change the dimension of the object array and initialize it appropriately, as in the hypothetical sketch below.
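
For example, a hypothetical web method string Greet(string name, int repeat) would get a pair like this (the names are mine, not from a generated proxy):

  public IAsyncResult BeginGreet(string name, int repeat, AsyncCallback callback, object asyncState)
  {
   // The object array must list the web method's parameters in declaration order.
   return this.BeginInvoke("Greet", new object[] { name, repeat }, callback, asyncState);
  }

  public string EndGreet(IAsyncResult ar)
  {
   object[] results = this.EndInvoke(ar);
   return ((string)(results[0]));
  }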

 

Microsoft LINQ Resources for September 20th

Community Thoughts

  • Werner Moise has been "reflecting" on LINQ, and the more time he spends with it, the more he's realizing "how well thought out and practical it is". He has some excellent points.
  • Dinesh wants to know how you feel about attribute-based O/R mapping vs. external XML files. He also talks about the connected vs. disconnected nature of DLinq. Also, here is the code from his DLinq demo at PDC.
  • Jomo Fisher shows how to create a custom aggregate function in LINQ using extension methods (a minimal sketch follows this list). Sweet, but it doesn't work for DLinq, so maybe then you want to look at User Defined Types and Aggregates in Yukon/SQL Server 2005. Oh man, this is going to take some architectural distillation.
  • Jon Galloway is scared by code maintainability problems introduced by DLinq. Jon and I share a common PowerBuilder background, where you could put SQL right inside your PowerScript (not unlike Progress either). This is not really a DLinq problem, but a code separation, high cohesion-loose coupling problem. The answer back in the PowerBuilder day was to put all your data access in DataWindows. The .NET answer today would be componentized DataAdapters or DataReaders, but we still see the bad practice of people intermingling data access code in other areas of their application. But perhaps DLinq does make this a slippery slope. Maybe we should all just stick to TableAdapters - yikes!
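
For anyone who has not seen the extension method trick Jomo describes, here is a minimal sketch of a custom aggregate (the names are mine, not his):

  using System.Collections.Generic;

  public static class SequenceExtensions
  {
      // A custom aggregate: multiplies all the elements of a sequence together.
      public static double Product(this IEnumerable<double> source)
      {
          double result = 1.0;
          foreach (double value in source)
              result *= value;
          return result;
      }
  }

  // Usage: new double[] { 1.5, 2.0, 4.0 }.Product() returns 12.0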

Resources

Videos

Audio

Newsgroups:

Articles

What am I missing? Drop me a line on this blog. I'm heading out next week to a VB Software Design Review and the MVP Summit in Redmond, just because there isn't enough new technology in my life these days.