Sunday, January 27, 2008

Processing Inter-dependent files using a Non-Uniform Sequential Convoy

The Problem

A recurring topic that I have recently come across on some of the BizTalk forums (here and here) is the ability to process interdependent files. More specifically, we want the ability to process a particular file only after receiving a "trigger" or "signal" file.

The use of trigger and signal files is very common in SAP and mainframe systems. A common pattern in the SAP world is to write "work" files to a "work" folder and a "signal" file to a "sig" folder. For SAP outbound files, the SAP system will write data files to the work folder. Once a file has been completely written, a signal file will be written to the sig folder. This sig file gives the middleware, or downstream system, an indication that the work file is safe for processing. This is important because some files written by these types of systems are very large, or may be written to over a period of time via batch jobs.

BizTalk supports messaging patterns that provide the ability to wait for messages to arrive before completing a business process (or orchestration). These mechanisms rely on correlation and convoys. Stephen W Thomas has written a whitepaper that dives into these topics further.

The messaging pattern that I have chosen to help solve this challenge is the Non-Uniform Sequential Convoy. This pattern's mandate is to process two or more messages, of different types, in a known order.

The challenge with this pattern, for our scenario, is that the work file is written before the sig file, yet we want BizTalk to process the work file only once the sig file has been written.

In order to solve this problem we are going to leverage a .Net Helper class to do some of the lifting.

Solution
I have written a proof of concept (POC) app to demonstrate the pattern. The solution is pretty light and easy to implement. It contains two message schemas (one for the signal file, one for the work file), a property schema, an orchestration and a .Net Helper class. Note that I have left what you do with both files, once you get them into the same instance of the orchestration, out of scope. At that point your requirements will determine what you need to do with both files.

The first artifact that I will dive into is the Signal file. The file itself does not need to contain a lot of data. For my sample I have two elements, a timestamp and a WorkFileName. In order to use my convoy, I need to create a correlation type and a correlation set. A correlation type requires a promoted property (or BizTalk system property) in order to "link" multiple messages to one running instance of an orchestration. For more information on Correlation, please see the following document.



The second artifact represents data that could be generated by an upstream system such as SAP or a mainframe. This document is entirely fictitious, so don't read too much into it. Also note that I have a promoted property called FileName that is used in my Correlation Type.

So for this example I am using FileName but you can correlate based upon any data, as long as your signal file and work file promote the same data values.




Below is a snapshot of what our orchestration looks like.

A few things to note: Non-Uniform Sequential convoys require that both Receive shapes connect to the same receive port. This logical receive port also needs to be marked for Ordered Delivery.




For the initial receive shape we need to set a few properties. Since this is the first Receive shape we need to set the Activate property to True in order to instantiate the orchestration.

The other property that we need to populate is the Initializing Correlation Sets. In order for BizTalk to "wait" for the work file to be picked up by the same instance of the orchestration that consumed the signal file we need to initialize a correlation set. (Keep reading for more info on how to create the Correlation Set).


Prior to creating a Correlation Set, you need to create a Correlation Type. You can do this from the Orchestration View tab.

We want to create the Correlation Type based upon the element/attribute that we promoted in each of the two schemas.
We then need to create a Correlation Set, which is initialized by the initial Receive shape, that is based upon the Correlation Type that was just created. Correlation Sets are also configured in the Orchestration View tab.



In the Rename Work File Expression shape we are going to call a .Net Helper class that renames the work file in the source folder. This is a critical step in the process. Since the work file is completely written before the sig file is, we cannot use the original file extension in the Receive Location file mask. Otherwise, BizTalk would consume the work file before we want it to be consumed. By appending a temporary extension, like .BIZ, to the end of the work file name, we can be assured that BizTalk will pick the file up only when we want it to. So in the Receive Location we will use a *.BIZ file mask instead of *.XML.

Note that I have hard coded the path of the source location. Since this is a POC, this is OK, but it would not be a suitable solution for a production environment.
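To make the rename step concrete, here is a minimal sketch of the logic such a helper might perform. It is written in Python purely for brevity. The real helper in the POC is a .Net class, and the function name, parameters and error handling below are my assumptions rather than the original code. The source folder is passed in as a parameter instead of being hard coded.

```python
import os


def rename_work_file(source_dir: str, work_file_name: str,
                     temp_extension: str = ".BIZ") -> str:
    """Append temp_extension to the work file so that the Receive Location's
    file mask (for example *.BIZ) starts matching it.

    All names here are hypothetical; this only illustrates the rename step.
    """
    original_path = os.path.join(source_dir, work_file_name)
    renamed_path = original_path + temp_extension

    # Fail loudly if the signal file refers to a work file that is not there.
    if not os.path.isfile(original_path):
        raise FileNotFoundError(f"Work file not found: {original_path}")

    # Refuse to overwrite a file that is still waiting to be picked up.
    if os.path.exists(renamed_path):
        raise FileExistsError(f"Renamed file already exists: {renamed_path}")

    os.rename(original_path, renamed_path)
    return renamed_path
```

In a production version the source folder would more likely come from configuration, and the orchestration would need to decide what to do when the work file is missing.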


In order for BizTalk to "wait" for the work file to be picked up, we need to set the Following Correlation Sets property with the same Correlation Set that we initialized in the first Receive shape. Since the rename operation occurs the step before, the "wait" time will be extremely small. The main point here is that we want to control when BizTalk picks up the work file. It is essentially the rename operation and the second receive shape/location that controls this.
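The way the two Receive shapes cooperate can be pictured with a toy model. The Python sketch below is only an illustration of the idea (an activating receive that initializes a correlation on FileName, followed by a receive that follows it). It is not how the BizTalk engine is implemented, and every class and method name in it is made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    kind: str       # "signal" or "work"
    file_name: str  # stands in for the promoted FileName property


@dataclass
class ConvoyInstance:
    file_name: str
    received: list = field(default_factory=list)


class ConvoyRouter:
    """Toy stand-in for subscription matching on a correlation value."""

    def __init__(self):
        self.instances = {}   # running instances keyed by FileName
        self.completed = []   # instances that have received both messages

    def deliver(self, message: Message) -> str:
        instance = self.instances.get(message.file_name)
        if message.kind == "signal" and instance is None:
            # Activating receive: start an instance and initialize the
            # correlation with this message's FileName.
            instance = ConvoyInstance(message.file_name)
            instance.received.append("signal")
            self.instances[message.file_name] = instance
            return "activated"
        if message.kind == "work" and instance is not None:
            # Following receive: same FileName, so the message is routed
            # to the instance that is already waiting.
            instance.received.append("work")
            self.completed.append(self.instances.pop(message.file_name))
            return "completed"
        # Simplification: anything else has no matching subscriber here.
        return "unrouted"
```

Delivering a signal message for "a.xml" and then a work message for "a.xml" completes one instance, while a work message with a different FileName is not routed to it.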
Now that you have consumed both the signal file and work file, in order, you can finish up any processing that is required by your business requirements. Since this is just a POC, I output the work file to a folder.


Testing
In order to simulate how an upstream system would write the files, I drop a file into the work folder with the original extension. I then drop a sig file into the sig folder. The sig file will get picked up and BizTalk, via the .Net Helper, will rename the work file. The receive location, for work files, will then pick up the work file and the orchestration will finish processing both files.
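If you would rather script this test than drop files by hand, something along these lines could work. This is a rough Python sketch under several assumptions: the root element name Signal, the element name Timestamp, the .sig extension and the function name are all hypothetical, and a real signal file would also need to match the target namespace of your signal schema.

```python
import os
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def drop_test_files(work_dir: str, sig_dir: str,
                    work_file_name: str, work_xml: str):
    """Write the work file first, then the signal file that refers to it."""
    os.makedirs(work_dir, exist_ok=True)
    os.makedirs(sig_dir, exist_ok=True)

    # 1. The work file keeps its original extension, so the *.BIZ
    #    Receive Location ignores it for now.
    work_path = os.path.join(work_dir, work_file_name)
    with open(work_path, "w", encoding="utf-8") as handle:
        handle.write(work_xml)

    # 2. The signal file carries a timestamp and the work file name.
    root = ET.Element("Signal")  # hypothetical root element name
    ET.SubElement(root, "Timestamp").text = datetime.now(timezone.utc).isoformat()
    ET.SubElement(root, "WorkFileName").text = work_file_name

    sig_name = os.path.splitext(work_file_name)[0] + ".sig"  # assumed extension
    sig_path = os.path.join(sig_dir, sig_name)
    ET.ElementTree(root).write(sig_path, encoding="utf-8", xml_declaration=True)

    return work_path, sig_path
```

Writing the work file before the signal file mirrors the order in which the upstream system is described as producing them.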


Conclusion
Through the use of a Messaging Pattern and with the help of a .Net Helper class we can process a set of files in a known order.

Saturday, January 12, 2008

TopXML Oracle Adapter Dynamic "like" Send Port

A colleague and I recently ran into a situation where we needed to determine the Oracle connection string at runtime. This may sound like an odd requirement, but the 3rd party application that populates this database has been configured for redundancy in an atypical fashion. It has not been set up like a SQL Cluster...that would be too easy :-). So instead of letting an infrastructure component (like Windows Clustering) determine which instance is the "Primary", their software performs this function.

I have set up a POC that will walk you through this configuration.

The first thing that you are going to want to do is add a reference to your project. The DLL that you want to include is "BTSUtils.Adapters.Databases.Properties.dll". You can find this file in your \BizTalk Utilities\Database Adapter\Bin folder.

This will allow you to update the Connectionstring context property.

Below is an image of my orchestration. The orchestration doesn't do anything all that spectacular. The majority of the orchestration is for demonstration purposes.




Basically, a file is passed in that has data which will be used as part of the Oracle Request. So a simple map will take the value from the incoming document and map it to the Oracle Request.

Once we have the Oracle Request document, we need to update the context property as part of the Message Assignment shape. I have updated the BTSUtils.Adapters.Databases.Properties.Transmit.Connectionstring property and assigned a new value.




Obviously, hard coding the connection string in your orchestration is not a good idea, but remember this is just POC code. In our requirements, we need to be able to connect to one of two data sources, and have to determine which one at run time. Some options for storing these connection strings are the BTSNTSvc.exe.config (clear text warning!) or the SSODB (secure), extracting the values via the SSO API.
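The selection itself can be as simple as asking each candidate whether it is currently the primary. The Python sketch below only illustrates that idea. The function name, the is_primary check and the placeholder connection strings are all assumptions of mine; how you actually find out which instance is primary depends on the 3rd party software, and in BizTalk this logic would more likely live in a .Net helper called from the orchestration.

```python
from typing import Callable, Mapping


def select_connection_string(candidates: Mapping[str, str],
                             is_primary: Callable[[str], bool]) -> str:
    """Return the connection string of the first data source that the
    (hypothetical) is_primary check reports as the current primary."""
    for name, connection_string in candidates.items():
        if is_primary(name):
            return connection_string
    raise LookupError("No candidate data source reported itself as primary")


# Placeholder values only; real connection strings would be read from
# configuration or SSO rather than written into code.
CANDIDATES = {
    "nodeA": "Data Source=ORA_A",
    "nodeB": "Data Source=ORA_B",
}
```

The returned value is what would then be assigned to the Connectionstring context property in the Message Assignment shape.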

Once this property is set, I submit the Oracle Request message and wait for a Response. The Expression shape in my orchestration is simply a trace statement. I am interested in knowing how many nodes are returned as a result of my query.

After this log point, I simply drop the results of the query to a folder.

So if I modify the connection string inside of the Orchestration, then what does the Send Port configuration look like?

Interestingly enough, you do not have to populate the "Connection String" property with anything meaningful whatsoever. You do need to provide it with a value in order to successfully complete the configuration though. I have decided to populate this property with the associated macro. This way an Ops guy can look at it and figure out that the value is populated elsewhere.

Microsoft MVP Awarded: Windows Server System - BizTalk Server

I was recently informed that I have received MVP status from Microsoft. It is truly an honor to be recognized by Microsoft for this award. I look forward to the MVP experience and meeting other MVPs at Microsoft events such as the MVP Global Summit.

If you would like to see my MVP profile, you may see it here.