A Sample Program For Moving DB2 Changed Data from WebSphere... Introduction This article introduces a sample program that retrieves DB2 changed...

A Sample Program For Moving DB2 Changed Data from WebSphere MQ to Files with Q Replication
Introduction
This article introduces a sample program that retrieves DB2 changed data in a delimited values format from
WebSphere MQ and saves it to files. The sample is based on the event publishing function found in the IBM
InfoSphere Data Replication (IIDR) Q Replication technology.
Event Publishing captures changed data from a database transaction log and makes it available for applications
to consume. The consuming applications may range from in-house programs to off-the-shelf ETL tools, such as
InfoSphere DataStage, that perform complex transformations on the data. This process is illustrated in the
diagram below:
Q Replication offers event publishing to WebSphere MQ and to tables called CCDs. Some applications are not
able to consume from these directly They need another interface to changed data.
Files are often a good option. You can move changed data from WebSphere MQ to files with a simple event
consumer. The sample discussed in this paper is an event consumer. It shows how more applications can
consume changed data from Q Replication.
Even though the article shows you how to invoke the sample program for DB2 changed data, its contents also
apply to changed data from Oracle.
The Solution at a Glance
The three stages of the solution are shown in the diagram below:
In the first stage, you identified the tables for which changed data is to be captured. The event publishing
function puts committed transactions into a queue. WebSphere MQ sends the data to the system where you
need it.
In the second stage, the sample program, called asnmsg2file, reads the queue and writes data to the local file
system. One file is created for each table. The program reads data from the queue for a user-defined amount
of time. For example, 5-10 minutes. At the end of that time, the program shuts down so that the files can be
processed.
In the final stage, the consuming application reads the files. The data is in a delimited values format. The
application or a script deletes the files once the application is finished with them. The asnmsg2file program can
now be run again.
System 1 and system 2 in the picture above are logical servers that can reside on the same physical server. This
is the case for the exercise in Part 3.
How This Article is Organized
The article starts with a description of the sample program prerequisites then moves on to file format and
processing, restrictions, program logic and input parameters.
The second part of the article includes instructions to install and configure the program to run on your system.
The third part of the article ends with the tasks and commands necessary to run a simple end-to-end scenario.
Prerequisites
The sample program runs on Windows, UNIX (including Unix System Services USS) or Linux. Before you run it, a
Q Replication environment must be up and running for a DB2 or InfoSphere Warehouse. The Q Replication
environment must be defined as follows:
•
Provide a WebSphere MQ local queue on the system where you plan to run the asnmsg2file program. This
queue will hold the incoming data. You also need the supporting MQ definitions to connect the source
system to the target system. If you are not familiar with WebSphere MQ, you can set up a typical MQ
configuration for event publishing using the instructions found in A Fast Way to Get Started with
WebSphere MQ for DB2.
•
License Q Replication for your DB2.
◦
On Windows, UNIX or Linux, the Q Capture program in DB2 ESE and DB2 Workgroup will not start
unless it has been licensed through one of the following products:
◦
◦
◦
◦
The IBM Homogeneous Replication Feature for DB2
InfoSphere Data Replication
InfoSphere Replication Server
On z/OS, the Q Capture program is licensed though one of the following products:
◦ InfoSphere Data Replication
◦ InfoSphere Replication Server
Processing Files Created by the Sample Program
For each run, the program creates a directory whose name is a timestamp. The directory is created as a
subdirectory of a main directory provided through an input parameter. Each run's data files go into this
subdirectory. One data file is created for each table with changed data.
Consuming applications look in the main directory for the presence of subdirectories to determine whether the
asnmsg2file program ran, and how many times, since the last time the consuming application was run. If new
subdirectories are found, the consuming application then looks for data files in each subdirectory.
The consuming application should delete subdirectories after successfully processing their data files.
The output format of each data file follows closely the Data Event Publisher delimited message format.
You can not change the output format of the data file but you can provide your own output delimiter (using an
optional parameter). For example, you won't be able to change date or timestamp format.
The picture below idisplays the contents of the EventPublisherFiles directory, the directory provided to the
asnmsg2file program for the scenario described in part 2. As you can see:
◦
The 1329960391105 directory was created when asnmsg2file first started and holds data for that run
◦
Three files were created corresponding to three tables found with changed data in the MQ queue
when the program was running:
Restrictions and Error Messages
The asnmsg2file program does not currently perform comprehensive error checking so review the configuration
attributes and prerequisites to avoid unnecessary exceptions. If any unexpected error happens after the
message has been read from the queue, the data is available in the data files so data read from the MQ queue
is not lost.
There are two restrictions
1) The program currently only runs on a system where a WebSphere MQ queue manager is running. It is a JMS
application that uses JMS binding mode connections (not client mode). What this means is that it won't be able
to access a remote MQ queue manager.
2) The program currently only supports the comma ',' as the decimal delimiter.
The asnmsg2file Program Logic
The sample program runs as a batch job. You decide how often and how long the program runs. A cron job
may be helpful to schedule these runs automatically.
Once started, the sample program does the following:
•
Verifies input parameters
Three required input are the MQ queue manager name, the MQ publishing queue name and the output
directory where all files are created. One of the optional parameter is the duration of the run with a
default of five minutes
•
Connects to the queue manager
•
Creates a directory with the current timestamp. This is created as a subdirectory of the one provided as an
input parameter.
•
Listens to the queue and process messages for a user-specified amount of time. By default, this is five
minutes, a good choice for many situations.
•
For each message found in the queue
◦
Parses the message and finds the table name(s) associated with the changed data. The MQ message
may include changed data for multiple tables.
◦
Processes the message to format the output.
◦ Writes the data to files in the directory with the current timestamp. The program uses the file
naming convention tablename.dat (the file name does not include the table's schema) and creates the
file if it doesn't already exist.
Note: if a large transaction spans across multiple MQ messages, the data files are first created in a
“stage” directory. After the transaction is processed in its entirety, the sample moves the contents of
the stage directory to the timestamp directory.
•
When the run time has been completed, disconnects from the queue manager and stop.
The asnmsg2file Parameters:
The three required parameters are:
-qmgr: WebSphere MQ queue manager name that manages the local queue to be read by the sample
program. The queue manager must be running before invoking asnmsg2file otherwise you'll receive a
JMS exception (queue manager not started)
-queue : Name of the local queue. If you want to process messages from a different queue, you will
need to start another invocation of the program
-out: Directory where timestamped subdirectories (and their data files) are created. If it doesn't exist
prior to invoking asnmsg2file, it will be created. That value should remain the same across scheduled
invocations of asnmsg2file if the stage directory is populated with data files across the invocations.
You will need to invoke the program with additional (optional) input parameters if you:
•
•
•
want to run the program for a different amount of time than the provided defaults (5mn)
made some updates to the default Q Replication publication queue map properties
want to change the output file structure (changed data record delimiter or column delimiter).
The six additional optional parameters are:
-time: Length of time the program should run, in milliseconds. Default is slightly less than 5 minutes
(28500 ms)
-outRecord: Output changed data record separator. Default is 0A (ASCII new line)
-outCol: Output column delimiter. Default is 1E (ASCII record separator)
-inCol: Input column delimiter. Default is 2C (ASCII comma), same as Data Event Publisher Hex ASCII
value column delimiter default
-inString: Input character string delimiter. Default is 22 (ASCII double quotation mark), same as Data
Event Publisher Hex ASCII value char string delimiter default
-inRecord: Input changed data record separator. Default is 0A (ASCII new line), same as Data Event
Publisher Hex ASCII value changed data delimiter default
Part 2: Installing and Configuring asnmsg2file to Run on Your System
To run asnmsg2file in your environment, complete the following tasks:
•
•
Launch a new DB2 Command Window and create a new directory where the program will run.
◦
If you are running Windows 7, launch the new command window as Administrator to prevent any
authorization errors running Websphere MQ.
◦
For the exercise of Part 3, I used 'mymsg2f'.
Switch to the 'mymsg2f' directory and unzip the asnmsg2file.zip file (in the Samples section of the Q+SQL
Replication Forum on developerWorks) by running the command: unzip asnmsg2file.zip.
You should see six files in your directory. Let's call this current directory 'mymsg2f':
•
◦
EPcreatedb.bat: DB2 script to create a sample database
◦
EPmq.asnclp: ASNCLP script to create WebSphere MQ objects
◦
EPpub.asnclp: ASNCLP script to create event publishing objects
◦
asnmsg2file.jar: sample event consumer program to save messages to files
◦
asnmsg2file.java: sample source code
◦
README: a subset of this article describing the asnmsg2file event consumer program
Switch to the 'mymsg2f' directory and run the following command to update your CLASSPATH variable and
avoid a 'NoClassDefFoundError' exception:
•
◦
Windows systems: set CLASSPATH=%CLASSPATH%;C:\mymsg2f\asnmsg2file.jar
◦
UNIX/Linux systems: exportCLASSPATH=$CLASSPATH:/u/.. /mymsg2f/asnmsg2file.jar
Verify that the WebSphere MQ JMS jar files already exist in your CLASSPATH (part of the WebSphere MQ
JMS install):
◦
Windows systems: ..\Websphere MQ\Java\lib\com.ibm.mqjms.jar and ..\Websphere
MQ\Java\lib\jms.jar
◦
UNIX/Linux systems: ../lpp/mqm/java/lib/com.ibm.mqjms.jar and ../lpp/mqm/java/lib/jms.jar
If they don't, you need to include them in your CLASSPATH using the method listed above.
•
The asnmsg2file.jar file is compatible with JDK 6 VM. If you run in a pre-JDK 6 VM environment, compile
the asnmsg2file.java source to create a new jar file by running the two commands below:
◦
javac -verbose asnmsg2file.java
◦
jar cvf ansmsg2file.jar asnmsg2file.class asnmsg2file$MQListener.class asnmsg2file$StopListen.class
You're now ready to run your scenario.
Part 3: Running a Simple Scenario
For the simple scenario included below, I used a Windows 7 system with a copy of WebSphere MQ V7 and a
copy of the DB2 product at version 9.7 FP4.
This section takes about 20 minutes to complete. It shows you how to configure three publications and run the
sample program to create three data files with the changed data. The diagram summarizes the configuration:
The scenario includes four steps:
1. Set up directories
2. Create a DB2 sample database PUBSRC and publication definitions for three of its tables
3. Start Q Capture (asnqcap) on your system
4. Update the DB2 source system and run asnmsg2file to review the output files
1.
Set Up Directories
Before running the scenario, complete the two tasks below:
2.
◦
Create a new directory for the output directory where all sub-directories and data files will be created.
I used 'DEPFiles' which I created in the 'C:\mymsg2f\' directory.
◦
Launch another Command Window, switch to 'mymsg2f' and create another new directory: 'myQCap'
Create the sample database and publication definitions
Run the four commands below all from 'mymsg2f' directory:
◦ createdb.bat
The command creates a sample database called PUBSRC
◦ asnclp -f EPmq.asnclp
The command creates a shell and a batch file, for UNIX/Linux and Windows systems respectively
◦ qrepl.pubsrc.mq_windows.bat (windows system) or qrep.pubsrc.mq_aixlinux.sh (UNIX/Linux system)
The command creates a new queue manager PUBSRC, starts it and creates all necessary queue objects
for Data Event Publishing. The publishing queue is called ASN.PUBSRC.DATA.
◦ asnclp -f EPpub.asnclp
The command creates the publications and their related objects for three source tables: EMPLOYEE,
DEPARTMENT and PROJECT.
3.
Start Q Replication's Q Capture program.
Run the command below from 'myQCap' directory:
◦
asnqcap capture_server=PUBSRC
You should see initialization messages for the EMPLOYEE, DEPARTMENT and PROJECT tables.
4.
Update the DB2 Source System and Run asnmsg2file
Run the five DB2 statements below all from 'mymsg2f' directory:
◦
db2 "update EMPLOYEE set job = 'MANAGER' where lastname = 'QUINTANA' ";
db2 "insert into DEPARTMENT values ('A01', 'TEST', null,'A00',null)”;
db2 "insert into DEPARTMENT values ('A02', 'TEST2', null,'A00',null)”;
db2 "insert into DEPARTMENT values ('A03', 'TEST3', null,'A00',null)”;
db2 "update PROJECT set prendate = CURRENT DATE where projname = 'OPERATION SUPPORT' ";
You should now see five new messages in the ASN.PUBSRC.DATA queue.
Finally, run the Java program from 'mymsg2f' directory to process the messages and save them to files:
◦
◦
java asnmsg2file -qmgr PUBSRC -queue ASN.PUBSRC.DATA -time 60000 -out DEPFiles
▪
I'm using a one-minute run here so you don't have to wait long to see the results
▪
You'll want to add optional parameters as discussed earlier and/or vary the db2
insert/update/delete statements to fully try this new program out.
You should have three files in a directory created under the DEPFiles output directory, as follows:
Stay Tuned
Expect an upcoming article and blog post with a real-world use case for this sample.