Reprinted with Permission by Quest Software Dec. 2005


DDF – Mainframe DB2’s Window to the World

Robert Catterall

When I first started working with DB2 for mainframe servers, Version 1.2 was the generally available product. At that time (the late 1980s), DB2 did a great job of storing and managing data, but couldn’t serve it up without help from a program running in an “extra-DB2” (i.e., outside of DB2) address space, such as a CICS or IMS transaction, or a batch job. Things changed a few years later with the introduction, in DB2 Version 2.2, of the Distributed Data Facility, or DDF. In this article, I’ll share with you some DDF information and observations, with my own knowledge of the subject having been augmented through a recent conversation with Curt Cotner, IBM Fellow and the leader of the team that took the DB2 DDF from concept to reality.

Original Aim and Subsequent Development

When the Distributed Data Facility made its debut, the development team was thinking primarily about data federation – about providing a means whereby organizations using mainframe DB2 (keep in mind that DB2 for Unix and PC platforms did not arrive until 1993, and initially ran only with OS/2 and AIX) could “virtualize” the actual location of tables in different databases on different servers. The initial DB2-to-DB2 communication mechanism provided through DDF was called private protocol. It allowed a program local to the DB2 subsystem on mainframe A to access a DB2 table on mainframe B by way of a three-part name (location.owner.table-name). A DBA could define an alias on the local DB2 subsystem and associate it with the fully-qualified three-part name of the DB2 table on mainframe B, thereby making the remote table appear to local programs as a table in the local DB2 database; thus, the DDF made it very easy to tie together objects spread across multiple (and perhaps geographically dispersed) mainframe DB2 subsystems.

A number of organizations jumped on the new capability. I recall seeing a presentation at a SHARE user group conference in the early 1990s, in which a DBA from a large telecommunications company explained his firm’s use of DDF to move data between multiple corporate data centers.

With the very next release of DB2 for mainframe servers (Version 2.3), IBM delivered a new protocol for distributed DB2 communications, called Distributed Relational Database Architecture, or DRDA. While DRDA was initially not quite as easy to use as private protocol (one had to explicitly connect to a remote DB2 server, as opposed to using a three-part table name or an alias defined thereon), it provided two important advantages. First, DRDA can be used to access DB2 data on any platform (private protocol only works between mainframe DB2 servers). Second, with DRDA, static SQL statements are executed as such at the DB2 server location (when private protocol is used, even static SQL statements embedded in a requester-side application program are dynamically executed at the remote DB2 server, resulting in increased CPU consumption on that end).

Of course, to realize the performance benefit of static SQL, you have to have a package at the data-serving DB2 location. Because the static SQL statements are embedded in the requester-side program (unless stored procedure technology is utilized), you need a package there, too. How do you address that need? Easily, by binding the application program twice: once to create a package on the requester side (where the program will actually be run) and once to generate a package on the server side (where the SQL statements will be executed). Both of these BIND commands would be issued on the requester side, once with no location name specified for the package (the default is the local DBMS), and once with a fully-qualified package name beginning with the location name of the remote DB2 server.

Version 3.1 of DB2 for mainframes offered an enhanced DRDA which provided two-phase commit capability for distributed transactions; thus, an application program utilizing DRDA could change DB2 data (via INSERT, UPDATE, or DELETE) at multiple locations in one unit of work, and all of the changes would be committed (or all would be rolled back if the unit of work did not complete successfully).

The Missing Piece – Delivered

To the improved CPU efficiency offered by DRDA versus private protocol, DB2 for OS/390 Version 4 added another benefit: the ability to call a stored procedure at a remote DB2 server (private protocol can be used only for data manipulation SQL statements such as SELECT and UPDATE).

Still, DRDA had that one nagging shortcoming, relative to private protocol, which complicated certain application development efforts: using DRDA, one could not mask the actual location of a DB2 database object. Location abstraction was not possible because DRDA required that an explicit CONNECT TO location-name statement be issued before objects at the remote DB2 server could be accessed by the application program; thus, if a DB2 object were moved from one location to another, a code change was necessitated (though the use of a host variable in the CONNECT TO statement could somewhat reduce the hassle factor of such a code change). The same problem came up when a program was moved from, say, a development environment to a production environment.
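
To make the shortcoming concrete, here is a minimal sketch of what explicit-connect DRDA access looks like from the requester's side. The location name LONDON and the table REGION2.DAILY_SALES are illustrative names only, and the query is a placeholder; the point is simply that the location appears in the program's own SQL.

```sql
-- Illustrative sketch; LONDON and REGION2.DAILY_SALES are assumed names.
-- The program must name the remote location before touching its objects.
CONNECT TO LONDON;

-- Once connected, the table is referenced by its two-part name
-- and the statement executes at the LONDON server.
SELECT COUNT(*)
  FROM REGION2.DAILY_SALES;

-- In an embedded-SQL program the location can instead be supplied in a
-- host variable (CONNECT TO :host-variable), which moves the location
-- name out of the SQL text but still leaves it in the program's hands.
```

If the table moves to another location, the CONNECT TO target (or the value placed in the host variable) has to change with it.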

Happily, this irritant was removed when DB2 for OS/390 Version 6 delivered support for DRDA access to remote DB2 objects by way of three-part names. When combined with locally-defined DB2 aliases, this is a powerful capability. Suppose, for example, that a table called DAILY_SALES, with schema name REGION2, is created in a DB2 subsystem with a location name of LONDON. At a different DB2 location (SYDNEY, let us say), a DBA can create an alias that resolves to the fully-qualified table name LONDON.REGION2.DAILY_SALES, thereby making the remote DB2 object appear as local to programs running on the SYDNEY system. DRDA can be used to access data in the remote table, and there is absolutely no need for a developer to concern himself or herself with the actual location of the physical table.
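
The LONDON and SYDNEY example might be sketched as follows. The location, schema, and table names come from the paragraph above; the alias name (here simply reusing REGION2.DAILY_SALES at the SYDNEY location) and the query are assumptions made for illustration.

```sql
-- Issued by a DBA at the SYDNEY location. The alias name is an assumed
-- choice; any unused two-part name at SYDNEY would do.
CREATE ALIAS REGION2.DAILY_SALES
  FOR LONDON.REGION2.DAILY_SALES;

-- A program running at SYDNEY refers only to the alias. No CONNECT TO
-- statement and no location name appear in the application's SQL.
SELECT COUNT(*)
  FROM REGION2.DAILY_SALES;
```

Should the table later move to a different location, only the alias needs to be dropped and re-created to point at the new three-part name; the program's SQL is unchanged.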

Given this location abstraction capability, there is no reason to use DB2 private protocol instead of DRDA. If you are using private protocol today, SWITCH TO DRDA. At some point in time, you will have to do this, because private protocol will eventually be eliminated. Why focus on an incentive of the “stick” variety, however, when you have several “carrot-type” incentives to go with DRDA: the aforementioned static SQL support and the ability to call stored procedures on remote DB2 servers, plus the ability to access large objects, or LOBs, at remote locations (more and more sites are storing LOBs in DB2 databases – CheckFree, the company for which I work, being one of them). Is it hard to switch from private protocol to DRDA? No. Generally speaking, it’s a matter of rebinding programs that currently use private protocol.

While on the subject of DRDA versus private protocol, I would like to clear up what for some people is a point of confusion. There are those who feel that they need to stay with DB2 private protocol because they need “system-directed access” to remote DB2 objects, and they believe that DRDA requires “application-directed access.” In other words, these people believe that a DRDA-using application program will require extra code to direct access to the appropriate DB2 server location. This is not true. I suspect that the confusion stems from terminology used in some DB2 monitor reports and online displays to distinguish DB2 private protocol activity from DRDA activity. The former was termed “system-directed access” while the latter was labeled “application-directed access.” If what you want is system-directed access to remote DB2 database objects, and what you mean by that is the ability to access the remote objects as though they were local, from a program coding perspective, you can get that with DRDA (see paragraph above that describes how aliases can be used for this purpose).

The Client-Server Viewpoint

I previously mentioned that the DB2 DDF was originally developed with database federation in mind. Not long after its appearance on the scene, however, the DDF came to be used primarily as a means of accessing mainframe DB2 data from non-mainframe clients (e.g., Linux, Unix, and Windows clients). For this, IBM provided the DB2 Connect product. Although DB2 Connect is available in a Personal Edition that can be installed on one’s PC, the more common configuration in an enterprise setting involves the use of a DB2 Connect Enterprise Edition gateway server. Client-side programs access the DB2 Connect gateway, and through that the mainframe DB2 server, by way of a thin piece of code called the DB2 Runtime Client (originally called the Client Application Enabler, or CAE). Interaction with a mainframe DB2 subsystem is made easy for Java and .NET application programmers through the JDBC (Java Database Connectivity) and ADO.NET (ADO is short for ActiveX Data Objects) drivers that come with DB2 Connect. Note that while the IBM Type 4 JDBC driver does not technically require the presence of a DB2 Connect gateway server, one has to license DB2 Connect to use this driver for mainframe DB2 access (at CheckFree, we prefer to use the Type 2 JDBC driver because it offers better performance and the advanced connection management capabilities of the DB2 Connect gateway server).

Can DB2 for z/OS be a client, accessing DB2 databases on non-mainframe servers via the DDF? Of course it can (DB2 Connect is not needed for such access). In fact, through IBM’s relatively new DB2 Information Integrator product (also known as DB2 II), a DB2 for z/OS-connected client program (such as a CICS transaction or a batch job) can use the DDF to access data in non-mainframe, non-DB2 databases as though it were stored locally. DB2 II does this by making the remote non-DB2 database appear to DB2 for z/OS as a remote DRDA-compliant data server. From there, it is easy to create aliases, as described a few paragraphs ago, that make the objects in the remote database appear as local DB2 objects to mainframe programs.

As you can clearly see, the DDF, DB2’s window on the world, is getting more open all the time.
 


Robert Catterall is a database technology strategist at Atlanta-based CheckFree Corp. You can reach him at rcatterall@checkfree.com.