DB2 V9.7 on RHEL 5 as a data source for a Linked Server in SQL Server 2012

(Aside – it’s been far too long since I last posted anything here. As I’m doing more R&D at the moment I’ll try to post a bit more).

I recently wanted to connect DB2 V9.7 running on RHEL 5 to SQL Server 2012 using SQL Server’s linked server technology. At one time I used Linked Servers to connect to a wide range of data sources, but I hadn’t used them for a while. Thinking the world would have moved on a bit, I looked around the Web to see what the current thinking was but largely drew a blank. The only postings I could find seemed to be from people who described connecting to DB2 as difficult but they didn’t elaborate.

I’d be the first to admit that the process is not straightforward, but it’s not made easier by a lot of what is posted on the Web, which is misleading at best and often just wrong. Most references detail connecting to DB2 running on iSeries or z/OS via DB2 Connect. For DB2 running on Linux or Windows you don’t use DB2 Connect; you connect directly to the DB2 instance, so the process is a little different.

I will assume the reader is familiar with Linked Servers in general. If not, have a play. I would suggest you start with something simple, like SQL Server talking to SQL Server. This combination is simple to set up and is reliable. I will also assume you know enough about DB2 to be able to catalog a node and database on the Windows server.

To connect to DB2, there are three OLE DB providers that you can use without paying for third party solutions. These are:

  • The Microsoft OLEDB Provider for DB2.
  • The IBM OLEDB Provider for DB2.
  • The Microsoft OLEDB Provider for ODBC using an ODBC connection to DB2.

From painful experience, if you want to use a third party OLEDB provider, insist on getting a free trial before paying for it. I recall some were no better, and indeed were sometimes worse, than the free versions.

The Microsoft OLEDB Provider for DB2 (DB2OLEDB).

This is installed from the SQL Server Feature Pack; it’s not part of a normal SQL Server installation. Quite simply, I cannot get this to work. The installation looks OK, but I cannot find a combination of parameters that gets a connection from SQL Server to DB2. The examples I can find all look like they are trying to talk to DB2 Connect. Possibly that’s all this Provider can do?

The IBM OLEDB Provider for DB2 (IBMDADB2).

This appears to work. It does need some effort to set it up but it is probably your best option.

Begin by installing the DB2 Runtime Client (RTC) on the server running SQL Server. The Runtime Client is the minimum DB2 client and just includes the network libraries and the basic command line interface. If you’re looking for the download, at the moment, it seems to be called “IBM Data Server Runtime Client for InfoSphere Federation Server V9.7 Multiplatform Multilingual eAssembly”. I promise I’m not making that up, I couldn’t if I’d tried.

Once the RTC is installed, catalog the source DB2 node and database. As the RTC has no graphical interface, you’ll need to use the db2 command line processor. Make sure this is working before you go any further: be sure you can connect to your database and select some data back. Both the OLE DB provider and, if you use it, the ODBC driver use your local database directory. In other words, both are just thin layers on top of the DB2 connection layer. This is good news as the DB2 connection via the RTC is reliable.
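The cataloguing step is easy to script. What follows is only a minimal sketch: a shell function that prints the CLP commands into a file, which you can then run in a DB2 command window on the SQL Server machine with db2 -tvf. The function name is my own, and the node, host, port and database values in the usage note are made-up placeholders, so substitute your own.

```shell
#!/bin/sh
# Sketch: print the DB2 CLP commands that catalog a remote node and database.
# Nothing here talks to DB2; it only writes the commands to standard output.

# make_catalog_script NODE HOST PORT REMOTE_DB LOCAL_ALIAS
make_catalog_script() {
    node=$1 host=$2 port=$3 remote_db=$4 alias=$5
    cat <<EOF
CATALOG TCPIP NODE $node REMOTE $host SERVER $port;
CATALOG DATABASE $remote_db AS $alias AT NODE $node;
TERMINATE;
EOF
}

# Example (placeholder values):
#   make_catalog_script db2lnx rhel5box 50000 SALES SALES > catalog.clp
#   db2 -tvf catalog.clp
```

After running the script, check the connection by hand with db2 connect to <alias> user <user>, which prompts for the password, followed by a simple select.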

Next, start up SQL Server Management Studio and you should see in the Linked Servers/Providers list IBMDADB2.<your SQL Server instance>.

This next bit is vital: right click the provider and open the properties list. Make sure the “Allow inprocess” option is ticked. Not only will the provider not work without it, it will actually crash the SQL Server service when you try to use it.

(Just as an aside, if you have SQL Server in a clustered environment, this would be a really good way to test fail over. Finding relatively harmless ways of crashing a database instance to test fail over isn’t always easy.)

Now you create your Linked Server. This is very simple because all the connectivity details are encapsulated in the DB2 database catalog entry you created earlier. All you need to specify for the linked server is:

  • The Provider (from the drop down list).
  • Your database name as the Data Source.
  • Some user credentials in the Linked Server Security tab.
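If you’d rather script the linked server than click through Management Studio, the same three settings map onto the sp_addlinkedserver and sp_addlinkedsrvlogin system procedures. The sketch below is a shell function that just prints the T-SQL for you to review and run yourself (for example with sqlcmd). The function name and its arguments are hypothetical, the product name 'DB2' is an arbitrary label, and the password is deliberately left as a placeholder.

```shell
#!/bin/sh
# Sketch: print T-SQL that defines a linked server over an OLE DB provider.
# PROVIDER should be the name exactly as it appears in the Providers list.

# make_linked_server_sql LINKED_SERVER PROVIDER DB2_ALIAS DB2_USER
make_linked_server_sql() {
    name=$1 provider=$2 alias=$3 user=$4
    cat <<EOF
EXEC master.dbo.sp_addlinkedserver
     @server = N'$name',
     @srvproduct = N'DB2',
     @provider = N'$provider',
     @datasrc = N'$alias';
EXEC master.dbo.sp_addlinkedsrvlogin
     @rmtsrvname = N'$name',
     @useself = N'False',
     @locallogin = NULL,
     @rmtuser = N'$user',
     @rmtpassword = N'<password>';
EOF
}

# Example (placeholder values):
#   make_linked_server_sql DB2LINK 'IBMDADB2.<your SQL Server instance>' SALES db2user > linked_server.sql
```

The “Allow inprocess” setting is not covered by this sketch, so you still need to tick it on the provider as described above.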

To check all is well, click on the linked server, expand Catalog/default/Tables/System Tables and you should see the DB2 catalog tables for your chosen database. In case you’re not so familiar with the DB2 system catalog, there’s no Master database in DB2. Each database has its own set of catalog tables and is completely isolated from the other databases in the instance (think SQL Server 2012 Contained databases).

Queries to this form of Linked Server use a 4 part name in the same way that Linked Servers to SQL Server do. For example:

select * from <Linked Server>.<database>.SYSCAT.TABLES

Be careful with the case of object names. In the native DB2 client object names are case insensitive, but through a linked server they are case sensitive. SQL Server appears to pass the names down to DB2 in double quotes, so they have to match the names stored in the catalog (normally upper case) exactly. If you want to avoid 4 part names, create a local view and access the table through that.

So this works for me. As always with linked servers, if you’re planning on using them a lot, test them to death. In particular test the full range of data types that you want to process. Data type translations in Linked Servers are notoriously unreliable, often because of poor metadata from the OLE DB Provider.

Performance

The good news about this type of Linked Server is, when joining tables on the DB2 instance, you can see SQL Server pushing the join down to DB2. The easiest way to test this is to check what SQL statements are being executed in the DB2 instance. I’d recommend doing this on a quiet system so you can pick out your statement easily.

On the DB2 Instance, if the Statement Monitor is off, connect to your source database and start a Statement Monitor for your connection:

$ db2 update monitor switches using statement on

Get a snapshot baseline of the prepared statements and write the snapshot to a file.

$ db2 get snapshot for dynamic sql on <database> > snp.1

Run your query, and get a second snapshot of prepared statements written to a different file.

$ db2 get snapshot for dynamic sql on <database> > snp.2

Now you can compare the two snapshots (I tend to just use diff) to find the statements that were run. Ideally you’ll be quick enough that the second snapshot only differs from the first as a result of your statement.
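If a plain diff is too noisy (the counters change for every statement), you can compare only the statement texts. This is a rough sketch, assuming each statement in the snapshot sits on a single line that starts with “Statement text” followed by “=”; check that against your own snapshot files before relying on it. The function names are my own.

```shell
#!/bin/sh
# Sketch: list statements that appear in the second snapshot but not the first.

# statement_texts FILE: print the distinct statement texts found in FILE.
statement_texts() {
    sed -n 's/^ *Statement text *= *//p' "$1" | LC_ALL=C sort -u
}

# new_statements BEFORE AFTER: print statements present only in AFTER.
new_statements() {
    before=$(mktemp) || return 1
    after=$(mktemp) || return 1
    statement_texts "$1" > "$before"
    statement_texts "$2" > "$after"
    # comm -13 keeps only the lines unique to the second (sorted) file.
    LC_ALL=C comm -13 "$before" "$after"
    rm -f "$before" "$after"
}

# Example:
#   new_statements snp.1 snp.2
```

Statements that were already in the package cache before the baseline will not show up, so this only finds statements that are new since the first snapshot.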

From a simple join such as:

select *
from <LinkedServer>.<Database>.<Schema>.TB1 t1
inner join <LinkedServer>.<Database>.<Schema>.TB2 t2
on t1.col1 = t2.col2
where t1.col2 like 'XXX%'

The snapshot comparison shows 3 statements have been executed. The first 2 are of this form:

select * from "<database>"."<Schema>"."TB1" FETCH FIRST 1 ROWS ONLY

select * from "<database>"."<Schema>"."TB2" FETCH FIRST 1 ROWS ONLY

The third statement is a transformed version of your statement. It looks as if SQL Server makes a simple query against each table in your query, returning one row, to get the result set metadata. Then it prepares and runs your actual query. The initial single row queries should execute very quickly and only add a very small overhead.
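When a query touches many tables, it can help to separate these single row metadata queries from the statement that does the real work. The filter below is a simple heuristic of my own, assuming the metadata queries always end in FETCH FIRST 1 ROWS ONLY as they do in the examples above; a query of your own that ends the same way would be mislabelled.

```shell
#!/bin/sh
# Sketch: label each statement read from standard input (one per line) as a
# single row metadata probe or as a real query.

label_statements() {
    awk '
        /FETCH FIRST 1 ROWS ONLY[ \t]*$/ { print "probe: " $0; next }
        { print "query: " $0 }
    '
}

# Example: label_statements < statements.txt
```

Here statements.txt stands for any file holding one statement text per line, for example the lines you picked out of the snapshot comparison.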

If you are using Linked Servers I would very strongly recommend that you check what is being passed down to the source databases. Against database engines such as DB2, Oracle or MySQL it’s simple enough to do, and the time spent might well be a great deal less than the time you spend waiting for poorly optimised statements to return results.

The Microsoft OLEDB Provider for ODBC using an ODBC connection to DB2.

ODBC was supposed to have been superseded so long ago that it seems bizarre to be writing about it, but for completeness here we go. This works in part. As with the IBM OLE DB Provider, you need to start by installing the DB2 Runtime Client and again you should catalog the remote node and database that you are interested in.

Next use the IBM DB2 ODBC Driver associated with your local DB2 client instance to create an ODBC System DSN. Use the Test Connection option to make sure you’re getting through to the DB2 instance.

The final stage is to create the linked server using OLE DB Provider for ODBC (MSDASQL). Here the only parameters you need to specify are:

  • The Provider (from the drop down list).
  • Your ODBC DSN as the Data Source.
  • Some user credentials in the Linked Server Security tab.

The only way I can get this to work is by using the openquery function, e.g.

select * from openquery(<ODBC DSN>,'select count(*) from SYSCAT.TABLES')

Any attempt at using the 4 part name syntax fails with:

Msg 7313, Level 16, State 1, Line 1
An invalid schema or catalog was specified for the provider "MSDASQL" for linked server "<ODBC DSN>".
