Working with Linked Servers

by Bill Wunder

Linked servers allow you to query virtually any OLE DB data source from T-SQL running on a SQL Server 2000 creating a tremendous backdrop for distributed data architectures. If the OLE DB provider provides the necessary interfaces you can also insert, update, and delete data at that OLE DB data source. Reciprocally, SQL Server is also an OLE DB data source that can be accessed through quite a variety of OLE DB Providers including linked servers from other SQL Servers.

There really are quite a number of interesting possible OLE DB data sources. The better known include Excel, Access, Oracle, DB2, and ODBC. The less well known include LDAP, Active Directory, Exchange, Outlook MySQL, HTML and VSAM. That is certainly not an exhaustive list. In reality give a good C/C++ programmer enough time and they can turn just about anything that can be connected to a computer into an OLE DB data source. There's a nice list of commercial non-Microsoft OLE DB providers at sqlsummit.com and you may also want to consider this well discussed listing of most current Microsoft OLE DB providers in the MSDN Library and this discussion of OLE DB Providers Tested with SQL Server in Books Online where you will find examples of how to configure a connection some common and interesting linked server OLE DB providers. Microsoft provides the SQLOLEDB provider for connection to a SQL Server 2000 as the data source.

Before linked servers, earlier releases of SQL Server supported distributed queries to other SQL Servers using "Remote Servers" via RPC. This required configuration at both the data serving and the data requesting SQL Server and had all the performance, scalability, and security issues of the underlying RPC protocol. Remote Servers are still available in SQL Server 2000 but provided "for backward compatibility only". Looks like they are still there in Yukon beta 1 but if you are using them now, it's a good idea to begin migrating remote servers to linked servers.

Hopefully, in SQL Server 2005 we will be able to use the native .Net providers when creating linked servers and enjoy the many advantages of that provider. Dino Esposito has a very nice online article in Visual Studio magazine: "Create More Efficient Database Code" on the .Net provider and how it differs from the SQLOLEDB provider. Since linked servers are not yet implemented in the Yukon beta 1 release, I guess we'll just have to wait and see if T-SQL linked servers will gain the new .Net powers and efficiency.

So what is an OLE DB provider anyway? According to the "OLE DB Programmers Reference" of Microsoft's MSDN library:

OLE DB architecture is built upon the precept of an application accessing diverse data stores through a small application built specifically for that purpose. The application that uses OLE DB functionality is called the consumer, while the one that accesses the data by exposing OLE DB interfaces is called the provider.

This leads us to an important point when we talk about using linked servers in T-SQL queries. Using a linked server makes the SQL Server where the linked server call is executed the consumer of the provided data. That means that we are asking a SQL Server to be a middle man for data. The network loading and security issues of this can be huge! I have seen many cases where a developer will use a linked server to get data from a remote source rather than making a connection to that source directly from his application. This is just laziness - and not the good fourth lazy form kind of lazy either! - it is simply wrong. SQL Server is..., well..., a server not a client application or a mid tier transport utility. It should be obvious to anyone that moving data from an OLE DB data source to a SQL Server to your application is less efficient and creates a more complex security model than moving data from an OLE DB data source to your application.

OK, so what are Linked Servers good for then? If you need to join data from two or more different data sources linked servers are your friend. Even with this usage linked servers can carry some performance and security costs. Basically what you want to do is use the principle of least privilege and then make sure the record sets transferred on a linked server connection are small or make sure that the rows are moved across the linked server connection as a "fire hose" or fast forward only cursor. That's right! If you watch a linked server call that uses the SQLOLEDB provider to get a multi-row data set from another SQL Server in SQL Profiler trace on that remote server you will see that the data is transferred using a Server API cursor. Just like cursors in any other context, they can slow you down. There is more than one way to work with a query to produce a "fire hose" from the linked server data source. One consistently useful method is to insert the rows into a temp table and then process the necessary join as a local query using the temp table rather than joining across the network. Bottom line here is that good testing will result in good results.

You can configure linked servers using Enterprise Manager but - as with almost anything else you do in SQL Server - I recommend using the system stored procedures to configure linked servers. The system stored procedures are:

 
sp_addlinkedserver - You'll find yet another listing of the OLE DB provider names listed on the sp_addlinkedserver help page in Books Online. Just remember that the preferred provider to link to SQL Servers is SQLOLEDB. Another interesting tip is that you do not need to specify a catalog when defining a SQLOLEDB linked server. The advantage is that the linked server will not "break" if a database is dropped or it's name is changed. That can be a very obscure problem. The limitation is that this does require the database name be provided in all linked server queries. And that should be a best practice anyway.

sp_addlinkedsrvlogin - There are a few security considerations when creating linked servers that you must address when creating logins. Using linked servers on trusted connections without the benefit of Active Directory Delegation can be a real pain. It's definitely a security nightmare, but you can add [NT AUTHORITYANONYMOUS LOGIN] as a trusted login at the data source and grant permissions to this ambiguous login to work around the delegation issue in an environment without delegation. If a SQL login has the same name but different passwords you'll have to map that user as to the same user name with the other password in the linked server login configuration. That also means when you change the password you have to manually update the password in the linked server configuration. You can also map users to other users in the linked server. Be careful because that could get confusing in a hurry.

sp_droplinkedsrvlogin - Keep in mind that when using system stored procedures you must drop the logins to a linked server before you can drop the linked server. Enterprise Manager takes care of all of that for you behind the scenes.

sp_dropserver - Since linked servers are store in master.dbo.sysservers along with remote servers and even the local instance, use the same procedure to remove any of them.

sp_linkedservers or sp_helpserver - sp_helpserver will show you linked servers as well as remote servers. 

sp_serveroption - This is an important one that can slip through the cracks. Use this procedure to establish "data access" - the ability to directly select, insert, update, and delete - across the linked server, "rpc" to call stored procedures on the link, and even connection time out and query time out settings that unfortunately default to unlimited in SQL Server 2000.

 

I use a stored procedure that calls the system stored procedures to assure that the linked servers between the SQL Servers in my shop are all the same. I have also experienced occasional corruption of a linked server so having a stored procedure available that will quickly reproduce a linked server has proven invaluable in crisis mode. I also take advantage of the ability to alias linked servers. So for example if I have an application stored procedure that must reference another server I can create a linked server with the same name in development, acceptance and production so that the application stored procedure does not have to be edited when promoted from development or acceptance.    

Finally, you may be wondering how are linked servers be used in T-SQL code. You can use linked sever in fully qualified or four part named references (linked server.database.owner.object)  or you can use OPENQUERY to execute pass through queries. Using OPENQUERY can help you write those fire hose queries I mentioned earlier without even needing to think about it. Unfortunately, many developers find the OPENQUERY syntax confusing and avoid using it.

To sum it up, linked servers can be very useful, but are better used with small distributed data sets that large data collections. My preference is that linked server calls in applications are looking for a single row or value on the remote data source. Of couse, as everyone likes to remind me, I don't always get what I want.

Bill