Windows IT Pro is the authoritative and independent resource for windows nt, windows 2000, windows 2003, windows xp. Features a collection of resources and magazines for windows IT professionals.
  
  
  Advanced Search 


February 1997

Making the Most of BCP-Seven Tips for Speeding Large Data Loads with Bulk Copy Program


RSS
Subscribe to Windows IT Pro | See More SQL Server and Database Articles Here | Reprints | Or get the Monthly Online Pass—only $5.95 a month!

Speeding Large Data Loads with Bulk Copy Program

The Bulk Copy Program (BCP) isn't sexy or exciting, and it isn't the subject of many articles. Still, it's the most practical way to handle high-speed data loads in SQL Server. This month I'll explain what BCP is and give some tips to help squeeze every drop of performance from it.

BCP is Microsoft's version of an export/import utility. (For a custom file export utility that combines Visual Basic 4.0's TreeView control with SQL Server 6.5's DMO BulkCopy object, see Mike Otey, "Exporting Data from SQL Server," January 1997.) This command-line program moves data between an existing database table or view and an operating-system file stored in ASCII or machine format. Machine format is useful only for transferring data between SQL Servers running on the CPU architectures (i.e., Intel, Alpha, PowerPC); people can't read machine format. BCP has many uses, but it's especially helpful for moving large data sets between SQL Server and other database systems or dealing with extracts from external data feeds or legacy systems.

Sometimes people think BCP is part of the SQL Server engine and try to run commands from ISQL/w or a similar interface. BCP isn't a Transact-SQL command, and it's not part of the server. BCP is a C-language client that communicates with the database using special BCP extensions in the DB-Library (DB-Lib) API (a SQL Server-specific programming interface). Importing or exporting data with an Open Database Connectivity (ODBC)-enabled application such as Microsoft Access is often easier because Microsoft doesn't provide a GUI interface for BCP and the command-line syntax can be arcane and difficult to master. ODBC-enabled applications are OK for small data sets, but you need to avoid ODBC applications when you're bulk loading a lot of data. ODBC applications load data using standard Insert statements, which are many times slower than BCP.

Increasing BCP's Performance
BCP performance can vary significantly with different uses. The following tips can help make BCP fly.

Tip 1: Always use fast BCP.
You load data with BCP in either fast or slow BCP mode. As you can guess, fast BCP is quicker, but you pay a price. Fast BCP forces you to set a database option, Select into/bulk copy with sp_dboption. This option lets nonlogged operations occur; don't leave it on in production databases because it can prevent you from backing up your transaction log. (See SQL Server Books Online--BOL--Effect of Select into/bulk copy on Transaction Logs for more information.) Fast mode also requires you to drop any indexes on the target table. BCP will revert to slow mode if you forget to remove even one index. You can ignore the BOL and third-party references telling you that triggers on a target table also cause BCP to use slow mode: SQL Server 6.0 fixed this problem.

Transaction log activity accounts for the big difference in speed between slow and fast BCP. Slow mode causes almost as much logging as adding data with an Insert statement, whereas fast BCP does not log individual rows but logs new space allocations only when extents are linked into the table's page chain. (Extents are 16KB buffers, blocks of eight pages, used to manage data.) The difference between fast and slow

BCP is so great that data loads are usually faster if you drop the indexes, run fast BCP, then re-create the indexes, instead of using slow mode and fully logging all the data insertions. If your table has a clustered index, sort the data before running BCP. Then create the index using the sorted_data option, which tells SQL Server it doesn't need to re-sort the data when it creates the index.

Tip 2: Run BCP from the server.
Running BCP on the same machine as SQL Server is a great way to boost performance because it eliminates tons of network overhead. Consider the following scenario: Server A runs SQL Server and stores the data files to be loaded. You invoke BCP from Workstation X and load data into Server A. BCP does not run as part of SQL Server, so the data is copied first to Workstation X, which immediately sends it to back to Server A to be loaded. You move a large data file around the network twice, once to the BCP client machine and once to the SQL Server, but you can eliminate the traffic by running BCP directly from the server. In most cases, copying the remote data file to the server with NT's ordinary file copy commands and then running BCP locally is faster than BCP'ing across the network because NT's caching and file transfer mechanisms are more efficient. Even if you don't run BCP from the server, avoid running BCP against a remote data file because this process moves the file around the network twice.

Tip 3: Use local named pipes.BR> When BCP runs on the same machine as SQL Server, using local named pipes greatly speeds the process. Local pipes are an interprocess communication (IPC) mechanism and completely bypass the network to optimize processes running on the same machine. Local pipes act like a TCP/IP loopback so data never goes to the NT Redirector; by comparison, network named pipes send data through the Redirector, even if both processes run locally on the same machine. (You can easily see the difference in Redirector activity in Performance Monitor--Perfmon--by looking at the Bytes Total/sec counter in the Redirector object.) BCP automatically uses a local named pipe if you do not provide a server name when you run the command.

Compaq has a great white paper, Configuration and Tuning of Microsoft SQL Server 6.5 for Windows NT on Compaq Servers (http://www.compaq.com/support/techpubs/whitepapers/415a0696.html), full of in-depth tuning information. The paper compares BCP load times using a variety of packet size and IPC configurations; the comparisons show that local pipes are 300 percent faster than network pipes.

Tip 4: Place BCP and SQL Server data on separate disks.
Running BCP through a local pipe reduces network overhead but can introduce disk I/O-related bottlenecks. Open the throttle by writing your BCP data files to a fast RAID array on a different physical drive from your SQL Server devices. If this approach is impractical in your environment, run benchmarks with your data and your servers to determine which is worse: the network overhead involved with a remote BCP or the contention on the disk while BCP and SQL Server fight for the same I/O bandwidth.

Tip 5: Install Service Pack 1 for SQL Server 6.5.
Service Pack 1 (SP1) for SQL Server 6.5 includes two nifty enhancements that improve BCP load times up to 700 percent and provide throughput rates of approximately 3.5MB per second (MBps), according to Microsoft benchmarks. First, an engine optimization lets BCP write an extent at a time using n private buffers rather than a page at a time using a single public buffer area. You can set n between 1 and 10 using

sp_configure 'backup buffer size'

This action has a big positive impact on speed, if your disk subsystem can handle the load. Experiment to find the best setting.

Enabling the new Table lock on bulk load option in sp_tableoption activates the second improvement, which tells SQL Server to hold one exclusive table lock rather than grab individual locks as each new extent is linked into the page chain. Previously, BCP locked at the extent level, which could deplete the number of configured locks when a program is loading large data sets. Locking at the table level lets SQL Server acquire a single lock for the duration of the BCP load, so you can load data using much larger batch sizes without exhausting the fixed supply of locks. So, set the batch size to the number of rows in the table you're loading. These two enhancements speed data loading with BCP in the fast, nonlogged mode; unfortunately, you won't see much change when you use the slow, logged version of BCP.

The changes open BCP's floodgates to the I/O subsystem, so the boost is negligible unless your controller and disks can keep up with the increased I/O load. After playing with SP1 and talking with others who have used it, I believe the 700 percent gain Microsoft has achieved is optimistic for most sites. A few Microsoft engineers I know have reported 200 percent to 300 percent improvements using fast RAID controllers and lots of disks at corporate sites. In comparison, load times improved by 30 percent on my lowly P120 40MB laptop.

Tip 6: Experiment with packet size.
Before SP1, packet size had little effect as long as the packet size was at least 4KB. A Microsoft engineer familiar with the SP1 BCP enhancements suggests using a packet size of 16KB to take full advantage of BCP's new ability to write data in 16KB extents rather than 2KB pages.

My tests show that 16KB packets improve times slightly, but I haven't tested packet sizes exhaustively. Try playing with this option in your environment.

Tip 7: Use native mode.
BCP can work with data in native SQL Server format or in an ASCII representation, which is more fun to look at. Native format uses slightly less disk space and runs a little faster because data isn't converted to ASCII format. Don't expect the same boost as using a local named pipe, but every little bit can help.

Put 'Em All Together
BCP tuning is one part science, one part art, and three parts experimentation, so I can't guarantee success. But I'd be very surprised if these tips don't improve performance at your site.

BCP isn't the most user-friendly interface in the world, and we all hope that one day Microsoft will provide a more attractive, easier-to-use utility. But for now, BCP is the best game in town, so you might as well make the most of it.

End of Article



Reader Comments
Congratulations on Brian Moran’s great February article, “Seven Tips for Speeding Large Data Loads with Bulk Copy Program.” BCP is a useful SQL Server tool that I have worked with in SQL Server 6.0. I’ve taken advantage of Moran’s tips 1, 2, and 7 to speed up BCP. I ran into a problem from using BCP to quickly move data into a table with existing data. I’ve corrupted the existing data in the table. Microsoft Help desk acknowledged this problem and suggested that I place an index on the table. I argued that this unacceptable workaround would revert back to the slow BCP mode. The response was basically, “You’re out of luck buddy.” Do you have any workarounds?<br>
--Long T. Duong <br><br>

<i>I’m glad some tips were helpful. BCP (a.k.a. Buggy C Program) can be unfriendly at the best of times. BCP has many known bugs, so troubleshooting your problem is difficult without specifics. In some cases, acceptable workarounds don’t exist. You mention SQL Server 6.0. Upgrade to 6.5, because the 6.0 release was infamous for annoying BCP bugs. Many people worked around 6.0 BCP problems by using the version of BCP that shipped with SQL Server 4.2. <br>
--Brian Moran</i>

Long T. Duong August 12, 1999


I just read Brian Moran’s February article, “Seven Tips for Speeding Large Data Loads with Bulk Copy Program,” and found the information excellent! Thank you. I have one question: If the procedure is going to be a system stored procedure, why not just create it with <i>allow updates</i> turned on and then disable this option? The stored procedure will still be able to update system tables when you run it. Reconfiguring the server each time the procedure runs seems unnecessary and potentially dangerous as this is a serverwide configuration. Is this concern valid?<br>
--Paul Armstrong<br><br>

<i>You’re absolutely right! Leaving the sp_configure allow updates command in the main body of the stored procedure was a mistake. I put it there to simplify testing while I was writing the article, and removing it somehow slipped by me. Yes, it’s much better to create the procedure with the allow update option enabled, because the procedure will still be able to modify system tables when the option is disabled. Thanks for pointing this error out. Now I know at least one person read the article! <g> I’m glad the password security information was helpful. <br>
--Brian Moran</i>

Paul Armstrong August 13, 1999


I have enjoyed Brian Moran’s articles. The February article, “Seven Tips for Speeding Large Data Loads with Bulk Copy Program,” was particularly helpful because I find myself wrestling with bcp. I am amazed at how little documentation on bcp exists when people rely on it so much. You’d think Microsoft would do something about that.
With our product, we send out some standard data files that get bcp’d during our product’s install procedure. We need this procedure to work on international versions of NT as well as English (US). However, I prefer not to put together country-specific versions of our data files. This issue also comes up when our application runs and reads in character data (including dates) from an outside system.
I understand that bcp takes its month names from the NT operating system. For instance, bcp running on German (Standard) NT wants to see Dez as the abbreviation for December. I decided to focus on numeric type dates. But, bcp insists that numeric dates follow the format MM/DD/
YY or MM/DD/YYYY (or you can use a dash as a separator) for US English. However, bcp on German NT wants DD/MM/YY or DD/MM/YYYY.
I was hoping that bcp had a country-independent, universal date format—for example, YYYY/MM/DD. However, bcp refuses to take a date if the year is first.
Any ideas? I need the speed of bcp.<br>
--Scott Williams<br><br>

<i>You can use a simple trick to change the date and language defaults bcp uses. Typical user connections can use the SET command to change the default date format, but this method works for only the current connection, and you can’t issue SET from the bcp command line. But you can get bcp to use any date and language defaults you want when you create a new language in SQL Server. Create a new user called BCPUser (or something similar) and set the default language of BCPUser to the language you just created. SQL Server ships with support for many languages and code pages, and you can always use these as templates to create your own. (Books Online explains the process of creating languages, so I won’t cover that.) <br>
--Brian Moran</i>

Scott Williams August 13, 1999


BCP...BCP...BCP....

I really need a workaround to solve the error i am getting while running BCP.
The error is ..
SQLState = S1000, NativeError = 0
Error = [Microsoft][ODBC SQL Server Driver]Unable to open BCP host data-file

The command i use is ...

Select @strSQL=''
Select @strSQL = 'master..xp_cmdshell ''bcp ' + ltrim(rtrim(@DBName)) + '.dbo.rpl_UploadData in ' + char(34) + @FullPath + '\' + ltrim(rtrim(@UploadFileName)) + char(34) + ' -c -q -F 1 -L 1 -SMyServer -Uuser1 -Ppwd'''
Execute (@strSQL)

Where DBName is the current database name, FullPath is the UNC path of the file location and the UploadFileName is the FileName. The above statements were working perfectly few months back, but now suddenly getting the host data-file error.

Any suggestions would be great help.



Sundar Rajan Krishnamurthy June 21, 2000


Dear Brain Moran

I enjoyed reading your article and found it most useful.

We are having problems with a few of our BCP imports and I wonder
if you can either help or point us in the right direction.

Using SQL Server 6.5.
When using BCP on some of the larger data files, we get a windows DLL error.
The DLL in question is: NTWDBLIB.DLL. The error message is:

"BCP caused an invalid page fault in module NTWDBLIB.DLL at 0137;10027e95."

During testing, It seems to work if we reduce the number of fields
we are trying to import. Is there a upper limit on the number of
fields that are allowed to be imported by BCP? Any help or pointers
would be greatly appreciated.

Thanking you in advance for your help.

nikita munn July 21, 2000


I have searched high and low for an easy way to do a simple conversion. Each time I end up importing the data into a table and writting a stored procedure to do it, but I know the BCP command can do it. I can BCP out a table with percentages like 1.00 or .75, and they end up like 100 or 75, which is fine. But when I attempt to BCP in that data, I can't ever get it to convert it back. Please help me figure out how to do the same conversion that happened on the way out.

Thanks


Eddie Panther March 27, 2003


You must log on before posting a comment.

If you don't have a username & password, please register now.




Top Viewed ArticlesView all articles
No Jobs, No Excitement at Apple's Last Macworld Keynote

Apple CEO Steve Jobs made the right move in skipping out on his company's last appearance at Macworld: In a Tuesday keynote address at the conference, Apple had no interesting new products to sell, opting instead to spend mind-numbing amounts of time on ...

Command Prompt Tricks

One reader shares his tip for setting up the command prompt to reflect a remote path. ...

Where is Microsoft NetMeeting in Windows XP?

...


SQL Server and Database Whitepapers StoreVault SnapManagers for Microsoft Exchange and SQL Server

Related Events Virtualization Forum: Optimizing Storage, Networks, Desktops, and Security

Cloud Computing Forum: Integrating Software, Server and Storage as a Service into Your Enterprise IT Delivery Model

Virtualization Forum: Optimizing Storage, Networks, Desktops, and Security

Check out our list of Free Email Newsletters!

SQL Server and Database eBooks Safeguarding Your Windows Servers

SQL Server Administration for Oracle DBAs

Taking Control: Monitoring the Windows Platform Proactively

Related SQL Server and Database Resources Become a VIP member of the Windows IT Pro community!
Get it all with the VIP CD and VIP access. A $500+ value for only $279!

Subscribe to Windows IT Pro!
Solve your toughest technical problems with our experts and access 10,000 + articles online. 30% off

Monthly Online Pass - Only $5.95!
Get instant access to 10,000+ articles from Windows IT Pro Magazine!

TechNet Virtual Labs
Evaluate and test Microsoft's newest products.


Windows IT Pro Home Register FAQ for Windows WinInfo News
Europe Edition About Us Contact Us/Customer Service Media Kit Affiliates / Licensing  
SQL Server Magazine Office & SharePoint Pro Windows Dev Pro IT Job Hound ITTV
IT Library Technology Resource Directory Connected Home Windows Excavator Windows SuperSite 
 
 Windows IT Pro is a Division of Penton Media Inc.
 Copyright © 2009 Penton Media, Inc., All rights reserved. Terms and Use | Privacy Statement | Reprints and Licensing