Click here to Skip to main content
15,881,803 members
Articles / Programming Languages / SQL

SQL Fast GUID List Generator

Rate me:
Please Sign up or sign in to vote.
5.00/5 (5 votes)
27 Jun 2009CPOL3 min read 36K   22   6
The article shows how one can use Microsoft SQL Server to generate millions of guids in seconds.

Introduction

If you want your database to be fast with a huge amount of data, you should test it with large test database. To create large test database, you may need many guids for row ids. If you use Microsoft SQL Server for this purpose (as I do), you may want to boost performance of your generator. With this article, one can create millions of guids in seconds.

Obvious Solution

If you want to create a list of guids and put them into a SQL table, you may write the following:

SQL
--METHOD 1

create table ids(id uniqueidentifier);

declare @guidCount int = 10000;
declare @n int = 0;
while @n < @guidCount
begin
   insert ids values(NEWID());
   set @n = @n + 1;
end;

This code creates only 10 000 guids. But it takes about 1000 ms. Why it is so slow? There are three reasons.

Reason #1. Number of Transactions

Each time INSERT is executed, Microsoft SQL Server starts and commits a transaction. This is not necessary in the case of our task. So we can collect our guids in memory and then insert all guids in a single INSERT statement or we can manually start and commit a transaction.
First look at how we can collect guids into the memory:

SQL
--METHOD 2.1

create table ids(id uniqueidentifier);

declare @ids table(id uniqueidentifier default NEWID());
declare @guidCount int = 100000;
declare @n int = 0;
while @n < @guidCount
begin
   insert @ids default values;
   set @n = @n + 1;
end;

insert ids
select * from @ids;

This code creates 100 000 guids and it takes about 2000 ms, 5 times faster than in first code block where transaction is committed for each guid. Now look how manual transaction affects the performance:

SQL
--METHOD 2.2

create table ids(id uniqueidentifier);

begin transaction;

declare @guidCount int = 100000;
declare @n int = 0;
while @n < @guidCount
begin
   insert ids values(NEWID());
   set @n = @n + 1;
end;

commit transaction;

This code also creates 100 000 guids and it takes about 1300 ms. So it is 1,5 times faster than generation in memory and 7 times faster than the first simple generator.
Great result, but what if you need even more guids? Then we have to review the next reason for slowness of all code blocks mentioned above.

Reason #2. Number of INSERTs

Each time INSERT is executed, there are many wasteful things to be done: compile statement, build (or get from cache) query plan, check parameters, finalize allocated objects, etc. So we need to minimize total count of INSERT executions. We really need only a fast way to generate a rowset which size is equal to number of guids we need to create. If we get one, we just select a new guid for each row and save them into a table. But how we can get a rowset without INSERTs? We can use table cross to get the rowset with required size. If we CROSS JOIN some table (say "counter") onto itself, we get squared number of rows. So If we generate a 1000-row-couter, we get 1 000 000 rows in result rowset. Now apply this approach to the guid generator:

SQL
--METHOD 3

create table ids(id uniqueidentifier);

declare @ids table(id int identity)
declare @guidCount int = 1000000;
declare @n int = 0;
while @n < SQRT(@guidCount)
begin
   insert @ids default values;
   set @n = @n + 1;
end;

insert ids
select NEWID() from @ids i1, @ids i2;

This code creates 1 000 000 guids and it takes only about 3000 ms. This is 30 times faster than the first method and 5-7 times faster than both second ones.

Great result again! But there is a way to get 1 000 000 guids more efficiently.

Reason #3. Number of Locks

When INSERT statement is being executed, Microsoft SQL Server is locking new data row by row or page by page thinking that the table might be accessed by concurrent query. If you are generating test database, there are no concurrent queries. So we may direct Microsoft SQL Server to lock whole table rather than page by page locking. Table hint WITH(TABLOCK) locks the whole table at the beginning of execution.

SQL
--METHOD 3 + WITH(TABLOCK)

create table ids(id uniqueidentifier);

declare @ids table(id int identity)
declare @guidCount int = 1000000;
declare @n int = 0;
while @n < SQRT(@guidCount)
begin
   insert @ids default values;
   set @n = @n + 1;
end

insert ids with(tablock)
select NEWID() from @ids i1, @ids i2;

This code generates 1 000 000 guids in 500 ms! That's really a great result. So one can see that this is very important to obtain a table lock before heavy INSERT execution. If you want to create even more guids - WITH TABLOCK is the only solution. In my test environment (default Microsoft SQL 2008 settings), I can’t generate more than 10 millions guids without table lock. I get an error:

Msg 1204, Level 19, State 4, Line 10
The instance of the SQL Server Database Engine 
cannot obtain a LOCK resource at this time. 
Rerun your statement when there are fewer active users. 
Ask the database administrator to check the lock and 
memory configuration for this instance, or to check for long-running transactions.

Summary

Method NoGUIDs countTime Elapsed, ms
110 0001000
2.1100 0002000
2.2100 0001300
31 000 0003000
3 + tablock1 000 000500

My Test Environment

Test environment: Microsoft SQL 2008 (default settings) on Core 2 Duo 3GHz processor and WD Raptor hard disk.

This article was originally posted at http://anton-burtsev.livejournal.com/2834.html

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Web Developer
Russian Federation Russian Federation
I have started as a C++ developer in far 2000. I'd been developing banking and treasury software for 3 years using C++. In 2003 I switched to .NET at DELL. After that I worked as a project manager on different projects (internal audit, treasury automation, publishing house automation, etc.). Since 2009 I own a small software company specialized in SaaS services and develop a DotNetNuke modules.

Comments and Discussions

 
Questionanswer Pin
qwertEHOK7-Jun-16 0:57
qwertEHOK7-Jun-16 0:57 
GeneralTry method 4 Pin
Trent Tobler28-Jul-09 9:23
Trent Tobler28-Jul-09 9:23 
GeneralEfficient mass inserts Pin
supercat929-Jun-09 8:51
supercat929-Jun-09 8:51 
GeneralRe: Efficient mass inserts Pin
Anton Burtsev20-Jul-09 2:51
Anton Burtsev20-Jul-09 2:51 
GeneralRe: Efficient mass inserts Pin
supercat920-Jul-09 4:39
supercat920-Jul-09 4:39 
GeneralRe: Efficient mass inserts Pin
Anton Burtsev20-Jul-09 20:59
Anton Burtsev20-Jul-09 20:59 
Yes, use bcp.exe on client. Here is an example
1. Create table as follows
create table someTable(n int);
2. Run bcp on client
bcp someDatabase.dbo.someTable in d:\data.txt -S mssqlserver -T -c
where
someDatabase.dbo.someTable - full name of target table
d:\data.txt - path to data on client
mssqlserver – name of computer where SQL Server sits
-T - integrated authentication
-c - character file format

There is one thing I don’t know: how to get bcp installed on client machine. As far as I know it is a part of MS SQL native client tools package. Anyway, you can install Express edition on client to use bcp.


General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.