Click here to Skip to main content
15,885,771 members
Articles / Programming Languages / SQL
Article

The Keyspace Problem

Rate me:
Please Sign up or sign in to vote.
2.22/5 (21 votes)
20 Jul 20063 min read 51.9K   15   17
Just how many customers can you have in your database?

Introduction

If I had a nickel for every time I have written an application with a customer object mapped to a database ...

In a properly normalized database, the issue I am going to present does not occur. However, in every application I have worked on, the problem has existed, and while 64 bit keys will virtually eliminate it in the near future, it is still a cause for concern in systems using 32 bit keys and less (think micro databases on handheld computers, or legacy VB6 apps.)

The problem

Imagine the following database schema. For the sake of this article, the ID columns are identity columns of 8 bits (makes for easier math).

Diagram of a Database Schema

In this database, I will now insert five customers. I will only show customer 1 for brevity sake.

Diagram of a Customer 1

A quick inspection shows us that for the first customer, we have now incremented the order ID table by 3 and the line items table by 15. Now, I will insert 17 more customers and get an error. Where did the error come from? View this illustration of customer ID 17:

Diagram of a Customer 17

A quick analysis shows that the keyspace for line items has been exhausted!

Doing a formal analysis, a formula can be derived:

floor(2 <SUP>k</SUP> /(Π<SUB>1...n</SUB>X));

Plugging in the numbers, 28 /(5*3) = 17; which just so happens to be exactly the number of customers we ended up with!

For a brief explanation of the formula: k is the key size. In this equation, k is fixed; however, there is a more complicated and more general equation that can be derived for tables with mixed key sizes. Xi is the expected number of records in a given table for a particular parent (mean).

Conclusion

There is no substitute for formal analysis of design. This article is presented as an illustration of an all too common scenario in database design which I have seen over and over in my consultancy. While 64 bit keys effectively eliminate this issue, it is masking the problem rather than identifying it. After all, a well designed database occupies the smallest footprint possible. Using 64 bit keys on a domain of 5 items is ludicrous.

As a small addendum: this problem does exist for 32 bit key sizes, especially in Visual Basic. Imagine a medium sized company with a lot of repeat business. Oh, I don't know, how about a delivery grocery store in a city with a population of 800,000. The max draw is probably 3% of the target market, yielding a maximum customer base of 24,000 customers in the metro area. Each customer will need groceries weekly, and purchase, on an average, 150 items. (Don't believe, look at your last grocery receipt.)

232/(52*150) = 550,636 customers. Expand to 5 years of operations, and 232/((5*52)*150) = 110,127 customers.

Seemingly, there is not a problem. However, if such a business model is a success (a draw of 3% is a success), a prudent business person would expand to other cities. With this simple model, the expansion into 7 cities would cripple the database and the code it was written against.

Notes

I rewrote this article to remove the confusion about my point, on 2006/07/18.

Points of interest

A good place to find answers to hard problems: Google.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here


Written By
Architect ERL GLOBAL, INC
United States United States
My company is ERL GLOBAL, INC. I develop Custom Programming solutions for business of all sizes. I also do Android Programming as I find it a refreshing break from the MS.

Comments and Discussions

 
QuestionTempest in a Teapot? [modified] Pin
W Balboos, GHB31-Jul-06 10:19
W Balboos, GHB31-Jul-06 10:19 
AnswerRe: Tempest in a Teapot? Pin
Ennis Ray Lynch, Jr.31-Jul-06 10:25
Ennis Ray Lynch, Jr.31-Jul-06 10:25 
GeneralRe: Tempest in a Teapot? Pin
W Balboos, GHB2-Aug-06 2:39
W Balboos, GHB2-Aug-06 2:39 
GeneralRe: Tempest in a Teapot? Pin
Ennis Ray Lynch, Jr.2-Aug-06 4:30
Ennis Ray Lynch, Jr.2-Aug-06 4:30 
GeneralNot sure on your assumptions Pin
simon.proctor21-Jul-06 3:05
simon.proctor21-Jul-06 3:05 
GeneralRe: Not sure on your assumptions Pin
Ennis Ray Lynch, Jr.22-Jul-06 19:53
Ennis Ray Lynch, Jr.22-Jul-06 19:53 
GeneralRe: Not sure on your assumptions Pin
simon.proctor23-Jul-06 0:05
simon.proctor23-Jul-06 0:05 
GeneralThis is bad... Pin
The Dark One 66618-Jul-06 0:14
The Dark One 66618-Jul-06 0:14 
GeneralI suppose I messed up Pin
Ennis Ray Lynch, Jr.18-Jul-06 3:22
Ennis Ray Lynch, Jr.18-Jul-06 3:22 
QuestionHave you ever heard about GUIDs? Pin
..Hubert..17-Jul-06 23:27
..Hubert..17-Jul-06 23:27 
QuestionWhat's the problem? Pin
mav.northwind15-Jul-06 1:45
mav.northwind15-Jul-06 1:45 
AnswerYou would have to had worked on VB6 Pin
Ennis Ray Lynch, Jr.17-Jul-06 10:21
Ennis Ray Lynch, Jr.17-Jul-06 10:21 
GeneralRe: You would have to had worked on VB6 [modified] Pin
mav.northwind17-Jul-06 20:40
mav.northwind17-Jul-06 20:40 
GeneralMy article Pin
Ennis Ray Lynch, Jr.18-Jul-06 3:21
Ennis Ray Lynch, Jr.18-Jul-06 3:21 
GeneralRe: My article Pin
mav.northwind18-Jul-06 9:28
mav.northwind18-Jul-06 9:28 
GeneralCheck back in a few days Pin
Ennis Ray Lynch, Jr.18-Jul-06 10:43
Ennis Ray Lynch, Jr.18-Jul-06 10:43 
GeneralRe: Check back in a few days Pin
mav.northwind19-Jul-06 7:50
mav.northwind19-Jul-06 7:50 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.