Click here to Skip to main content
15,914,373 members

Welcome to the Lounge

   

For discussing anything related to a software developer's life but is not for programming questions. Got a programming question?

The Lounge is rated Safe For Work. If you're about to post something inappropriate for a shared office environment, then don't post it. No ads, no abuse, and no programming questions. Trolling, (political, climate, religious or whatever) will result in your account being removed.

 
GeneralRe: What flavor of voodoo is this? Pin
Jörgen Andersson11-Sep-19 22:22
professionalJörgen Andersson11-Sep-19 22:22 
GeneralRe: What flavor of voodoo is this? Pin
honey the codewitch11-Sep-19 22:25
mvahoney the codewitch11-Sep-19 22:25 
GeneralRe: What flavor of voodoo is this? Pin
Jörgen Andersson12-Sep-19 5:59
professionalJörgen Andersson12-Sep-19 5:59 
GeneralRe: What flavor of voodoo is this? Pin
honey the codewitch12-Sep-19 6:00
mvahoney the codewitch12-Sep-19 6:00 
GeneralRe: What flavor of voodoo is this? Pin
Jörgen Andersson12-Sep-19 6:02
professionalJörgen Andersson12-Sep-19 6:02 
GeneralRe: What flavor of voodoo is this? Pin
honey the codewitch12-Sep-19 7:09
mvahoney the codewitch12-Sep-19 7:09 
GeneralRe: What flavor of voodoo is this? Pin
Jörgen Andersson12-Sep-19 7:11
professionalJörgen Andersson12-Sep-19 7:11 
GeneralRe: What flavor of voodoo is this? Pin
kalberts12-Sep-19 9:04
kalberts12-Sep-19 9:04 
I read through this thread and really cannot understand what's the "problem". Where the voodoo comes in. There is very little that surprises me. Or do I mistunderstand what you are talking about?

Say you've got 4kiByte blocks, 32bit key values, then an index block holds up to 500 downwards pointers (the 96 bytes are for horizontal pointers and other sorts of management info). Root + 1 level provide pointers to a quarter million blocks. If your data records are small enough to fit 40 to the block (i.e. less than 100 bytes), this might be enough for 10 million records: Search the root index (use binary search if you prefer!), follow associade pointer to next index level. Search this root index (again binary search if you like), and there is youv'e got the pointer to the block where your data are located. The data block may have a small index with key/offset pairs to each record; this is so small that a sequential seach probably is as fast as a binary.

If blocks are at minimum filling, you may, for ten million 100 byte records, end up with two index levels below the root. If keys are not integer, but require more space, you may not be able to pack 330-500 key/pointer pairs at the index levels; this may also cause a second level to be established. But still: Three small binary searches to get to the right block isn't much work. If this is a disk structure: The root will definitely remain i cache throughout, and unless you are extremely tight on RAM, all or much of the first level below will as well. Often you use sorted structures because the records are frequently processed in sorted order. So while the second index level is not necessarily all in RAM, the relevant parts of it may be, in those cases where you do a more or less sequential processing. (If the sequential ordering means nothing to your use of the data, then you should use a hash method, not B-tree!)

You quote a single figure for both insert and removal. When you make your timings, you should make a large number of calls, quoting minimum, maximum, and average (or even more statistics-oriented figures, if that's within you field of knowledge). If you enter a million records, most of them will be super-fast; there is still room in that block for accomodating another one (key, at index level, or record, at leaf level). Every now and then, a block fills up and must be split (more often at the leaf level than at the index level, if data itself is kept in the block, not just pointers). How often this happens depends on the key size (for index blocks) and record size (for the data blocks), and the block size.

At rare occasions, the entire tree is full and the number of levels must be added. That could be quite a lot more time consuming; if the tree is to be balanced according to B-tree rules all the time, this might require some shuffeling stuff around.

Remember that B-tree rules doesn't absolutely require you to immediately float a block split up to the index level above. A search may have to skip to the next "horizontal" block, until it finds a key higher than the candidate key. Some implementations are rather lax, leaving split blocks for quite some time before updating the index. This might reduce the average add time, at the expense of (asynchronous) "cleanup" routines and somewhat higher average search times.

On the average, B-trees are quite fast for searching. But one thing that holds for all sorts of index structures: If you've got ten million new records for inserting into the tree, the right way to it is NOT one-by-one, in unsorted order. Any B-tre implementation should provide a "batch insert", sorting those ten million records by an nlogn method before inserting them into the tree, and do this in sequential key order, filling up leaf blocks from one end, one block at a time, without worrying about the higher index levels, linking in new blocks through horizontal pointers as needed. Once the bottom layer is done (without concern for higher level indexes), you go up one level to insert pointers all the new blocks, completing that level before ascending further, level by level, up to the root.
GeneralRe: What flavor of voodoo is this? Pin
honey the codewitch12-Sep-19 10:45
mvahoney the codewitch12-Sep-19 10:45 
Generalwell gosh, I feel dumb now. Pin
honey the codewitch11-Sep-19 13:01
mvahoney the codewitch11-Sep-19 13:01 
GeneralRe: well gosh, I feel dumb now. Pin
Super Lloyd11-Sep-19 14:02
Super Lloyd11-Sep-19 14:02 
GeneralRe: well gosh, I feel dumb now. Pin
honey the codewitch11-Sep-19 14:04
mvahoney the codewitch11-Sep-19 14:04 
GeneralRe: well gosh, I feel dumb now. Pin
Super Lloyd11-Sep-19 14:10
Super Lloyd11-Sep-19 14:10 
GeneralRe: well gosh, I feel dumb now. Pin
honey the codewitch11-Sep-19 14:11
mvahoney the codewitch11-Sep-19 14:11 
GeneralRe: well gosh, I feel dumb now. Pin
Super Lloyd11-Sep-19 14:14
Super Lloyd11-Sep-19 14:14 
GeneralRe: well gosh, I feel dumb now. Pin
Super Lloyd11-Sep-19 14:13
Super Lloyd11-Sep-19 14:13 
GeneralRe: well gosh, I feel dumb now. Pin
honey the codewitch11-Sep-19 14:14
mvahoney the codewitch11-Sep-19 14:14 
GeneralRe: well gosh, I feel dumb now. Pin
Super Lloyd11-Sep-19 14:17
Super Lloyd11-Sep-19 14:17 
GeneralRe: well gosh, I feel dumb now. Pin
Super Lloyd11-Sep-19 14:16
Super Lloyd11-Sep-19 14:16 
GeneralRe: well gosh, I feel dumb now. Pin
TheGreatAndPowerfulOz11-Sep-19 14:44
TheGreatAndPowerfulOz11-Sep-19 14:44 
GeneralRe: well gosh, I feel dumb now. Pin
honey the codewitch11-Sep-19 14:44
mvahoney the codewitch11-Sep-19 14:44 
GeneralRe: well gosh, I feel dumb now. Pin
TheGreatAndPowerfulOz12-Sep-19 2:38
TheGreatAndPowerfulOz12-Sep-19 2:38 
GeneralRe: well gosh, I feel dumb now. Pin
honey the codewitch12-Sep-19 8:50
mvahoney the codewitch12-Sep-19 8:50 
GeneralRe: well gosh, I feel dumb now. Pin
TheGreatAndPowerfulOz12-Sep-19 12:06
TheGreatAndPowerfulOz12-Sep-19 12:06 
GeneralRe: well gosh, I feel dumb now. Pin
honey the codewitch12-Sep-19 12:09
mvahoney the codewitch12-Sep-19 12:09 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.