"Things would be different if I ran the zoo." -- Dr. Seuss
---
Too young (and broke) for that.
---
I've seen the reference implementations for Dictionary and SortedDictionary. I've even fiddled with them in binary form way back when with Reflector. I expect the insert and removal times to be longer (I'd be surprised if they weren't) but the searches? This is just wrong.
(The AVL tree implementation sucks; it will be improved.)
Dictionary adds: 2.3903ms
Sorted Dictionary adds: 4.5114ms
SortedAvlTreeDictionary adds: 845.7137ms
SortedBTreeDictionary adds: 6.4452ms
Dictionary searches: 0.6058ms
Sorted Dictionary searches: 1.9608ms
SortedAvlTreeDictionary searches: 2.4709ms
SortedBTreeDictionary searches: 1.8993ms
Dictionary removes: 0.5806ms
Sorted Dictionary removes: 3.5787ms
SortedAvlTreeDictionary removes: 676.9056ms
SortedBTreeDictionary removes: 4.5226ms
There's NO WAY this is happening without some black magic. Like some specialized op for their dictionaries.
WTH
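For the curious, the harness is roughly this shape. This is a sketch rather than the exact code, assuming my SortedBTreeDictionary exposes the usual IDictionary<TKey, TValue> surface, and using sequential int keys for brevity:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class Bench
{
    // Time a single phase and report it in the same format as the results above.
    static void Time(string name, Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        Console.WriteLine(name + ": " + sw.Elapsed.TotalMilliseconds + "ms");
    }

    static void Run(IDictionary<int, int> target, string name, int count)
    {
        Time(name + " adds", () => { for (int i = 0; i < count; ++i) target.Add(i, i); });
        Time(name + " searches", () => { for (int i = 0; i < count; ++i) target.TryGetValue(i, out _); });
        Time(name + " removes", () => { for (int i = 0; i < count; ++i) target.Remove(i); });
    }

    static void Main()
    {
        const int count = 100_000;
        Run(new Dictionary<int, int>(), "Dictionary", count);
        Run(new SortedDictionary<int, int>(), "Sorted Dictionary", count);
        // Run(new SortedBTreeDictionary<int, int>(), "SortedBTreeDictionary", count);
    }
}
```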
*EDIT*
Changed the tests to 1 million items per dictionary.
Oh...
Dictionary adds: 514.7477ms
Sorted Dictionary adds: 1367.028ms
SortedBTreeDictionary adds: 1039.1983ms
Dictionary searches: 88.4017ms
Sorted Dictionary searches: 410.1569ms
SortedBTreeDictionary searches: 138.451ms
Dictionary removes: 84.7735ms
Sorted Dictionary removes: 725.1243ms
SortedBTreeDictionary removes: 350.6078ms
WOOT
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
modified 11-Sep-19 22:31pm.
---
Holy socks, Batman.
10 million rows. Now I need a beer. I don't drink, but still.
Dictionary adds: 5221.7078ms
Sorted Dictionary adds: 11082.8459ms
SortedBTreeDictionary adds: 10185.292ms
Dictionary searches: 586.3837ms
Sorted Dictionary searches: 3195.7039ms
SortedBTreeDictionary searches: 761.9951ms
Dictionary removes: 664.734ms
Sorted Dictionary removes: 6678.1001ms
SortedBTreeDictionary removes: 3710.8476ms
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
---
Reported as inappropriate for the Lounge.
«One day it will have to be officially admitted that what we have christened reality is an even greater illusion than the world of dreams.» Salvador Dali
---
How'd that work out for you last time?
I hope you make a better dev than a hall monitor.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
---
I respectfully refer you to the first four words of point #2 of the sticky:
Chris Maunder wrote: Technical discussions are welcome ... It used to be in the explanatory text at the top too.
Cheers,
Peter
Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012
---
@chris-maunder
Thank you for the polite tone of your response, Peter.
I am aware of Chris' comment on the prior post, and I do not interpret it as being a "blanket" invitation for posts, like this one, which are nothing but code. There is no "discussion" inherent in this post.
I take the privilege of being a "mentor" on CP seriously, and when I see content like this, which could contribute to the quality of CP in the long run ... if posted on the appropriate technical forum ... I will speak out.
cheers, Bill
«One day it will have to be officially admitted that what we have christened reality is an even greater illusion than the world of dreams.» Salvador Dali
---
So, your initial post was serious? I thought you were joking.
I welcome these kinds of posts about algorithms in the Lounge. It was not a question; it was an open remark about performance.
Well, I can't believe this discussion is even taking place.
---
The .NET BCL code has two advantages:
1. It runs in release mode (yours probably runs in debug mode).
2. It has been precompiled ahead of time (i.e., NGEN). NGEN sometimes does a better job, though I'm not sure it helps that much on a library (as opposed to a program).
(Bonus point) Well, you said you checked the reference implementation, so this might not be true, but the .NET BCL is heavily optimised (I've used Reflector at times and was surprised by the lengths they occasionally go to to make common code paths faster).
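You can actually verify point 1 from code, by the way; a debug build stamps the assembly with a DebuggableAttribute that turns the JIT optimizer off. A quick sketch:

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

static class BuildCheck
{
    static void Main()
    {
        // Debug builds carry [Debuggable] with IsJITOptimizerDisabled == true;
        // release builds either omit the attribute or leave the optimizer enabled.
        var attr = Assembly.GetExecutingAssembly()
            .GetCustomAttribute<DebuggableAttribute>();
        bool optimized = attr == null || !attr.IsJITOptimizerDisabled;
        Console.WriteLine("JIT optimizer enabled: " + optimized);
    }
}
```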
---
Well, I expect the base Dictionary to be faster because it doesn't have to sort the keys.
But I was surprised to see SortedDictionary (a red-black tree, AFAIK) doing as well as it was, at least until I started throwing huge data at it.
Finally, the results started to make sense as I increased the order from 10k to 100k and beyond.
My AVL implementation has a really bad flaw in it, so the perf results on it are no good at the moment.
I should add: the types of performance improvements you're talking about are generally linear-time improvements, and I'm dealing with things that should yield orders-of-magnitude improvements.
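To put rough numbers on that distinction; this is just back-of-the-envelope math, assuming a balanced binary tree and a hypothetical B-tree branching factor of 128:

```csharp
using System;

static class Depths
{
    static void Main()
    {
        // Per lookup: hash table ~O(1) probe, red-black tree ~log2(n)
        // comparisons (one pointer hop each), B-tree ~log128(n) node visits.
        foreach (int n in new[] { 10_000, 1_000_000, 10_000_000 })
        {
            Console.WriteLine("n={0,10}: ~{1:F0} binary-tree comparisons, ~{2:F1} B-tree node visits",
                n, Math.Log(n, 2), Math.Log(n, 128));
        }
    }
}
```

That gap is why a constant-factor tweak in the BCL shouldn't be able to hide an algorithmic difference at scale.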
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
modified 11-Sep-19 23:16pm.
---
Looks like a fun way to improve your coding!
Good luck with that and the improvements!
Any chance I can give you some homework?
In AvalonEdit, their text model uses a "Rope", which looks like an IList but has fast insert and delete: O(log n) cost instead of O(n).
In fact, their code is probably quite nice... but after looking at it for only two minutes I was confused; it's all mixed up with the text model, if I remember right...
Maybe you'd be interested in making a standalone Rope class?!
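Conceptually it's something like this; a bare-bones sketch of the idea, not AvalonEdit's actual code, and a real rope would also rebalance to keep the tree shallow:

```csharp
// Minimal immutable rope: a binary tree of string fragments.
// Concatenation is O(1) (allocate one node); indexing walks O(depth)
// nodes instead of copying, which is the win over a flat array.
abstract class Rope
{
    public abstract int Length { get; }
    public abstract char this[int index] { get; }

    public static Rope operator +(Rope left, Rope right) => new Concat(left, right);
    public static implicit operator Rope(string s) => new Leaf(s);

    sealed class Leaf : Rope
    {
        readonly string _text;
        public Leaf(string text) { _text = text; }
        public override int Length => _text.Length;
        public override char this[int index] => _text[index];
    }

    sealed class Concat : Rope
    {
        readonly Rope _left, _right;
        public Concat(Rope left, Rope right) { _left = left; _right = right; }
        public override int Length => _left.Length + _right.Length;
        public override char this[int index] =>
            index < _left.Length ? _left[index] : _right[index - _left.Length];
    }
}
```

Insert and delete then become "split the tree, add or drop a piece, concatenate", touching O(log n) nodes instead of shifting n characters.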
---
Sounds like maybe a segmented linked list, where each link can contain multiple elements. That way, instead of reallocating an array, it just has to add a new link. Traversing is still about as fast.
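Something like this node layout, if I'm picturing it right; a sketch of the concept, not AvalonEdit's code:

```csharp
using System;

// Segmented (unrolled) linked list node: each link holds a small
// fixed-size chunk of elements, so an insert that would otherwise force
// a full array realloc just splits one chunk and links in a new node.
sealed class Segment<T>
{
    public const int ChunkSize = 64;
    public readonly T[] Items = new T[ChunkSize];
    public int Count;
    public Segment<T> Next;

    // When a chunk fills up, move its upper half into a new link.
    public Segment<T> Split()
    {
        var half = new Segment<T> { Next = Next };
        int moved = Count / 2;
        Array.Copy(Items, Count - moved, half.Items, 0, moved);
        half.Count = moved;
        Count -= moved;
        Next = half;
        return half;
    }
}
```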
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
---
You can check for yourself whether there is any voodoo in there...
Reference Source[^]
"The only place where Success comes before Work is in the dictionary." Vidal Sassoon, 1928 - 2012
---
Like I said, I've seen the reference source.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
---
You talked about Reflection... It is not Reflection; it is the source used to compile .NET...
"The only place where Success comes before Work is in the dictionary." Vidal Sassoon, 1928 - 2012
---
No, I didn't talk about reflection. I mentioned Reflector, which is a tool used to decompile .NET assemblies. Despite the name, it has nothing to do with reflection in .NET.
I know exactly what the reference source is. I said I looked at it, so of course I'd know what it is.
I've seen Dictionary.cs. I've seen SortedDictionary.cs.
I've also seen the decompiled contents of mscorlib and System.dll.
So I know what this code looks like. BTW, the decompiled versions are more optimized than the reference source.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
---
honey the codewitch wrote: the decompiled versions are more optimized than the reference source
Which is a plus for the compiler.
I see you went through the code, so why talk of voodoo?
"The only place where Success comes before Work is in the dictionary." Vidal Sassoon, 1928 - 2012
---
Because their code performs better than expected for a red-black tree (which is what they're using in SortedDictionary) compared to my SortedBTreeDictionary.
Although mine destroys Microsoft's for searching after about 30,000 items.
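Which starts to make sense when you count memory touches: a red-black tree pays one pointer hop per comparison, while a B-tree resolves a whole run of comparisons inside one contiguous key array per node. A sketch of the node-level search (not my exact code):

```csharp
using System;
using System.Collections.Generic;

// Why B-tree search scales well: each node packs many sorted keys into
// one array, so a lookup costs ~log_m(n) pointer hops instead of ~log2(n).
sealed class BTreeNode<TKey, TValue>
{
    public TKey[] Keys;                          // sorted; first Count slots used
    public TValue[] Values;
    public BTreeNode<TKey, TValue>[] Children;   // null for a leaf; Count + 1 slots
    public int Count;

    public bool TryFind(TKey key, IComparer<TKey> cmp, out TValue value)
    {
        int i = Array.BinarySearch(Keys, 0, Count, key, cmp);
        if (i >= 0) { value = Values[i]; return true; }
        if (Children == null) { value = default(TValue); return false; }
        // ~i is the insertion point, i.e. the child whose subtree covers the key.
        return Children[~i].TryFind(key, cmp, out value);
    }
}
```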
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
---
honey the codewitch wrote: although mine destroys Microsoft's for searching after about 30,000 items.
This is exactly the reason for using B-trees in a database.
They're not very sensitive to size, which is something you regularly get in a database. While more than 30000 items is something you very seldom need to (or should) handle in memory.
---
Jörgen Andersson wrote: While more than 30000 items is something you very seldom need to (or should) handle in memory.
That's becoming less true. What's funny is, by my tests, .NET is just fine with 3 million entries spread across three different dictionary classes.
The heap isn't as big as you'd expect, and the performance is really good for both the base Dictionary class (which is basically unsorted and uses a hash lookup) and for my class, while being unsurprising for the other class.
Times are changing. An in-memory DB is totally doable, even in C#, for smaller DBs.
I was thinking of backing JSON with something like this, or implementing a full B+ tree with backing storage for it.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
---
That will be quite interesting.
---
I had 10 million entries in each of these collections last night. The B-tree rocked.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
---
Looking forward to an article.
---
I plan to, but I want to make a little suite of sorted dictionaries for that.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.