|
Fair enough, although my previous offering was similar but less efficient, which forced me to rethink how queries worked, then redesign and rewrite to make something better. That's the main difference between this article and my last one from a few days ago. Oh well, maybe I've just approached article saturation point WRT JSON, but lately I've been doing heavy research into better ways to do it, which leads me to produce a lot of CP content.
Real programmers use butterflies
|
honey the codewitch wrote: maybe I've just approached article saturation point
Take a break. Relax. Enjoy the Holidays!!
Your work has merit, purpose, and place. Just give it time, and don't fret about it so much.
Just my opinion.
Cheers.
|
I especially like to occupy myself during the holidays because, at least to me, they mainly feel like mass retail therapy, and I find that depressing. I'm visiting my sister after the holidays (due to the 'rona) so I can dump toys on the nephew. I'm guilty of falling into the retail trap too, but I do it for the kids. Anyway, I'll enjoy it then.
Real programmers use butterflies
|
If you want accurate download numbers, then you might want to remove the GitHub link. I pulled your source code off GitHub six hours ago when I commented on your last thread.
Best Wishes,
-David Delaune
|
That's true. I like linking to my GitHub if I release something while I'm still working on it.
What did you think of the article? Was it digestible, understandable?
I'd love your thoughts on it because it's so different than how traditional JSON processors work.
Real programmers use butterflies
|
honey the codewitch wrote: What did you think of the article? Was it digestible, understandable?
It reads like a manual in the beginning... which is what I would be looking for if I were using your code. 'Coding this Mess' is where I was able to immerse myself in your development story.
honey the codewitch wrote: I'd love your thoughts on it because it's so different than how traditional JSON processors work.
I don't have an opinion on the code. I would need to do a full review.
I would like to add that the fastest known JSON parser in the world right now is simdjson[^]. But it depends on SIMD instructions such as Intel AVX2 for that speed.
Best Wishes,
-David Delaune
|
Yeah I don't use specialized instructions because this isn't about bit twiddling, but coming up with an algorithmic improvement over traditional JSON processing.
The other thing about my library is that its first priority is efficient RAM use. Its second priority is raw speed.
Still, I'd stack it up against most, if not all, JSON processors in terms of speed, because it does partial parsing. And when it does parse, my library's primary speed advantage is that it only reads a string once, not twice like most libraries do: once to get it off the "disk" (input source), and then again to compare it. Mine does all string comparisons in a streaming fashion, right off the "disk" (input source).
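To make the single-pass comparison concrete, here is a minimal Python sketch. It is not the library's code (which is C++): the function name `stream_equals`, the assumption that the opening quote has already been consumed, and the limited escape handling are all made up for illustration.

```python
import io

# Simple one-character JSON escapes. \uXXXX escapes are left out of this sketch.
_ESCAPES = {'"': '"', '\\': '\\', '/': '/', 'b': '\b',
            'f': '\f', 'n': '\n', 'r': '\r', 't': '\t'}

def stream_equals(src, target):
    """Compare a JSON string read from src against target in one pass.

    Hypothetical helper: assumes the opening quote was already consumed.
    It reads through the closing quote and never builds the string in memory.
    """
    i = 0              # how many characters of target have matched so far
    matched = True
    while True:
        ch = src.read(1)
        if ch == '':
            raise ValueError("unterminated string")
        if ch == '"':  # closing quote: equal only if all of target matched
            return matched and i == len(target)
        if ch == '\\':
            esc = src.read(1)
            if esc not in _ESCAPES:
                raise ValueError("unsupported escape in this sketch")
            ch = _ESCAPES[esc]
        if matched and i < len(target) and target[i] == ch:
            i += 1
        else:
            matched = False   # keep reading so the stream ends up past the string

src = io.StringIO('name": 42}')     # the text just after an opening quote
print(stream_equals(src, "name"))   # True
print(repr(src.read()))             # ': 42}'
```

Even on a mismatch the loop keeps consuming up to the closing quote, so the caller can carry on with whatever follows the string.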
I'd be curious about simdjson because it's the only one I've found that might be competitive, but my problem with it is RAM use. It's demand/lazy parsed, but it still parses into memory. I don't. The only time my values get into memory is if they're specifically requested from a query. Everything else is streamed.
simdjson is a fully validating parser. Mine isn't, typically, although you *can* use it that way - it's just slower. simdjson probably tans my library's hide when it comes to validated parsing, because I did nothing really to optimize that mode.
Real programmers use butterflies
modified 23-Dec-20 9:56am.
|
I haven't read your article, but this is what I aim for:
- what the code does
- the rationale behind its design
- overview of the classes involved
I avoid details of how the code works unless there are key points. Code pasted into the article may have details in its comments, but that's more accidental. I leave the details to the comments in the download, figuring that anyone who's really interested will look at the code itself. This limits the articles to a reasonable size even though many of them cover somewhat broad topics.
|
I try to cover the first two in the Introduction, and then I flesh it out while adding the 3rd in "Conceptualizing this Mess" (my typical Background section).
In this case I went with a slightly different format than my usual, and I also detailed major methods under different sections like "Navigation" and "In-Memory Trees".
I think that may put people off though, as it's a lot to scroll through before the "Coding this Mess" section where I take the abstract stuff and make it concrete.
I like to put comments in the code in my articles because it's easier to "fisk" my own code line by line with comments than to direct the reader to commentary in the paragraph below. In that paragraph I like to summarize what I did in the code. I find people seem to receive that well. But YMMV.
Real programmers use butterflies
|
To be honest... I haven't read all of your items, but I liked the ones I read.
I don't really care about the length of an article, as long as the text is not unnecessarily repeating things and is not telling bullsh1t.
In written communication we miss every non-verbal aspect that is so important when talking face to face (or by video conference these days). So if the text is telling a story, explaining things in a proper way and making it a light read... I will never complain about the length.
About the usability and usefulness... I don't think I will use many of the things you post about, but that doesn't mean I don't appreciate your work.
Just a piece of advice... although I know it is important to you, try to care a bit less about the stats of your posts. Your life will be more relaxed, and you won't stress yourself or get disappointed sooner than you need to.
M.D.V.
If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about?
Help me to understand what I'm saying, and I'll explain it better to you
Rating helpful answers is nice, but saying thanks can be even nicer.
|
@JSOP PIEBALconsult, if I remember correctly, you were saying you were running into a performance bottleneck with your bulk JSON processor?
I don't know if you can run C++ binaries on your server, but there's some source code I'd love for you to try, and it may help you speed up your uploads significantly. Feel free to fork it and take ownership. If licensing is an issue, email me, as I'm flexible and may consider making it PD. You'll have to add code to connect to the DB in C++, though.
Diet JSON and a Coke: An exploration of incredibly efficient JSON processing[^]
This was originally ported from C# and then improved, but there's a good possibility I'll wind up porting it back to C# as well.
Real programmers use butterflies
modified 23-Dec-20 8:49am.
|
I don't have any current processes that do bulk json processing. Are you perhaps thinking of someone else?
".45 ACP - because shooting twice is just silly" - JSOP, 2010 ----- You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010 ----- When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013
|
I suppose so. I thought it was you doing bulk JSON uploads but I guess not. Sorry for the churn.
Real programmers use butterflies
|
No prob
".45 ACP - because shooting twice is just silly" - JSOP, 2010 ----- You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010 ----- When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013
|
I believe that might have been @marc-clifton
<edit >sorry Marc, remembered wrong </edit>
Wrong is evil and must be defeated. - Jeff Ello
Never stop dreaming - Freddie Kruger
|
Might have been. Whoever it was said they couldn't run 3rd party code like Newtonsoft on their server.
Real programmers use butterflies
|
That's who it was. Now I remember! I don't know why I was thinking JSOP other than they both seem similarly gruff to me.
Real programmers use butterflies
|
here[^]
Wrong is evil and must be defeated. - Jeff Ello
Never stop dreaming - Freddie Kruger
|
Thank you. What a thread sleuth. I couldn't remember which one it was under.
Real programmers use butterflies
|
I was involved otherwise I wouldn't have remembered
Wrong is evil and must be defeated. - Jeff Ello
Never stop dreaming - Freddie Kruger
|
I was the OP and I didn't remember.
To be honest, my memory is garbage. My RAM is bad.
I'm amazed I can code with how rickety it is.
Real programmers use butterflies
|
For having such bad RAM you're amazingly productive.
Wrong is evil and must be defeated. - Jeff Ello
Never stop dreaming - Freddie Kruger
|
I can't deploy DLLs to the servers, so I basically can use only C# which I write myself. That and the ADO.net providers for Oracle and Teradata. Other than that, it has to be part of .net 4.6 -- though I hope we get at least 4.7 soon (as mentioned in another post).
I'm fine with my parser at this time, but I look forward to trying what Microsoft has once it's available to me -- it may prove faster, it may not, but at this time I have nothing against which to benchmark mine.
A sort of simplified diagram of the layers of my parser:
 ____________________________________________________________
|                                                            |
| Loop:                                                      |
|   Get the next token (JSONitem).                           |
|                                                            |
|   If the token is a value:                                 |
|     Unquote it and add it to the item on top of the stack. |
|                                                            |
|   If the token is the start of an object:                  |
|     Instantiate a new object.                              |
|     Add it to the current item on top of the stack.        |
|     Push it onto the stack.                                |
|                                                            |
|   If the token is the start of an array:                   |
|     Instantiate a new array.                               |
|     Add it to the current item on top of the stack.        |
|     Push it onto the stack.                                |
|                                                            |
|   If the token is the end of an object:                    |
|     Pop the current item off the stack.                    |
|     If a filter has been specified for the object:         |
|       Apply the filter (remove content).                   |
|                                                            |
|   If the token is the end of an array:                     |
|     Pop the current item off the stack.                    |
|                                                            |
|   Break the loop when the stack is empty                   |
|   or if the end-of-file is reached.                        |
|                                                            |
| Return the tree of tokens which represent the value.       |
| (Or NULL for end-of-file.)                                 |
|                                                            |
| Note:                                                      |
|   This does not check to ensure that an end-of- matches the|
|   start-of- which is popped off the stack.                 |
|                                                            |
|   Possibly, the filter could wait to be applied just before|
|   the tree is returned.                                    |
|                                                            |
|____________________________________________________________|
|                                                            |
| Get the next token (string).                               |
|                                                            |
| Peek the following (significant) character.                |
|                                                            |
| Is the following character a COLON?                        |
|   No : The token we just got is unnamed.                   |
|   Yes:                                                     |
|     The token we just read is the name of a value.         |
|     Discard the COLON.                                     |
|     Get the next token.                                    |
|                                                            |
| Return the (named or unnamed) token as a JSONitem.         |
| (Or NULL for end-of-file.)                                 |
|                                                            |
|____________________________________________________________|
|                                                            |
| Read the next character from the file and classify it as   |
| appropriate for the type of parse being performed:         |
| normal, delimiter, etc.                                    |
|                                                            |
| Is the character part of the current token?                |
|   No : Return the current token.                           |
|   Yes: Add it to the current token (StringBuilder).        |
|                                                            |
| Note: This handles QUOTEs and ESCAPEs, throws away         |
|       insignificant whitespace, and normalizes newlines.   |
|                                                            |
| This part of the parser is not JSON-specific; I also use   |
| it for CSV.                                                |
|                                                            |
|============================================================|
|                                                            |
| .net, TextReader for input file                            |
|                                                            |
|============================================================|
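The layers in the diagram can be roughly sketched in Python as below. This is only an illustration under assumptions, not the actual C# parser: the names `tokens`, `items`, `unquote` and `parse` are hypothetical, a regular expression stands in for the character-level layer, commas are simply skipped, escapes are not decoded, scalars are left as raw text, and filters are left out.

```python
import re

# Stand-in for the bottom layer: quoted strings, punctuation, or bare scalars.
_TOKEN = re.compile(r'\s*("(?:[^"\\]|\\.)*"|[{}\[\]:,]|[^\s{}\[\]:,"]+)')

def tokens(text):
    """Yield string tokens, dropping whitespace and commas."""
    pos = 0
    while True:
        m = _TOKEN.match(text, pos)
        if not m:
            return                    # end of input (or unlexable text)
        pos = m.end()
        if m.group(1) != ',':
            yield m.group(1)

def unquote(tok):
    """Strip surrounding quotes; escape sequences are not decoded here."""
    return tok[1:-1] if tok.startswith('"') else tok

def items(text):
    """Middle layer: yield (name, token), where name is None if unnamed."""
    it = tokens(text)
    tok = next(it, None)
    while tok is not None:
        name = None
        nxt = next(it, None)          # peek the following token
        if nxt == ':':                # the token we just read was a name
            name = unquote(tok)
            tok = next(it, None)      # discard the COLON, get the value token
            if tok is None:
                return                # end-of-file after a name
            nxt = next(it, None)
        yield name, tok
        tok = nxt

def parse(text):
    """Top layer: build a tree with a stack. Returns None at end-of-file."""
    root = []                         # sentinel parent for the top-level value
    stack = [root]
    for name, tok in items(text):
        if tok in ('}', ']'):
            stack.pop()               # like the diagram: no open/close match check
        else:
            node = {} if tok == '{' else [] if tok == '[' else unquote(tok)
            parent = stack[-1]
            if isinstance(parent, dict):
                parent[name] = node
            else:
                parent.append(node)   # names are ignored inside arrays
            if tok in ('{', '['):
                stack.append(node)
        if len(stack) <= 1:           # only the sentinel is left: value complete
            break
    return root[0] if root else None

print(parse('{"a": [1, "two", {"b": null}], "c": "x"}'))
```

The sentinel `root` list plays the part of the "stack is empty" test in the diagram: once only the sentinel remains, one complete value has been read.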
|
Ah, you use a stack. My pull parsers never have. It's a little faster not to. The only hangup is that without a stack it's possible to get away with '[ "foo":1 ]', because the parser only sees that a : follows the field name, not that it's sitting inside an array.
It's the one area where the latest parser of mine is not quite compliant. It *will* error on that, just not as soon as it should.
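As a rough illustration of why that case slips through (this is not how the library works, and the function names here are hypothetical), compare a check that only counts nesting depth with one that remembers which kind of container it is in:

```python
import re

# Minimal lexer for the demo: quoted strings, punctuation, or bare scalars.
_TOKEN = re.compile(r'\s*("(?:[^"\\]|\\.)*"|[{}\[\]:,]|[^\s{}\[\]:,"]+)')

def lex(text):
    return _TOKEN.findall(text)

def accepts_depth_only(toks):
    """No stack: tracks nesting depth, so objects and arrays look the same."""
    depth = 0
    for t in toks:
        if t in ('{', '['):
            depth += 1
        elif t in ('}', ']'):
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def accepts_with_stack(toks):
    """With a stack: knows the enclosing container, so it can reject name:value in an array."""
    closer_for = {'}': '{', ']': '['}
    stack = []
    for t in toks:
        if t in ('{', '['):
            stack.append(t)
        elif t in ('}', ']'):
            if not stack or stack.pop() != closer_for[t]:
                return False
        elif t == ':' and (not stack or stack[-1] != '{'):
            return False              # a field name is only legal directly inside an object
    return not stack

bad = lex('[ "foo":1 ]')
print(accepts_depth_only(bad))    # True
print(accepts_with_stack(bad))    # False
```

Neither function is a full validator. They only show the one piece of information, the kind of the enclosing container, that a depth counter throws away.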
Real programmers use butterflies
|