On IoT you don't have a lot of luxuries. You simply learn to do without.
Well, one area where I have to do without is HTML and XML well-formedness checking and validation.
That might be an issue where data interchange is concerned, but not so much where rendering HTML or XHTML content is concerned.
What do you do on an error? You fail. You can either stop, or continue to render, possibly displaying some bad content as a result, but that is still better than failing outright halfway through the parse because the document forgot a </b>. In fact, this is what commercial browsers do.
Here's the thing. If this is what you're doing, you don't need a DTD. You don't need an XSD Schema. You don't even need a heckin stack!
The result is much faster and lighter with a smaller binary footprint.
So why haven't I seen a pull reader with minimal validation/well-formedness checking in the open source pool?
You'd think such a beast would be incredibly useful for building web browsers - even tiny ones - especially tiny ones!
*cracks knuckles*
I shouldn't have to be writing this. It's one of those things that leaves me wondering why it doesn't exist already.
Real programmers use butterflies
|
|
|
|
|
honey the codewitch wrote: I shouldn't have to be writing this. It's one of those things that leaves me wondering why it doesn't exist already.
It was a task waiting for you!
And look, here you are!
|
|
|
|
|
Why? Because it is probably more complicated than it seems.
|
|
|
|
|
Remember who you are talking to. 90% of the stuff the witch does is over my head.
I’ve given up trying to be calm. However, I am open to feeling slightly less agitated.
|
|
|
|
|
edit/ Never mind - wrong thread.
Yes, you are probably very right
|
|
|
|
|
It's not really. I'm almost done with it.
Real programmers use butterflies
|
|
|
|
|
Yes, I was merely talking about normal mortals like us, not wizards
|
|
|
|
|
That'd be a witch. Wizards are a different thing altogether.
Real programmers use butterflies
|
|
|
|
|
I preprocess / strip out all the HTML and insert my own markup directives. I don't need a "paragraph" keyword to tell me where a paragraph should start or end; etc. But as you say, it's assumed to be valid HTML in the first place (or made to be).
It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it.
― Confucian Analects: Rules of Confucius about his food
|
|
|
|
|
As far as preprocessing goes, I don't have the memory or NVS storage to do that on my device, and it wouldn't really buy me much even if I did.
Much better, in my scenario at least, to just read with a pull parser in a loop, get tag names back, and set values on a context structure I use while rendering. The context structure has the current position and flags like "bold" or "italic", styles and font faces, that sort of thing.
I mean, if I knew at compile time what my HTML was going to be I'd just generate C++ code from it that renders it, and that sort of "preprocessing" would be a huge win, but in my scenario I have to read arbitrary HTML.
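For anyone following along, the context-structure idea above might look roughly like this. It is only a sketch: render_context, its fields, and apply_tag are made-up names, not the actual code from the reader, and the name/closing pair stands in for whatever the pull parser reports.

```cpp
#include <cstring>

// Hypothetical rendering state; the field names are illustrative only.
struct render_context {
    int x = 0;            // current pen position (column)
    int y = 0;            // current pen position (line)
    bool bold = false;
    bool italic = false;
};

// Apply one tag to the context. No tag stack is kept, so an unbalanced
// document (a missing </b>, say) just leaves a flag set instead of
// aborting the parse.
inline void apply_tag(render_context& ctx, const char* name, bool closing) {
    if (std::strcmp(name, "b") == 0 || std::strcmp(name, "strong") == 0) {
        ctx.bold = !closing;
    } else if (std::strcmp(name, "i") == 0 || std::strcmp(name, "em") == 0) {
        ctx.italic = !closing;
    } else if (std::strcmp(name, "br") == 0 && !closing) {
        ctx.x = 0;
        ctx.y += 1;       // one line down; a real renderer would use the font height
    }
    // Unknown tags are ignored rather than treated as errors.
}
```

The render loop would call apply_tag for each element the reader hands back and draw text using whatever flags are set at that moment.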
Real programmers use butterflies
|
|
|
|
|
honey the codewitch wrote: *cracks knuckles*
Ye-aaaaah!
Rock on Witchy Poo! Looking forward to the article(s)
|
|
|
|
|
Thanks for the vote of confidence. I'm getting there but it's a bit of a bear.
For starters, everything has to stream, because you don't have a ton of RAM. 1kB is a big deal, so I let you specify as little as 128 bytes for a buffer. I can't stream attribute and element names, but I can stream attribute values and element content, N bytes at a time (where N is whatever buffer size you chose).
So if you have a long attribute value, then inside your while(reader.read()) { ... } loop you'll get multiple reader.node_type()==ml_node_type::attribute_content results back before getting reader.node_type()==ml_node_type::attribute_end.
Not only are there a zillion HTML entities like &copy; (©), but I had to make a state machine to decode all of them efficiently off a Unicode stream.
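To give a flavor of what decoding entities one character at a time can look like, here is a minimal sketch. It is not the state machine from the reader: entity_matcher, decode_entity, and the six-entry table are assumptions made up for illustration, and the technique shown (narrowing a range in a sorted table) is just one simple way to do it.

```cpp
#include <cstddef>
#include <cstdint>

struct entity_entry { const char* name; std::uint32_t codepoint; };

// Tiny illustrative subset, sorted by name. The full HTML list is far longer.
static const entity_entry k_entities[] = {
    {"amp", 0x26}, {"copy", 0xA9}, {"gt", 0x3E},
    {"lt", 0x3C},  {"nbsp", 0xA0}, {"quot", 0x22},
};
static const std::size_t k_entity_count = sizeof(k_entities) / sizeof(k_entities[0]);

enum class entity_status { pending, matched, failed };

class entity_matcher {
    std::size_t m_lo = 0;               // first candidate still possible
    std::size_t m_hi = k_entity_count;  // one past the last candidate
    std::size_t m_pos = 0;              // characters consumed so far
    std::uint32_t m_codepoint = 0;
public:
    // Feed the characters that follow '&', one at a time.
    entity_status feed(char c) {
        if (c == '\0') return entity_status::failed;
        if (c == ';') {
            // The table is sorted, so an exact match of length m_pos,
            // if there is one, is the first remaining candidate.
            if (m_pos > 0 && m_lo < m_hi && k_entities[m_lo].name[m_pos] == '\0') {
                m_codepoint = k_entities[m_lo].codepoint;
                return entity_status::matched;
            }
            return entity_status::failed;
        }
        // Narrow the range to names whose next character is c.
        while (m_lo < m_hi && k_entities[m_lo].name[m_pos] != c) ++m_lo;
        std::size_t end = m_lo;
        while (end < m_hi && k_entities[end].name[m_pos] == c) ++end;
        m_hi = end;
        ++m_pos;
        return m_lo < m_hi ? entity_status::pending : entity_status::failed;
    }
    std::uint32_t codepoint() const { return m_codepoint; }
};

// Convenience wrapper: decode the text after '&', for example "copy;".
inline bool decode_entity(const char* s, std::uint32_t& out) {
    entity_matcher m;
    for (; *s; ++s) {
        const entity_status st = m.feed(*s);
        if (st == entity_status::matched) { out = m.codepoint(); return true; }
        if (st == entity_status::failed) return false;
    }
    return false;   // ran out of input before ';'
}
```

The matcher holds only three indices of state, so it needs no buffer for the entity name, which is the property that matters when the input arrives as a stream.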
Also this:
<span class="foo">this is valid</span>
<span class='foo'>and this</span>
<input disabled><!--
<div class=you_thought_this_would_be easy id=but_no_because_html_hates_you_this_is_also_valid></div>
<br /><!--
So this is kinda rough sometimes, but I'm making progress.
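The attribute forms in that snippet (double-quoted, single-quoted, unquoted, and bare) can all be handled by one small routine. The following is a sketch under assumptions, not the reader's code: attribute and parse_attribute are hypothetical names, and it works on an in-memory std::string for brevity where a constrained device would stream instead.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

struct attribute {
    std::string name;
    std::string value;
    bool has_value = false;   // false for bare attributes such as "disabled"
};

// Parses one attribute starting at s[pos] (the text after the tag name)
// and advances pos past it. Returns false if no attribute starts there.
inline bool parse_attribute(const std::string& s, std::size_t& pos, attribute& out) {
    auto is_space = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    };
    while (pos < s.size() && is_space(s[pos])) ++pos;
    std::size_t start = pos;
    while (pos < s.size() && !is_space(s[pos]) &&
           s[pos] != '=' && s[pos] != '>' && s[pos] != '/') ++pos;
    if (pos == start) return false;
    out.name = s.substr(start, pos - start);
    out.value.clear();
    out.has_value = false;

    // Peek past whitespace for '='. Without one, this is a bare attribute.
    std::size_t look = pos;
    while (look < s.size() && is_space(s[look])) ++look;
    if (look >= s.size() || s[look] != '=') return true;
    pos = look + 1;
    while (pos < s.size() && is_space(s[pos])) ++pos;
    out.has_value = true;

    if (pos < s.size() && (s[pos] == '"' || s[pos] == '\'')) {
        const char quote = s[pos++];
        start = pos;
        while (pos < s.size() && s[pos] != quote) ++pos;
        out.value = s.substr(start, pos - start);
        if (pos < s.size()) ++pos;   // skip the closing quote; tolerate a missing one
    } else {
        // Unquoted value: runs until whitespace or the end of the tag.
        start = pos;
        while (pos < s.size() && !is_space(s[pos]) && s[pos] != '>') ++pos;
        out.value = s.substr(start, pos - start);
    }
    return true;
}
```

Calling it repeatedly on the text after a tag name yields each attribute in turn, and it returns false when it reaches '>' or '/'.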
Fortunately I don't have to care about custom entity references, namespace declarations, or even well-formedness (balanced tags, etc.), which makes some of it pretty easy.
Real programmers use butterflies
|
|
|
|
|
You bugger. The bit of my brain I need to use to make an intelligent response is entirely consumed with the act of killing myself laughing at your code-snippet. Fan-friggen-tastic
|
|
|
|
|
I've got it running. I even wrote most of the article, but I should probably test it more.
Real programmers use butterflies
|
|
|
|
|
Nice work..
|
|
|
|
|
|
Let me know when it's sharks, not mozzies ...
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
"Common sense is so rare these days, it should be classified as a super power" - Random T-shirt
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
From what I understand that's a Tuesday in Wisconsin.
I’ve given up trying to be calm. However, I am open to feeling slightly less agitated.
|
|
|
|
|
It didn't sound like I thought it would..
|
|
|
|
|
Call me when it (literally) rains cats and dogs.
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|
|
|
|
|
Well - if you read the comments of cat owners, you're halfway there. Just a small typo in your spelling:
For cats, it's reigns and has always been so.
Ravings en masse
"The difference between genius and stupidity is that genius has its limits." - Albert Einstein
"If you are searching for perfection in others, then you seek disappointment. If you seek perfection in yourself, then you will find failure." - Balboos HaGadol Mar 2010
|
|
|
|
|
I've been working on some software that I had planned to release open source under the MIT license.
It's an e-pub reader. The thing is, as I was working on it, I ended up, out of curiosity, looking up how the original Nook e-reader did its magic.
Come to find out, as I understand it anyway, it's basically using a smartphone(ish) back end, like an ARM Cortex-A with 256MB of RAM, and running a modified Android OS.
My code *should* run on 512kB of RAM on a much lower end (read: cheaper and less power hungry) CPU/SoC. 1GHz+ ARM Cortex-A vs my 240MHz Tensilica Xtensa (although my software should run on other systems as well). I haven't finished the code yet, so I can't be 100% certain, but worst case I need a WROVER with an extra 4MB of PSRAM.
Well, this presents me with an opportunity, according to some people I work with who do sales.
I could potentially approach a larger company that makes e-readers with this stuff because my software will run on as little as $30 retail worth of hardware (small e-paper screen included) and the battery can be smaller/last longer meaning the device can be significantly shrunk.
But if I'm to do that, I don't want to release my e-pub reader to the public and thus lose some of my negotiating leverage.
I've been thinking about releasing the component pieces. Indeed I already have by way of GFX and my Zip library - I'm working on a markup reader right now that I may release.
But the paranoid part of me is worried doing that may also decrease my leverage because someone else will put the pieces together into an e-pub reader.
What would you do?
Real programmers use butterflies
|
|
|
|
|
Try it... and then when you fail to make any significant money because whoever you approach tries to screw you out of everything... make it open source!
- I would love to change the world, but they won’t give me the source code.
|
|
|
|
|
I intend to remain optimistic.
Real programmers use butterflies
|
|
|
|
|
Start downloading from your articles, and file off the serial numbers before I start to sell "my" new reader?
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
"Common sense is so rare these days, it should be classified as a super power" - Random T-shirt
AntiTwitter: @DalekDave is now a follower!