|
I have no knowledge of parser types or how parsers work, but whenever I have to read input (for example, reading structs from a file) and I find an erroneous token/word/data, I put it in a list along with the position in the input where it was found, move forward one byte, and try again until I get valid input.
Basically:
1- test token/word/data
2- if valid goto 3 else goto 5
3- process
4- if there is more token/word/data move to next token/word/data and goto 1 else goto 7
5- if invalid put token/word/data in list with position
6- if there is more input move forward one byte (not one token/word/data) and goto 1 else goto 7
7- process invalid token/word/data list (here you can merge them into blocks of successive errors)
8- end program and report errors (if any)
Can't something like this work? The invalid token/word/data are just ignored.
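For what it's worth, the steps above can be sketched roughly like this in Python. It's only an illustration under assumptions: `read_records`, `ErrorSpan` and the toy `parse_two_digit` record format (two digits followed by `;`) are made-up names, not anything from a real parser, and `parse_at` is assumed to always consume at least one byte when it succeeds.

```python
from dataclasses import dataclass


@dataclass
class ErrorSpan:
    position: int  # offset of the first invalid byte
    text: bytes    # the run of invalid bytes starting there


def read_records(data, parse_at):
    """Steps 1-7: parse what is valid, collect what is not.

    parse_at(data, pos) is a hypothetical callback returning (value, length)
    for a valid record at pos, or None when the input there is invalid.
    """
    values, errors = [], []
    pos = 0
    while pos < len(data):
        parsed = parse_at(data, pos)
        if parsed is not None:
            value, length = parsed
            values.append(value)   # step 3: process
            pos += length          # step 4: move to the next record
        else:
            bad = data[pos:pos + 1]
            # steps 5 and 7: record the byte, merging runs of successive errors
            if errors and errors[-1].position + len(errors[-1].text) == pos:
                errors[-1].text += bad
            else:
                errors.append(ErrorSpan(pos, bad))
            pos += 1               # step 6: move forward one byte only
    return values, errors


def parse_two_digit(data, pos):
    """Toy format (an assumption for the demo): two ASCII digits then ';'."""
    chunk = data[pos:pos + 3]
    if len(chunk) == 3 and chunk[:2].isdigit() and chunk[2:] == b";":
        return int(chunk[:2]), 3
    return None
```

Calling `read_records(b"12;34;x?56;7;89;", parse_two_digit)` keeps the four good records and reports two merged error blocks with their positions, which is step 8.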
|
It can, sort of. I considered keeping an extra StringBuilder around in the C# rendition, but that's less workable in SQL. It can still be done; it just makes the code a lot nastier.
It would actually be easier at the parser level to concatenate error tokens coming in off the lexer, and in fact that's probably what I'll do. I'm sick of this.
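Roughly what I mean by concatenating error tokens, as a Python sketch rather than the real thing. The `Token` shape and the `ERROR = -1` symbol id are assumptions for illustration, not the actual lexer output format.

```python
from typing import Iterable, Iterator, NamedTuple

ERROR = -1  # hypothetical symbol id the lexer uses for unrecognized input


class Token(NamedTuple):
    symbol: int
    position: int
    text: str


def merge_error_tokens(tokens: Iterable[Token]) -> Iterator[Token]:
    """Join adjacent error tokens into one before the parser sees them."""
    pending = None  # the error token currently being accumulated, if any
    for tok in tokens:
        if tok.symbol == ERROR:
            if pending is not None and pending.position + len(pending.text) == tok.position:
                # directly follows the pending error: extend it
                pending = pending._replace(text=pending.text + tok.text)
                continue
            if pending is not None:
                yield pending
            pending = tok
        else:
            if pending is not None:
                yield pending
                pending = None
            yield tok
    if pending is not None:
        yield pending
```

The parser then gets one error token per bad run instead of one per character, and the lexer itself stays untouched.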
Real programmers use butterflies
|
Error handling separates the men from the boys, pardon the expression. So many times I've heard, "but that will never fail," and, ashamedly, I've thought the same myself. It's a lie from the pit of hell. It *will* fail, and it will bite your weekend in the butt if you don't handle it.
Defensive programming is part of the art.
In embedded systems, it gets even worse. At every single failure point you have to ask yourself, "how do I keep running?" It takes a completely different mindset and design approach. A guy I work with has schooled me to the point of embarrassment, because his approach is obviously better.
Charlie Gilley
Stuck in a dysfunctional matrix from which I must escape...
"Where liberty dwells, there is my country." B. Franklin, 1783
“They who can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.” BF, 1759
|
With IoT it has been robbing Peter to pay Paul, such that I try to make my code fail, but fail nicely.
I don't have the cycles or the memory to do full error checking in many cases, so what you're left with is a situation where, if you can, you want the machine to reboot straight to the landing page after dumping a log. Obviously it shouldn't do that in production, but in situations like this it's always a matter of "if the day ever comes when it does."
I can't harden an IoT device against network attacks, for example, because I don't have the cycles to do overrun checks and well-formedness checks even on my HTTP headers. I parse just enough to make it work, which means a malformed header could easily be eaten by my device.
Such is the lay of the land, and it *is* a different error handling ballgame. It always makes me feel a little dirty.
That's not to say it's like industrial embedded, where the opposite is true: despite working on hardware that is "just good enough," the code has to be solid rather than fancy. With IoT, the reverse is typically true.
|
honey the codewitch wrote: The thing is it seems so bloody simple, but every similarly simple approach I've taken with it has fallen flat on its face.
I feel this to my core. Off and on for a while I've been working on a project that's the same way. It seems so simple, and yet every approach runs into a tiny little problem that completely invalidates the entire approach. I think I may have finally cracked the code last night, though. Either that or I'll have another design that was infinitesimally close but fell on its face at the last hurdle.
I'm sure if you keep at it you'll find your answer.
|
I think I found my solution. I hope you find yours.
Whenever I run into a situation like that - where it feels like trying to push the air bubbles out of a waterbed - I strongly consider the possibility that what I'm doing is an "anti-pattern" and look to replan my approach. That helps some.
Good luck.
|
I work on automation systems and error handling is immensely important with them. Especially where I am right now, bringing up a new machine for the first time with a whole bunch of new software. It's been quite a battle and we are winning the war.
"They have a consciousness, they have a life, they have a soul! Damn you! Let the rabbits wear glasses! Save our brothers! Can I get an amen?"
|
Absolutely. That's a whole different ball of wax. Even the software development takes on more of a flavor of hardware engineering in terms of the rigor involved.
It's not my favorite. I prefer to use tools to produce as much of that code as possible, to shrink my test matrix.
|
Two things:
1) Read the stuff at the top of the page: the Lounge is not for coding questions. Post it here instead: Ask a Question
Ignoring the rules and annoying people you want free help from is not a good idea ...
2) While we are more than willing to help those that are stuck, that doesn't mean that we are here to do it all for you! We can't do all the work: you are either getting paid for this, or it's part of your grades, and it wouldn't be at all fair for us to do it all for you.
So we need you to do the work, and we will help you when you get stuck. That doesn't mean we will give you a step by step solution you can hand in!
Start by explaining where you are at the moment, and what the next step in the process is. Then tell us what you have tried to get that next step working, and what happened when you did.
Just posting your homework and expecting us to give you code you can hand in as your own isn't going to work.
If you are having problems getting started at all, then this may help: How to Write Code to Solve a Problem, A Beginner's Guide
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
"Common sense is so rare these days, it should be classified as a super power" - Random T-shirt
AntiTwitter: @DalekDave is now a follower!
|
When I saw this post earlier my instant thought was 'Ah - a homework question', and got the popcorn ready for a flame-throwing session - but then you come back all polite and reasonable, spoiling all the fun!
|
He has many of those as a copy+paste solution, just to keep it polite and reasonable by copy+pasting it.
If he had to write it down every time... it would sound different.
M.D.V.
If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about?
Help me to understand what I'm saying, and I'll explain it better to you
Rating helpful answers is nice, but saying thanks can be even nicer.
|
That's why I use boilerplate ... I try to be polite, but sometimes the vitriol is strong ...
|
OriginalGriff wrote: the vitriol is strong ...
... in this one.
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|
I do the same thing with my code, for similar reasons.
|
Best part is, it really catches them off-guard, and if they want to follow up with a legitimate question, they actually have to put the work in. Win-win.
[Edit]
Wow. This got downvoted?
Someone took offense.
modified 2-Nov-21 16:46pm.
|
You are soooo polite!
Get me coffee and no one gets hurt!
|
Unfortunately, if you read some of the posts first, and not the "message", you get the impression it may be for programming too. Maybe the message should add "unless you're a fixture".
It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it.
― Confucian Analects: Rules of Confucius about his food
|
That sounds very hard. I'm not sure it's possible at all unless you use AI, an array of the fastest computers on Earth AND power the whole solution with a private nuclear plant...
Anything that is unrelated to elephants is irrelephant Anonymous
- The problem with quotes on the internet is that you can never tell if they're genuine Winston Churchill, 1944
- Never argue with a fool. Onlookers may not be able to tell the difference. Mark Twain
|
Tsk, tsk!
This is the modern age: you have to use Green Energy from renewable sources or your ex-mates will ritually disembowel your iPhone*!
* Manufactured in a Chinese factory powered by coal plants using slave labour, from non-recyclable plastics, and shipped to you in diesel powered cargo ships and bought at an inflated price to replace the one the manufacturer slowed down "to save your battery life" just before the new one launched.
|
If you need help with that and you're an adult, then you need to rethink your career.
If you're not an adult, then yes, that's where you start. Show us how far you got with the assignment.
Bastard Programmer from Hell
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
Holy state machines, Batman!
The SQL code to make this work is ridic.
It took me hours, including relearning some lesser-used SQL commands. No weirdness like .NET in the DB.
Calling this
EXEC [dbo].[Example_MatchCommentBlock] @value= N'foo /* test //* bar /* baz *//* */ fubar'
Yields this resultset (positions are 0-based, unlike most DB string pos functions):
Position  Value                       Length
5         /* test //* bar /* baz */   27
32        /* */                       5
It works using SELECTs over state machine tables. I'm about to make the tokenizer rendition, and then finally, the ones that don't use tables, but are just SQL procs.
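For anyone wondering what "state machine tables" boils down to, here is a much simplified Python sketch of the idea. It is not the generated procedure: the table layout (`state, min, max, to_state` rows), the state numbers and every function name here are assumptions for illustration. One table recognizes the block start `/*`, a second recognizes the block end `*/`, and a "lookup" is just a scan for the row whose range contains the current code point, which is what the SELECT does.

```python
# Rows are (state, min_codepoint, max_codepoint, to_state), mimicking table rows.
START_ROWS = [(0, ord("/"), ord("/"), 1), (1, ord("*"), ord("*"), 2)]
END_ROWS = [(0, ord("*"), ord("*"), 1), (1, ord("/"), ord("/"), 2)]
ACCEPTING = {2}  # in both toy machines, state 2 accepts


def step(rows, state, cp):
    """Stand-in for the SELECT: find the transition for (state, cp), or -1."""
    for s, lo, hi, to in rows:
        if s == state and lo <= cp <= hi:
            return to
    return -1


def run(rows, text, i):
    """Run one machine from index i; return the end index on accept, else -1."""
    state = 0
    while i < len(text):
        nxt = step(rows, state, ord(text[i]))
        if nxt == -1:
            break
        state = nxt
        i += 1
    return i if state in ACCEPTING else -1


def match_comment_blocks(text):
    """Return (position, value, length) for each terminated /* ... */ block."""
    results = []
    i = 0
    while i < len(text):
        body = run(START_ROWS, text, i)
        if body == -1:
            i += 1  # no block starts here: move on one character
            continue
        j = body
        while j < len(text):
            end = run(END_ROWS, text, j)
            if end != -1:
                results.append((i, text[i:end], end - i))
                j = end
                break
            j += 1  # not the block end yet: keep scanning
        i = j       # an unterminated block runs to end of input and is dropped
    return results
```

Unlike the SQL, this version retries the block-end machine at every character, so treat it as a picture of the table-driven approach and not a port.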
Hooaah! *cracks knuckles*
Triple tier validation generation coming soon. wowza. I'm on fire today.
|
Are you going to post the SQL so we can see how you did it?
|
Keep in mind this was generated by a tool, so the table names would be different for a different input specification file. The trick with this routine is filling the state tables properly.
The reason it's so nasty is that fetching the next UTF-32 codepoint in SQL is a pain in my backside.
CREATE PROCEDURE [dbo].[Example_Match] @value NVARCHAR(MAX), @symbolId INT
AS
BEGIN
    -- NVARCHAR is UTF-16: DATALENGTH is in bytes, so halve it to count code units
    DECLARE @valueEnd INT = DATALENGTH(@value)/2+1
    DECLARE @index INT = 1
    DECLARE @ch BIGINT
    DECLARE @ch1 NCHAR
    DECLARE @ch2 NCHAR
    DECLARE @tch BIGINT
    DECLARE @state INT = 0
    DECLARE @toState INT = -1
    DECLARE @accept INT = -1
    DECLARE @position BIGINT = 0
    DECLARE @capture NVARCHAR(MAX)
    DECLARE @blockEndId INT = -1
    DECLARE @result INT = 0
    DECLARE @len INT = 0
    DECLARE @done INT = 0
    CREATE TABLE #Results (
        [Position] BIGINT NOT NULL,
        [Value] NVARCHAR(MAX) NOT NULL,
        [Length] INT NOT NULL
    )
    -- fetch the first UTF-32 code point (-1 = end of input)
    IF @index >= @valueEnd
    BEGIN
        SET @ch = -1
    END
    ELSE
    BEGIN
        SET @ch1 = SUBSTRING(@value,@index,1)
        SET @ch = UNICODE(@ch1)
        -- a surrogate lies in 0xD800..0xDFFF, i.e. 0 <= @ch - 0xD800 < 2048
        SET @tch = @ch - 0xd800
        IF @tch < 0 SET @tch = @tch + 2147483648
        IF @tch < 2048
        BEGIN
            -- combine the surrogate pair: hi * 1024 + lo - 0x35FDC00
            SET @ch = @ch * 1024
            SET @index = @index + 1
            IF @index >= @valueEnd RETURN -1
            SET @ch2 = SUBSTRING(@value,@index,1);
            SET @ch = @ch + UNICODE(@ch2) - 0x35fdc00
        END
    END
    WHILE @ch <> -1
    BEGIN
        SET @capture = N''
        SET @position = @index - 1
        SET @done = 0
        -- restart the state machine for each match attempt
        SET @state = 0
        -- run the main state machine (BlockEndId = -1) as far as it will go
        WHILE @done = 0
        BEGIN
            SET @done = 1
            SET @toState = -1
            SELECT @toState = [dbo].[ExampleStateTransition].[ToStateId]
            FROM [dbo].[ExampleState]
            INNER JOIN [dbo].[ExampleStateTransition]
                ON [dbo].[ExampleState].[StateId]=[dbo].[ExampleStateTransition].[StateId]
                AND [dbo].[ExampleState].[SymbolId]=[dbo].[ExampleStateTransition].[SymbolId]
                AND [dbo].[ExampleStateTransition].[BlockEndId]=-1
            WHERE [dbo].[ExampleState].[SymbolId]=@symbolId
                AND [dbo].[ExampleState].[StateId]=@state
                AND [dbo].[ExampleState].[BlockEndId] = -1
                AND @ch BETWEEN [dbo].[ExampleStateTransition].[Min] AND [dbo].[ExampleStateTransition].[Max]
            IF @toState <> -1
            BEGIN
                SET @done = 0
                SET @state = @toState;
                SET @capture = @capture + @ch1
                IF @tch < 2048 SET @capture = @capture + @ch2
                SET @index = @index + 1
                IF @index >= @valueEnd
                BEGIN
                    SET @ch = -1
                    SET @done = 1
                END
                ELSE
                BEGIN
                    SET @ch1 = SUBSTRING(@value,@index,1)
                    SET @ch = UNICODE(@ch1)
                    SET @tch = @ch - 0xd800
                    IF @tch < 0 SET @tch = @tch + 2147483648
                    IF @tch < 2048
                    BEGIN
                        SET @ch = @ch * 1024
                        SET @index = @index + 1
                        IF @index >= @valueEnd RETURN -1
                        SET @ch2 = SUBSTRING(@value,@index,1);
                        SET @ch = @ch + UNICODE(@ch2) - 0x35fdc00
                    END
                END
            END
        END
        -- did the machine stop in an accepting state?
        SET @accept = -1
        SELECT @accept = [dbo].[ExampleState].[SymbolId]
        FROM [dbo].[ExampleState]
        WHERE [dbo].[ExampleState].[SymbolId] = @symbolId
            AND [dbo].[ExampleState].[StateId] = @state
            AND [dbo].[ExampleState].[BlockEndId] = -1
            AND [dbo].[ExampleState].[Accepts]=1
        IF @accept <> -1
        BEGIN
            -- does this symbol have a block end (e.g. the closing */ of a comment)?
            SELECT TOP 1 @blockEndId = [dbo].[ExampleState].[BlockEndId]
            FROM [dbo].[ExampleState]
            WHERE [dbo].[ExampleState].[SymbolId]=@symbolId
                AND [dbo].[ExampleState].[BlockEndId] <> -1
            IF @blockEndId <> -1
            BEGIN
                SET @result = 0
                SET @state = 0
                -- scan forward until the block end state machine accepts
                WHILE @ch <> -1
                BEGIN
                    SET @done = 0
                    WHILE @done = 0
                    BEGIN
                        SET @done = 1
                        SET @toState = -1
                        SELECT @toState = [dbo].[ExampleStateTransition].[ToStateId]
                        FROM [dbo].[ExampleState]
                        INNER JOIN [dbo].[ExampleStateTransition]
                            ON [dbo].[ExampleState].[StateId]=[dbo].[ExampleStateTransition].[StateId]
                            AND [dbo].[ExampleState].[SymbolId]=[dbo].[ExampleStateTransition].[SymbolId]
                            AND [dbo].[ExampleStateTransition].[BlockEndId]=@blockEndId
                        WHERE [dbo].[ExampleState].[SymbolId]=@symbolId
                            AND [dbo].[ExampleState].[StateId]=@state
                            AND [dbo].[ExampleState].[BlockEndId] = @blockEndId
                            AND @ch BETWEEN [dbo].[ExampleStateTransition].[Min] AND [dbo].[ExampleStateTransition].[Max]
                        IF @toState <> -1
                        BEGIN
                            SET @done = 0
                            SET @state = @toState
                            SET @capture = @capture + @ch1
                            IF @tch < 2048 SET @capture = @capture + @ch2
                            SET @index = @index + 1
                            IF @index >= @valueEnd
                            BEGIN
                                SET @ch = -1
                                SET @done = 1
                            END
                            ELSE
                            BEGIN
                                SET @ch1 = SUBSTRING(@value,@index,1)
                                SET @ch = UNICODE(@ch1)
                                SET @tch = @ch - 0xd800
                                IF @tch < 0 SET @tch = @tch + 2147483648
                                IF @tch < 2048
                                BEGIN
                                    SET @ch = @ch * 1024
                                    SET @index = @index + 1
                                    IF @index >= @valueEnd RETURN -1
                                    SET @ch2 = SUBSTRING(@value,@index,1);
                                    SET @ch = @ch + UNICODE(@ch2) - 0x35fdc00
                                END
                            END
                        END
                    END
                    SET @accept = -1
                    SELECT @accept = [dbo].[ExampleState].[SymbolId]
                    FROM [dbo].[ExampleState]
                    WHERE [dbo].[ExampleState].[SymbolId] = @symbolId
                        AND [dbo].[ExampleState].[StateId] = @state
                        AND [dbo].[ExampleState].[BlockEndId] = @blockEndId
                        AND [dbo].[ExampleState].[Accepts]=1
                    IF @accept <> -1
                    BEGIN
                        -- block end found: emit the whole block
                        INSERT INTO #Results SELECT @position AS [Position], @capture AS [Value], DATALENGTH(@capture)/2 as [Length]
                        SET @state = 0
                        BREAK
                    END
                    ELSE
                    BEGIN
                        -- not the block end yet: keep the current character and advance
                        SET @capture = @capture + @ch1
                        IF @tch < 2048 SET @capture = @capture + @ch2
                        SET @index = @index + 1
                        IF @index >= @valueEnd
                        BEGIN
                            SET @ch = -1
                            SET @done = 1
                        END
                        ELSE
                        BEGIN
                            SET @ch1 = SUBSTRING(@value,@index,1)
                            SET @ch = UNICODE(@ch1)
                            SET @tch = @ch - 0xd800
                            IF @tch < 0 SET @tch = @tch + 2147483648
                            IF @tch < 2048
                            BEGIN
                                SET @ch = @ch * 1024
                                SET @index = @index + 1
                                IF @index >= @valueEnd RETURN -1
                                SET @ch2 = SUBSTRING(@value,@index,1);
                                SET @ch = @ch + UNICODE(@ch2) - 0x35fdc00
                            END
                        END
                    END
                    SET @state = 0
                END
                SET @state = 0
                CONTINUE
            END
            ELSE
            BEGIN
                -- no block end for this symbol: emit the match as-is
                SET @len = DATALENGTH(@capture)/2
                IF(@len>0) INSERT INTO #Results SELECT @position AS [Position], @capture AS [Value], @len as [Length]
            END
        END
        -- advance to the next code point
        SET @index = @index + 1
        IF @index >= @valueEnd
        BEGIN
            SET @ch = -1
            SET @done = 1
        END
        ELSE
        BEGIN
            SET @ch1 = SUBSTRING(@value,@index,1)
            SET @ch = UNICODE(@ch1)
            SET @tch = @ch - 0xd800
            IF @tch < 0 SET @tch = @tch + 2147483648
            IF @tch < 2048
            BEGIN
                SET @ch = @ch * 1024
                SET @index = @index + 1
                IF @index >= @valueEnd RETURN -1
                SET @ch2 = SUBSTRING(@value,@index,1);
                SET @ch = @ch + UNICODE(@ch2) - 0x35fdc00
            END
        END
    END
    SELECT * FROM #Results
    DROP TABLE #Results
END
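Since the "fetch the next UTF-32 codepoint" part is the piece repeated all over that proc, here is the same arithmetic on its own as a small Python sketch. The helper names `utf16_units` and `next_codepoint` are made up for the example. The constant falls out of the standard surrogate formula: (hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000 simplifies to hi * 1024 + lo - 0x35FDC00. This sketch only treats high surrogates (0xD800..0xDBFF) as the start of a pair, which is slightly stricter than the `@tch < 2048` test in the SQL.

```python
def utf16_units(text):
    """Split a string into UTF-16 code units, like indexing an NVARCHAR."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def next_codepoint(units, index):
    """Return (codepoint, next_index); the code point is -1 at end of input."""
    if index >= len(units):
        return -1, index
    ch = units[index]
    index += 1
    if 0xD800 <= ch <= 0xDBFF:  # high surrogate: a low surrogate must follow
        if index >= len(units):
            raise ValueError("truncated surrogate pair")
        ch = ch * 1024 + units[index] - 0x35FDC00
        index += 1
    return ch, index


def codepoints(text):
    """Decode a whole string by repeatedly fetching the next code point."""
    units = utf16_units(text)
    out, i = [], 0
    while True:
        cp, i = next_codepoint(units, i)
        if cp == -1:
            return out
        out.append(cp)
```

In a language with a real UTF-16 or UTF-32 iterator this is a non-issue; it only gets ugly when you have to do it inline, one SUBSTRING at a time.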
|