
Welcome to the Lounge

Re: Eureka! regex matching with stored procedures
honey the codewitch, 30-Oct-21 15:37
Yeah, that's poor man's tokenizing. To be honest, .NET's regex engine is crap at it, just because they didn't do the minor work necessary to implement the feature. It's a slight "hack", or rather a "twist", on a|b|c such that each expression a, b, and c has a symbol id associated with it, and the engine tells you which one it matched. It goes through the text from beginning to end, reporting all matches like that, one row in the table for each match.
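
To make the "twist" concrete, here is a minimal sketch in Python rather than .NET, since Python's `re` module happens to expose which alternative matched through `Match.lastgroup`. The symbol names, the ids, and the three sub-expressions are made up for illustration. Note that `re` is a backtracking engine that takes the first alternative that matches, not the longest, so this only approximates what a DFA-based lexer would do.

```python
import re

# Hypothetical symbol table: each alternative gets an integer id.
SYMBOLS = {"A": 0, "B": 1, "C": 2}

# The a|b|c alternation, with each alternative wrapped in a named group
# so the engine can report which one matched.
PATTERN = re.compile(r"(?P<A>[0-9]+)|(?P<B>[A-Za-z_]+)|(?P<C>[+*/-])")


def which_symbol(text, pos=0):
    """Return (symbol_id, matched_text) for the match at pos, or None."""
    m = PATTERN.match(text, pos)
    if m is None:
        return None
    # lastgroup is the name of the alternative that matched.
    return SYMBOLS[m.lastgroup], m.group()
```

Calling `which_symbol("foo")` reports the id of the second alternative along with the text it consumed, which is the one piece of information a plain a|b|c match does not hand you directly.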

Your input spec might look something like this (.rl format):
VerbatimStringLiteral= '@"([^"]|"")*"'
StringLiteral='"([^"]|\\.)*"'
CharacterLiteral= '[\']([^\']|\\.)([\'])'
IntegerLiteral= '(0x[0-9A-Fa-f]{1,16}|([0-9]+))([Uu][Ll]?|[Ll][Uu]?)?'
FloatLiteral= '(([0-9]+)(\.[0-9]+)?([Ee][+-]?[0-9]+)?[DdMmFf]?)|((\.[0-9]+)([Ee][+-]?[0-9]+)?[DdMmFf]?)'
// the following takes a long time to generate
//Keyword = 'abstract|as|base|bool|break|byte|case|catch|char|checked|class|const|continue|decimal|default|delegate|do|double|else|enum|event|explicit|extern|false|finally|fixed|float|for|foreach|goto|if|implicit|in|int|interface|internal|is|lock|long|namespace|new|null|object|operator|out|override|params|private|protected|public|readonly|ref|return|sbyte|sealed|short|sizeof|stackalloc|static|string|struct|switch|this|throw|true|try|typeof|uint|ulong|unchecked|unsafe|ushort|using|virtual|void|volatile|while'
Whitespace<hidden>='[\t\r\n\v\f ]+'
Identifier='[_[:IsLetter:]][_[:IsLetterOrDigit:]]*'
CommentBlock<id=40,blockEnd="*/">="/*"
//Bar="bar"


Forgive the word wrapping, but it's a line-based grammar.

So if you tokenize something by calling Tokenize, you get back a row for each match: which Symbol it was, where it was in the document, and its actual value (like CommentBlock at position 3, value "/* bar */"). A single input string can produce many of those rows.
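
A rough sketch of what that row-per-match output could look like, again in Python. This is not the actual Tokenize implementation: the rules are simplified hand translations of a few lines from the spec above, the `Token` type and the `#ERROR` symbol are assumptions, and the block-end handling for CommentBlock is faked with a non-greedy match.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    symbol: str    # which rule matched
    position: int  # character offset in the document
    value: str     # the matched text


# Simplified, assumed translations of a few rules from the spec.
RULES = [
    ("CommentBlock", r"/\*.*?\*/"),
    ("IntegerLiteral", r"[0-9]+"),
    ("Identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("Whitespace", r"[ \t\r\n\v\f]+"),
]
HIDDEN = {"Whitespace"}  # matched and skipped, like <hidden> in the spec
SCANNER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in RULES), re.DOTALL)


def tokenize(text: str) -> Iterator[Token]:
    """Yield one row per match, scanning the text from beginning to end."""
    pos = 0
    while pos < len(text):
        m = SCANNER.match(text, pos)
        if m is None:
            # No rule matches here: report one character as an error row.
            yield Token("#ERROR", pos, text[pos])
            pos += 1
            continue
        if m.lastgroup not in HIDDEN:
            yield Token(m.lastgroup, pos, m.group())
        pos = m.end()


for row in tokenize("a  /* bar */ 42"):
    print(row.symbol, row.position, repr(row.value))
```

For that input the rows are Identifier at 0, CommentBlock at 3 with value "/* bar */", and IntegerLiteral at 13. The whitespace runs are consumed but never reported.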
Real programmers use butterflies
