Can I use a regex to validate a command with optional arguments that may appear in any order? For example, if a func has 6 possible arguments named arg1 to arg6:

"func arg4 arg6 arg2" or just "func" should be valid, while "func arg1 arg7 arg2" is not.

I can't use split, as it is possible that one or more of the arguments actually contain spaces.

Cheers
Comments
Garth J Lancaster 4-Dec-15 20:09pm    
I had some thoughts about this, but I'm a bit confused about your 'named arguments'. Do you mean that in reality you would have

func arg1 thisisarg1svalue arg2 thisisarg2s value

(note the space in the value for arg2)

I would be more inclined to use -arg[1..6], i.e. -arg1 -arg2 ..., and specify that argument values containing spaces must be enclosed in "", or else strange things will happen (tm)

that means

func -arg1 thisisarg1svalue -arg2 "thisisarg2s value"

or is this not what you mean ?
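
A minimal sketch of the labelled form described above, in case it helps. The command name func and the labels -arg1 to -arg6 come from this thread; the class and method names are made up for illustration, and the rule that an unquoted value may not start with - or " is an assumption rather than anything the real command processor is known to require.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any existing library.
public static class LabelledArgs
{
    // "func", then any number of "-argN value" pairs, where the value is
    // either double-quoted (may contain spaces) or a single bare word.
    private static readonly Regex Whole = new Regex(
        @"^func(?:\s+-(?<name>arg[1-6])\s+(?:""(?<value>[^""]*)""|(?<value>[^\s""-]\S*)))*\s*$",
        RegexOptions.IgnoreCase);

    // Returns false when the shape is wrong or a label appears twice;
    // otherwise fills 'values' with label -> value.
    public static bool TryParse(string command, out Dictionary<string, string> values)
    {
        values = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        Match m = Whole.Match(command);
        if (!m.Success) return false;

        // Each repetition of the group captures one name and one value,
        // so the two capture lists line up by index.
        CaptureCollection names = m.Groups["name"].Captures;
        CaptureCollection vals = m.Groups["value"].Captures;
        for (int i = 0; i < names.Count; i++)
        {
            if (values.ContainsKey(names[i].Value)) return false; // label repeated
            values.Add(names[i].Value, vals[i].Value);
        }
        return true;
    }
}
```

With this sketch, a bare "func" is accepted with no values, the labels may come in any order, and a label outside arg1 to arg6 is rejected.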
Midi_Mick 4-Dec-15 20:20pm    
No - the arg1 to arg6 stand alone (i.e. not labels). so the command could be something like:
"process uppercase remove punctuation", where "uppercase" is one valid argument, and "remove punctuation" is another valid argument.
Garth J Lancaster 4-Dec-15 20:38pm    
I probably could have done it the 'label' way with a regex - I doubt a regex would handle 'your way' unless you could code all the valid combinations up front, i.e.

uppercase
remove punctuation

I'd be more inclined to gather all the args in an array and process them in a class that knows that 'remove' must be followed by 'punctuation'

sorry
Midi_Mick 4-Dec-15 21:07pm    
That's also my line of thinking... a pain, but so be it. It gets more complicated insofar as uppercase could be lowercase, but not both, and remove could have different things (not just punctuation) associated with it. I figured, though, that if I could just get the "any order, optional" bit sorted, I could handle the rest.
Garth J Lancaster 4-Dec-15 22:20pm    
oh I get you entirely about pain - but, food for thought - build a class that's configurable maybe through adding 'verbs' and 'qualifier' rules/matches

for example, uppercase & lowercase are 'stand alone' verbs with no qualifiers, 'remove' is a verb that expects 'punctuation' or (trying to think of another one to fit) 'whitespace' as a qualifier

then you have

verb -> qualifiers
----------------------------------
uppercase
lowercase
remove punctuation | whitespace

and you could specify the verbs and qualifiers using a DSL/Fluent approach

OptionsValidator.validateRules.Add("uppercase")
OptionsValidator.validateRules.Add("lowercase")
OptionsValidator.validateRules.Add("remove").WithQualifiers(or("punctuation", "whitespace"))

and when it comes time to check if the input is valid

bool optionsAreValid = OptionsValidator.validate();

not the easy approach you first envisaged, but flexible, dynamic, cool
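
A rough sketch of how such a verb/qualifier table might look in practice. The class shape and method names here are hypothetical (they only loosely follow the pseudocode above), and the input is assumed to be just the options part of the command, without the command name. It does not handle the "uppercase or lowercase but not both" rule.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rule table: each verb maps to the qualifiers it accepts.
// An empty set means the verb stands alone.
public sealed class OptionsValidator
{
    private readonly Dictionary<string, HashSet<string>> rules =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Register a verb and, optionally, the qualifiers that may follow it.
    public OptionsValidator Add(string verb, params string[] qualifiers)
    {
        rules[verb] = new HashSet<string>(qualifiers, StringComparer.OrdinalIgnoreCase);
        return this;
    }

    // True when every token is a known verb (followed by one of its
    // qualifiers where the verb has any) and no verb is used twice.
    public bool Validate(string options)
    {
        // A null separator array splits on any whitespace.
        string[] tokens = options.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        for (int i = 0; i < tokens.Length; i++)
        {
            HashSet<string> qualifiers;
            if (!rules.TryGetValue(tokens[i], out qualifiers)) return false; // unknown verb
            if (!seen.Add(tokens[i])) return false;                          // verb repeated
            if (qualifiers.Count == 0) continue;                             // stand-alone verb

            // The next token must be one of this verb's qualifiers.
            if (++i >= tokens.Length || !qualifiers.Contains(tokens[i])) return false;
        }
        return true;
    }
}
```

Set up with Add("uppercase"), Add("lowercase") and Add("remove", "punctuation", "whitespace"), this accepts "remove punctuation uppercase" and rejects "uppercase remove", because remove is missing its qualifier. Splitting on whitespace works here only because a multi-word argument is modelled as a verb plus a qualifier.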

Possible, but not ideal. I'd want to see more examples and/or a railroad diagram.

Here's what I just experimented with: (?in)((?'cmd'process)|(?'case'((upper)|(lower))case)|(?'rem'remove\s+\S+)|(?'unknown'\S+)){1}
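
For anyone who wants to try that pattern, here is one way it might be driven from code. The pattern is the one above, unchanged; the class and method names are invented for this sketch, and treating "no token landed in the 'unknown' group" as the validity rule is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the experimental pattern.
public static class CommandClassifier
{
    // (?in) switches on IgnoreCase and ExplicitCapture, so only the
    // named groups capture anything.
    private static readonly Regex Token = new Regex(
        @"(?in)((?'cmd'process)|(?'case'((upper)|(lower))case)|(?'rem'remove\s+\S+)|(?'unknown'\S+)){1}");

    private static readonly string[] GroupNames = { "cmd", "case", "rem", "unknown" };

    // Returns (group name, matched text) pairs in the order they appear.
    public static List<KeyValuePair<string, string>> Classify(string command)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (Match m in Token.Matches(command))
        {
            // Exactly one of the named alternatives succeeds per match.
            foreach (string name in GroupNames)
            {
                if (m.Groups[name].Success)
                {
                    result.Add(new KeyValuePair<string, string>(name, m.Value));
                    break;
                }
            }
        }
        return result;
    }

    // Assumed rule: the command is valid if nothing fell through to 'unknown'.
    public static bool IsValid(string command)
    {
        return Classify(command).TrueForAll(p => p.Key != "unknown");
    }
}
```

Classifying "process uppercase remove punctuation" yields cmd, case and rem in that order, so the original order is preserved. Note that the 'rem' alternative accepts any single word after remove, and nothing here stops an argument from appearing twice; both would need extra checks.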


My favorite RegEx page: https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx

I hope you've tried Expresso: http://www.ultrapico.com/Expresso.htm

And here's a link to my RegexTester: RegexTester

And I'll throw in a link to my command line parser while I'm at it: ParsedCommandLine
 
 
I would strongly advise against using Regex for parsing or evaluating command line arguments. It is not a suitable task for Regex: the solution will be overly complicated and almost unusable (you validated it, now what?) and, importantly, neither maintainable nor flexible. Alternatively, I can suggest my utility, or another one I referenced in my article: Enumeration-based Command Line Utility.

—SA
 
 
Comments
Midi_Mick 4-Dec-15 21:01pm    
That would be cool if I wrote the command processor - which I didn't. I am just calling it, and I need to make sure that arguments entered are valid before I do.
Sergey Alexandrovich Kryukov 4-Dec-15 21:11pm    
What's the problem? Either write your own or use mine or some other one. "Valid before" is not the best approach. It's much better to have all values after validation. Look at the picture and see how the diagnostics of validity is generated in my library.

Anyway, my article explains it all, so are you going to accept my answer formally?

—SA
Dave Kreskowiak 4-Dec-15 23:48pm    
A proper command line processor will do the validation for you and report any errors back.
PIEBALDconsult 5-Dec-15 0:17am    
I disagree; that's a task for the application.
Sergey Alexandrovich Kryukov 5-Dec-15 0:53am    
You are both right. 1) It's a task for the application; 2) the application can embed a more specific processor.
I hope I provided a reasonable balance in my article: there is a universal, semantic-agnostic layer, which still adds a certain formatting style on top of the command line parameters pre-parsed into an array by the CLR, and a semantic layer, which defines metadata for the particular (semantic) command line; on top of that, the user defines data that conforms to the metadata.

Now, the universal layer validates the conformity of data to metadata. And the application layer decides what to do with parameters which don't fit the metadata...

—SA
I've come up with what seems to be a simple and extensible solution. I just create a list of much simpler regular expressions, one for each of the possible arguments. I then iterate through that list, and if I get a match for a particular argument, I remove the match from the target string. This ensures that each argument appears in the command string only once, and it doesn't matter where in the string it is found (so "any order" is sorted). At the end, if there is anything left in the command string, then it had an invalid argument.

So, for the simple case mentioned in the comments above, I would create a list with:
C#
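List<string> list = new List<string>() {
    // NOTE: '$' anchors at the end of the input, so this pattern cannot match as written;
    // '^' (start of input) is presumably what was intended.
    @"$\s*process",
    @"\s+(upper|lower)case",
    @"\s+remove\s+(punctuation|whitespace|numbers)"
};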

And the code to validate is
C#
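// Work on a mutable copy so that each matched argument can be deleted.
StringBuilder sb = new StringBuilder(command);
foreach (string regexp in list) {
    // Only the first match of each pattern is removed, so a repeated argument is left behind.
    Match m = Regex.Match(sb.ToString(), regexp,
        RegexOptions.IgnoreCase|RegexOptions.ExplicitCapture);
    if (m.Success)
        sb.Remove(m.Index, m.Length);
}
// Anything other than whitespace still in the buffer is an invalid or duplicate argument.
return string.IsNullOrWhiteSpace(sb.ToString());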
 
Comments
PIEBALDconsult 5-Dec-15 9:41am    
Yes, and that's pretty much what mine does, and mine doesn't lose track of which order they're in.
I suggest removing the \s in your expressions though. And the $ probably shouldn't be at the start.
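
One possible reading of that suggestion, as a self-contained sketch: the command name is anchored with ^ rather than $, and the leading \s+ is replaced by \b word boundaries. The class and method names are made up for illustration, and the \b choice is an assumption about what "removing the \s" should look like, not something stated in the thread.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the "match, then remove" approach.
public static class CommandValidator
{
    // One pattern per argument. ^ pins the command name to the start;
    // \b keeps a pattern from matching inside a longer word.
    private static readonly string[] Patterns =
    {
        @"^process\b",
        @"\b(upper|lower)case\b",
        @"\bremove\s+(punctuation|whitespace|numbers)\b"
    };

    public static bool IsValid(string command)
    {
        // Trim first so that ^ lines up with the command name.
        var sb = new StringBuilder(command.Trim());
        foreach (string pattern in Patterns)
        {
            Match m = Regex.Match(sb.ToString(), pattern,
                RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);
            if (m.Success)
                sb.Remove(m.Index, m.Length); // first match only
        }
        // Leftover non-whitespace means an unknown or repeated argument.
        return string.IsNullOrWhiteSpace(sb.ToString());
    }
}
```

Because only the first match of each pattern is removed, "process uppercase lowercase" is rejected as well: the case pattern consumes uppercase and lowercase is left over. As in the original, nothing here requires the command name to be present.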

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


