I want to use Regex to parse an input string like this:

string input = "FirstName=Mary Jane LastName = ' Smith ' MiddleName='No middle name' Sid=\"123456789\"";


Note: there is one space between LastName and its =.
I am using an expression like this to parse the input string:
string matchString = @"(\s*FirstName\s*=(?<firstname>\s*([\w\'\"" ]){1,})\s)";
Regex regexMatch = new Regex(matchString, RegexOptions.IgnoreCase);
if (regexMatch.IsMatch(input))
{}

Each time, I just want to get the value for one parameter. This works if there is no space between the parameter name and the =.
Right now, for the first name I get "Mary Jane LastName".

Any help will be highly appreciated.
Updated 12-Feb-11 6:15am

Why always Regex? Maybe a simpler solution is:

C#
string[] names = fullName.Split(
    new char[] { '.', ' ' },
    System.StringSplitOptions.RemoveEmptyEntries);


After that, provide processing depending on names.Length.
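
A minimal sketch of what "processing depending on names.Length" could look like. The class name NameSplitter, the method Classify and the way each length is interpreted are assumptions made for illustration only; they are not part of the original answer, and the interpretation is exactly as fragile as the split itself.

```csharp
using System;

public static class NameSplitter
{
    // Hypothetical helper: splits a full name and labels the parts by count.
    public static string Classify(string fullName)
    {
        string[] names = fullName.Split(
            new char[] { '.', ' ' },
            StringSplitOptions.RemoveEmptyEntries);

        switch (names.Length)
        {
            case 0:
                return "no name given";
            case 1:
                return "single name: " + names[0];
            case 2:
                return "first: " + names[0] + ", last: " + names[1];
            case 3:
                return "first: " + names[0] + ", middle: " + names[1] + ", last: " + names[2];
            default:
                // More than three parts: do not guess, keep the name whole.
                return "unclassified: " + string.Join(" ", names);
        }
    }
}
```

Note that a two-word given name such as "Mary Jane Smith" is reported with "Jane" as a middle name, which shows how little the part count really says.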

[EDIT]
I agree with the criticism I got from Manfred. My idea was just another, simpler "solution" for a case where the bigger problem is so ill-posed.

Here is how it looks. You receive one single string from the user; I called it fullName. You split it, but what can you do with the result? Would you classify the results into cases where there is a middle name, only a middle initial, no middle name, etc.? This is all invalid in an even slightly more general case anyway.

I can tell you that the Western (mostly English-language-based) schema "First Name, Middle Name, Last Name" is only applicable within (maybe some, not even all) English-speaking countries, and is not very robust even in those countries.

It is also applied at the international level, but this is purely idiotic!

Did you know that for the absolute majority of the Earth's population, what is called the "first name" means the "family name", not the "given name"?! In Russian, the "middle name" is always a patronymic, and the forms of all three parts of the name are gender-modified (and also depend on plurality and grammatical case). In many cultures more than three names are used to convey the identity of a person; in other cultures there is no family name at all. The English naming schema does not work well even for Europe.

The whole idea of extrapolating this English naming schema to the whole world is often applied and is purely idiotic.

Regex will not help here either (that's why I came up with the simpler solution: "it is all so bad anyway…").
This issue cannot have a satisfactory resolution by any means other than a different use-case design. The whole idea should be given up. (And don't tell me about "requirements"!)
The name provided by the user should either be used as an unbreakable entity, or the classification of the components of such names should be provided by the users: a different field for each name component, with the selection of a "classifier".

By the way, think about what this idea plays with. There is no better way to offend a person than messing up one's name. This is a matter of simple politeness.
[END EDIT]

—SA
Comments
Abhinav S 13-Feb-11 0:22am    
5 for the simpler solution.
However, the user may still be looking for a regex solution (for learning purposes).
Sergey Alexandrovich Kryukov 13-Feb-11 0:24am    
Thank you.
This is the OP's problem; my mission was just to give advice.
--SA
Manfred Rudolf Bihy 13-Feb-11 13:09pm    
You get only 3+.

I think you're trying to weasel out of this one. I agree with Abhinav on this. There's too much variability in the string to plainly use String.Split. Besides that, if you split on ' ', parts that belong together will be separated. Sorry SA!

Your answer needs some fixing. What do you think?
Sergey Alexandrovich Kryukov 13-Feb-11 13:48pm    
I would agree with your "3", but not in favor of Regex.
For the record, I did not even mean to provide a better alternative; I meant "if it is dirty anyway, why not make it simpler?".
You know what? I'll update the answer a bit.
Thanks for your note.
--SA
Manfred Rudolf Bihy 13-Feb-11 13:58pm    
No problem SA. Still can't see any changes though!
You want to look into non-greedy matching using three to four groups (four if you also want to match the Sid part). This page has a comprehensive overview: http://www.mikesdotnetting.com/Article/46/CSharp-Regular-Expressions-Cheat-Sheet[^]
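
To make the non-greedy idea concrete, here is a hedged sketch that reads one parameter at a time, as the question asks. It combines a lazy group with a lookahead that stops just before the next "Name =" or at the end of the input. The class name ParameterReader and the method GetValue are hypothetical, and the pattern is one possible approach rather than the only one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParameterReader
{
    // Hypothetical helper: returns the raw value of one named parameter,
    // or null if the parameter is not present in the input.
    public static string GetValue(string input, string name)
    {
        // \s*=\s*              tolerates spaces on either side of '='
        // (?<value>.*?)        lazy: takes as little text as possible
        // (?=\s+\w+\s*=|\s*$)  lookahead: stop before the next "Name =" or at the end
        string pattern = @"\b" + Regex.Escape(name)
            + @"\s*=\s*(?<value>.*?)(?=\s+\w+\s*=|\s*$)";

        Match match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups["value"].Value : null;
    }
}
```

With the input string from the question, GetValue(input, "FirstName") yields Mary Jane and GetValue(input, "LastName") yields ' Smith ' with the quotes still attached. One limitation of this sketch: a quoted value that itself contains something like word= would be cut short, because quotes are not treated specially here.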

Best Regards,
Manfred
Comments
gmf 12-Feb-11 12:54pm    
Hi Manfred,

Thanks for the link. I read it before, but I still cannot figure it out.

I want to get all the information from the input string, including Sid, but I can only do it one parameter at a time.

If I want to get all the parameters at one time, it works. The space between the parameter name and the = causes the problem. My pattern cannot handle it.
Sergey Alexandrovich Kryukov 12-Feb-11 23:38pm    
Manfred, you might have mixed up your reference with something else: it's a cheat sheet, and it does not answer the somewhat deeper question of the OP. Even your own short sentence goes deeper (but the OP did not get it yet). It needs a better document, like a manual.
--SA
Ali Al Omairi(Abu AlHassan) 13-Feb-11 17:09pm    
I am still on your side, Manfred. 5+
Manfred Rudolf Bihy 13-Feb-11 17:39pm    
Thanks for your support!
Ali Al Omairi(Abu AlHassan) 13-Feb-11 17:53pm    
Sir, I think SA's solution is good. I used something like that in the past when I was developing a similar program (to gmf's). But I think using string.Split alone, with no "magic", is a stupid thing. So is there any way to exclude a complete word from a regular expression?
How about:

C#
string str1 = "Mary Jane LastName";
// Cut at the last space, dropping the trailing word "LastName".
string str2 = str1.Substring(0, str1.LastIndexOf(' '));

I believe this would result in "Mary Jane".

100 :rose: ;)
Comments
gmf 13-Feb-11 11:35am    
Thanks for everyone's input on my question.

I am developing a tool that allows users to do a few things, and I used the command design pattern.
Users can use my tool to add a user, add a product, add a file, etc. to the database. The user runs the tool from the command line like this:

>AddUser FirstName="" LastName="" MiddleName=""
>AddFile FileLocation=""
>AddProduct Name="" Price=""
...

So I have commands such as AddUser, AddFile and AddProduct. I am looking for a very generic way to parse the inputs for all the commands, and at the same time also handle spaces, ' and "".

So I cannot use any string functions. This is what I am doing:

Each command has properties, and those properties will hold the user's input. For example,
the AddUser command has properties such as FirstName, MiddleName, and LastName.

When I get the user's input, I first need to figure out which command it is. Then I find all the properties of that command and use the name of each property to get its value from the input string. So I really did not hard-code anything.
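
The generic approach described above could be sketched roughly as follows: take the first token as the command name, then collect every Name=value pair into a dictionary, so that each command can look up its own property names. This is only an illustration under assumptions. The names CommandLineParser and Parse are hypothetical, and the rule that quoted values keep their inner spaces while bare values run up to the next "Name =" is a design choice of this sketch, not something stated in the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CommandLineParser
{
    // name = "double quoted" | 'single quoted' | bare text up to the next "name =" or the end.
    // .NET allows the group name "value" to be reused across the alternatives.
    private static readonly Regex Pair = new Regex(
        @"(?<name>\w+)\s*=\s*(?:""(?<value>[^""]*)""|'(?<value>[^']*)'|(?<value>.*?)(?=\s+\w+\s*=|\s*$))");

    // Hypothetical helper: returns the parameters and outputs the command name.
    public static Dictionary<string, string> Parse(string line, out string command)
    {
        line = line.Trim();

        // The command is the first whitespace-delimited token, e.g. "AddUser".
        int split = line.IndexOfAny(new[] { ' ', '\t' });
        command = split < 0 ? line : line.Substring(0, split);
        string rest = split < 0 ? string.Empty : line.Substring(split);

        // Case-insensitive keys, so "firstname" and "FirstName" are the same parameter.
        var values = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match m in Pair.Matches(rest))
        {
            values[m.Groups["name"].Value] = m.Groups["value"].Value;
        }
        return values;
    }
}
```

A command could then loop over its own property names and call values.TryGetValue(propertyName, out value) for each one, which keeps the parser free of any hard-coded parameter list.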

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


