|
He tried to redefine your problem so that it translates easily into a regular expression:
Quote: [beginning of string][whitespace] OR [whitespace](two or more) OR [whitespace][end of string]
He wanted you to write it out yourself. However, I will guide you to the solution. Using Peter's suggestion, you get:
^\s OR \s{2,} OR \s$
In Perl, this may look like:
if ($str =~ /^\s/ || $str =~ /\s{2,}/ || $str =~ /\s$/) {
print "String '$str' contains invalid white space sequence(s).";
}
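For readers working in C#, like most of this thread, the three alternatives can also be folded into a single pattern. This is only a sketch; the class and method names are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class WhitespaceCheck
{
    // Leading whitespace, a run of two or more whitespace characters,
    // or trailing whitespace.
    private static readonly Regex Invalid = new Regex(@"^\s|\s{2,}|\s$");

    // Returns true when the string contains one of the invalid sequences.
    public static bool HasInvalidWhitespace(string str)
    {
        return Invalid.IsMatch(str);
    }
}
```

Calling WhitespaceCheck.HasInvalidWhitespace("a  b") reports a problem, while "a b" passes.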
|
|
|
|
|
I have a path and I want to get the part of the path before my match:
string s1 = @"c:\temp\lev1\lev2\lev3\MYWORD\lev5\lev6";
string mtch = "MYWORD";
Knowing s1 and mtch, I want to get the part of the path before mtch, i.e. @"c:\temp\lev1\lev2\lev3\".
I tried with Regex, but I don't know how to do it elegantly. I'm only able to find the position of MYWORD and then use Substring.
Is there a smarter solution?
|
|
|
|
|
Regex seems overkill for this, but you can do it with a zero-width positive lookahead assertion:
string s1 = @"c:\temp\lev1\lev2\lev3\MYWORD\lev5\lev6";
string mtch = "MYWORD";
string pattern = "^.*(?=" + Regex.Escape(mtch) + ")";
Console.WriteLine(Regex.Match(s1, pattern).Value);
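For comparison, the non-regex route the question already hints at can be sketched roughly like this. The class and method names are hypothetical, and returning an empty string when there is no match is just one possible choice.

```csharp
using System;

public static class PathPrefix
{
    // Returns everything before the first occurrence of match,
    // or an empty string when match does not occur in source.
    public static string Before(string source, string match)
    {
        int index = source.IndexOf(match, StringComparison.Ordinal);
        return index < 0 ? string.Empty : source.Substring(0, index);
    }
}
```

One difference worth knowing: IndexOf stops at the first occurrence, whereas the greedy .* in the lookahead pattern runs up to the last occurrence of the word. Use .*? in the pattern if the first one is wanted.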
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
|
Also look at the various overloads of Regex.Split(), especially if the initial "match" is more than just a simple string.
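A minimal sketch of the Split idea, using the instance overload that takes a maximum number of pieces. The helper name is invented for illustration; if the separator can be a real pattern rather than a literal, drop the Regex.Escape call.

```csharp
using System.Text.RegularExpressions;

public static class SplitPrefix
{
    // Splits on the first occurrence only (at most 2 pieces) and returns
    // the left piece. When match is not found, the whole source comes back.
    public static string Before(string source, string match)
    {
        var separator = new Regex(Regex.Escape(match));
        return separator.Split(source, 2)[0];
    }
}
```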
|
|
|
|
|
I have a string like:
string s1 = "dog().Cat(\"Happy\")";
I want to:
1) count the number of opening round brackets '('; I expect 2 as the result
2) get the text inside; I expect "\"Happy\"", which appears in console output as "Happy"
I have no idea for 1. For 2 I tried the following:
string k = Regex.Match(s1,@"\((\w+)\)").Groups[1].Value;
But it fails to handle the \" character.
Any ideas?
|
|
|
|
|
You're nearly there:
string k = Regex.Match(s1, @"\(""(\w+)""\)").Groups[1].Value;
For #1, unless you need to ensure that the brackets are balanced, Regex is overkill. Add using System.Linq; to your file, and try:
int count = s1.Count(c => c == '(');
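Putting both parts together, a small sketch might look as follows. Note that the pattern above captures Happy without the quotes; if the quotes should be part of the result, as the question suggests, one option is to capture everything between the brackets instead. The names below are made up for illustration.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class BracketDemo
{
    // Counts the opening round brackets in s.
    public static int CountOpenBrackets(string s)
    {
        return s.Count(c => c == '(');
    }

    // Returns the first non-empty text found between '(' and ')',
    // quotes included, or an empty string when there is none.
    public static string FirstArgument(string s)
    {
        Match m = Regex.Match(s, @"\(([^()]+)\)");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

The empty pair in dog() is skipped because [^()]+ needs at least one character between the brackets.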
|
|
|
|
|
Thanks, you got the point.
|
|
|
|
|
Hi,
I'm going nuts. I've got heaps of HTML pages that have to be adjusted, because there are "a href" tags inside "pre" ... "/pre" blocks that have to be removed. The code looks something like this:
<pre class="brush:xml;toolbar:false;gutter:true;"><constructiongroups xmlns:xsd="http://www.w3.org/2001/XMLSchema-instance"<br>
xsd:noNamespaceSchemaLocation="svgdescription.xsd"><br>
<constructiongroup><br>
<fza>5</fza><br>
<<a title="HST" href="1802.htm#o2441">hst</a>>10</hst><br>
<<a title="HT" href="1802.htm#o2442">ht</a>>30</ht><br>
In the meantime I found Expresso to work with, but it doesn't satisfy my needs. I came up with the following regex:
(?:<pre .*\b>)*(?:<a[^>].*href="[0-9]+\.htm\#o[0-9]+"[^>]*>)(?:</a>)(?:</pre)
But that is not the whole solution yet, and I don't know how to complete it. At the moment the regex works partially, but not as a whole.
Can someone help me, PLEASE?
Cheerio,
Heike
modified 2-Oct-12 6:52am.
|
|
|
|
|
Could you please close the <a>? It's messed up your post.
|
|
|
|
|
Done.
|
|
|
|
|
You don't seem to be allowing any content between the <a> and </a>. I think you need (.*) in there (it should be a capturing group, because it's what you want to keep), and the matching should be done non-greedily.
Also, you don't allow for multiple <a> inside the same <pre>.
I don't really think regex is the right tool for this job, as there's a lot of context and state change in what you're trying to do. An XML-based solution seems better to me. But something like
<pre[^>]*>(.*)(?:<a[^>]*>(.*)</a>(.*))*</pre>
... in non-greedy mode, writing all the captured groups to the output, might work.
|
|
|
|
|
Hi,
thanks for your answer. The content between "a" and "/a" has to stay, so my variant might be OK on this issue. But I never get the complete content of the "pre" tag. This is quite a problem.
What do you mean by "XML-based solution"? How might XML help me with changing heaps of HTML pages?
|
|
|
|
|
Sometimes it makes sense to split the problem up. Rather than attempting to grab the pre tags that contain a tags, why not grab the pre tags first, then iterate over them and run a second regex for the a tags? That way, you have the full context without worrying about backtracking.
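In C# the two-step idea could be sketched as below. This assumes the pre blocks are not nested and that every a tag inside them is properly closed; the class and method names are invented, and the patterns are illustrative rather than a complete HTML parser.

```csharp
using System.Text.RegularExpressions;

public static class PreAnchorStripper
{
    // Step 1: each <pre ...>...</pre> block. Singleline lets '.' cross line breaks.
    private static readonly Regex PreBlock = new Regex(
        @"<pre\b[^>]*>.*?</pre>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Step 2: an <a ...>text</a> pair. Group 1 is the link text to keep.
    private static readonly Regex Anchor = new Regex(
        @"<a\b[^>]*>(.*?)</a>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Removes the a tags (but keeps their content) inside pre blocks only.
    public static string StripAnchorsInsidePre(string html)
    {
        return PreBlock.Replace(html, pre => Anchor.Replace(pre.Value, "$1"));
    }
}
```

Links outside of a pre block are left untouched, because the inner replacement only ever sees the text of one pre block at a time.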
|
|
|
|
|
My friend is building a Windows Forms application and needs to validate a username, which must be a minimum of 4 and a maximum of 15 characters long. Hyphens, underscores and dots are allowed in the middle, but not at the start or at the end of the username. There may be no more than one hyphen, one underscore and one dot in a row.
Examples of disallowed usernames:
-aquib
_aquib
.aquibxyz
aquib.
aquibxyz--qureshi
aquib__xyzqureshi
aquibqureshi-
aquib_qureshi-qureshi
aquib..qureshi
1236584 // a username must not consist only of digits
aquib_ // no symbols are allowed at the end
The username should not be digits only; it should be either a mix of digits and letters, or letters only.
I hope this is understandable.
I have got this regex:
^([a-zA-Z0-9](?(?!__|--)[a-zA-Z0-9_\-]){0,4}[a-zA-Z0-9])$
which is not good enough.
|
|
|
|
|
|
Yup, thanks. But can you simply provide me with a regex that accepts only letters and digits (no special characters: dots, hyphens, underscores and all other symbols are not allowed)?
The length must be a minimum of 3 and a maximum of 15 characters,
and it cannot consist only of digits; it should consist of letters and digits, or of letters only
(a username should not be only numbers; it should be alphanumeric).
I have downloaded Expresso, but it's too difficult to understand. I am a beginner in C#.
Sorry, I must have frustrated many coders
: (
|
|
|
|
|
If you take a look at The 30 Minute Regex Tutorial here on CodeProject, you can work it out fairly easily to be something like:
\b\w{3,15}\b
You may need to modify the above for your specific requirements but it should get you started.
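Keep in mind that \w also matches the underscore and would accept an all-digit name. A tighter variant for the requirements stated above (letters and digits only, 3 to 15 characters, not digits only) might look like this sketch; the class and method names are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class SimpleUsername
{
    // (?![0-9]+$)       : reject names that consist of digits only
    // [a-zA-Z0-9]{3,15} : 3 to 15 ASCII letters or digits, nothing else
    // Note: in .NET, $ also matches just before a final newline; \z is stricter.
    private static readonly Regex Valid =
        new Regex(@"^(?![0-9]+$)[a-zA-Z0-9]{3,15}$");

    public static bool IsValid(string username)
    {
        return Valid.IsMatch(username);
    }
}
```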
One of these days I'm going to think of a really clever signature.
|
|
|
|
|
OK, I've now worked out how to do it almost perfectly. For explanation purposes I've split the expression into several lines further down, but if you use it, it has to be written on one line like below:
(?!^[a-zA-Z0-9\.\-_]{1,2}$)(?!^[a-zA-Z0-9\.\-_]{16,}$)(?!^[0-9]{3,15}$)^(?:[a-zA-Z0-9](?:\.|_|-)?)+[a-zA-Z0-9]$
Explanation:
There is a grouping construct for regular expressions that is called zero-width negative lookahead: (?!subexpression). What does that mean? Let's dissect this monster description:
- Zero-width lookahead
- This essentially means that this expression does not consume any characters. So after this expression has matched, any following non-zero-width expression starts at the same place in the string as the zero-width expression did.
- Negative
- This signifies that the following expressions can only match if this expression does not match.
(?!^[a-zA-Z0-9\.\-_]{1,2}$) // Asserts that a username made of the characters a-z A-Z 0-9 . _ and - is not shorter than three characters
(?!^[a-zA-Z0-9\.\-_]{16,}$) // Asserts that a username made of the characters a-z A-Z 0-9 . _ and - is no longer than 15 characters
(?!^[0-9]{3,15}$) // Asserts that a username is not made up of only digits [0-9]
^(?:[a-zA-Z0-9](?:\.|_|-)?)+[a-zA-Z0-9]$
Given the three assertions at the beginning of the regular expression we can now be sure that there are at least 3 characters and no more than 15 and that a username of only digits is also disallowed. The ^ and $ in the assertions are important here.
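Now let's have a look at the final part, which is also broken into several lines to better annotate it:
^ // start of the string
(?:[a-zA-Z0-9] // a group starting with one letter or digit ...
(?:\.|_|-)? // ... optionally followed by a single '.', '_' or '-'
)+ // the group repeats one or more times
[a-zA-Z0-9] // the last character must be a letter or digit
$ // end of the string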
I do hope you could follow my explanations and find this solution helpful. The only drop of bitterness for me is that we now have an additional constraint:
While it is true that each of the characters '.', '-' and '_' may appear no more than once in a row (a valid constraint from the OP), now no two characters from the class [\.\-_] can appear one after another (a new constraint introduced by this solution).
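To try the expression from C#, a small wrapper along these lines should do. The class and method names are hypothetical; the pattern is the one given above, only split across two string literals for readability.

```csharp
using System.Text.RegularExpressions;

public static class UsernameRegexDemo
{
    // Three negative lookaheads (length and digits-only checks),
    // followed by the actual character pattern.
    private const string Pattern =
        @"(?!^[a-zA-Z0-9\.\-_]{1,2}$)(?!^[a-zA-Z0-9\.\-_]{16,}$)(?!^[0-9]{3,15}$)" +
        @"^(?:[a-zA-Z0-9](?:\.|_|-)?)+[a-zA-Z0-9]$";

    public static bool IsValid(string username)
    {
        return Regex.IsMatch(username, Pattern);
    }
}
```

With this wrapper, aquib and aquib_qureshi are accepted, while -aquib, aquib., aquib..qureshi and 1236584 are rejected.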
Regards,
— Manfred
"I had the right to remain silent, but I didn't have the ability!"
Ron White, Comedian
|
|
|
|
|
It does accept
shariq_k.-kha; how can I avoid that? I mean allowing only one dot, dash or underscore in a row.
|
|
|
|
|
You must be doing something wrong. It works exactly like you want it.
|
|
|
|
|
But bro, I was talking about not more than one symbol: not more than one hyphen, not more than one underscore and not more than one dot, and a combination should not occur more than once either, e.g.
shariq_khan.k
Did you check this, or did you misunderstand?
|
|
|
|
|
Above you said: "I mean allowing only one dot dash or underscore in a row?".
The meaning of "in a row" is that the characters follow one another consecutively, without other characters in between.
If you meant that there should be at most one "-", at most one "." and at most one "_", you should have said so.
I don't think this is easy to achieve with a single regular expression. You'd have to iterate over the characters and increment a counter for each of the three characters "-", "." and "_". As soon as one of the counters is greater than one, the check has already failed and you don't need to run the regular expression at all.
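The counting check described above could be sketched like this. The names are made up, and for brevity it counts each symbol with LINQ instead of keeping three running counters with an early exit.

```csharp
using System.Linq;

public static class SymbolLimit
{
    // True when '-', '.' and '_' each occur at most once in the username.
    public static bool EachSymbolAtMostOnce(string username)
    {
        foreach (char symbol in "-._")
        {
            if (username.Count(c => c == symbol) > 1)
            {
                return false;
            }
        }
        return true;
    }
}
```

Run this check first and only apply the regular expression when it returns true.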
Regards,
Manfred
|
|
|
|
|
OK, bro, thanks.
And how can I contact you here on CodeProject?
Is there any message system here?
|
|
|
|
|
You can leave me a message on the message board on my profile page.
Cheers!
|
|
|
|