|
I'll check out capture groups - thank you.
And thanks for the thought on Expresso but I'm on a Mac.
michael
|
|
|
|
|
Hello,
Using Visual Studio 2015, the regular expression on the first line below finds the Dim statement on the second line:
Dim L_V_Scalar_(Integer)_Item As \1
Dim L_V_Scalar_Integer_Item As Integer
Is there a way to combine a back reference with a negation (something like the ^ "not" operator) to find violations of my naming convention such as the following?
Dim L_V_Scalar_Integer_Item As String
I tried the following regular expressions, but they do not work:
Dim L_V_Scalar_(Integer)_Item As ^\1
Dim L_V_Scalar_(Integer)_Item As [^\1]
Many thanks.
Keith
|
|
|
|
|
There is most likely a more elegant way to do this, but this expression should work.
[EDIT] It is the second group that should be back-referenced, hence \2
Dim\s+(?<variable_name>\w_\w_\w+_(?<expected_type>\w+)_\w+)\s+As\s+(?:(?<matched_type>\2)|(?<mismatched_type>\w+))
The line
Dim L_V_Scalar_Integer_Item As Integer
will give the result
expected_type matched_type mismatched_type
Integer Integer
The line
Dim L_V_Scalar_Integer_Item As String
will give the result
expected_type matched_type mismatched_type
Integer String
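Example in C#:
using System;
using System.Text.RegularExpressions;

// The pattern must be a verbatim string (@"..."); otherwise \s and \w are invalid C# escape sequences.
// There are no unnamed capture groups, so \2 refers to the second named group, expected_type.
Match m = Regex.Match("Dim L_V_Scalar_Integer_Item As String",
    @"Dim\s+(?<variable_name>\w_\w_\w+_(?<expected_type>\w+)_\w+)\s+As\s+(?:(?<matched_type>\2)|(?<mismatched_type>\w+))");
if (m.Success)
{
    // matched_type is empty when the declared type differs from the type in the name.
    if (String.IsNullOrEmpty(m.Groups["matched_type"].Value))
    {
        string errorText = String.Format(@"The type '{0}' is not allowed for variable name '{1}'.
The correct type is '{2}'",
            m.Groups["mismatched_type"].Value,
            m.Groups["variable_name"].Value,
            m.Groups["expected_type"].Value);
    }
}
else
{
    // The line is not a Dim statement that follows the naming convention.
}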
modified 18-Apr-16 1:30am.
|
|
|
|
|
Hello George,
That's very helpful.
Is there a way to get a result running this interactively in Visual Studio rather than via code? I am basically trying to find non-conformances and fix them manually, one by one.
Many thanks.
Keith
|
|
|
|
|
I suppose you could use the Code Analysis part of Visual Studio and add your own rule set.
However, whether you can edit the rules depends on the version of VS you have.
Quote: In Visual Studio Ultimate, Visual Studio Premium, and Visual Studio Professional, you can create and modify a custom rule set to meet specific project needs associated with code analysis
See Using Rule Sets to Group Code Analysis Rules[^]
Another way could be to create your own Custom Tool that runs through your code and shows you the violations in a dialog box or something.
This involves quite a bit of work and some understanding of COM programming.
See Implementing Single-File Generators[^] for more information.
This Codeproject article might also be helpful: Writing a Single-File Generator[^]
|
|
|
|
|
Thanks for the ideas George. I have managed to achieve the desired result in the Visual Studio Find And Replace dialog, with the following regular expression:
_Scalar_+(Boolean|Byte|Char|Date|Decimal|Double|Integer|Long|Object|SByte|Short|Single|String|UInteger|ULong|User\-Defined|UShort)_[A-Za-z0-9]+ As (?!\1)\w+
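For anyone who wants to try the negative-lookahead idea outside the Find And Replace dialog, here is a minimal sketch in Python. It assumes Python's `re` module, whose syntax is close to but not identical to the .NET flavor Visual Studio uses, a shortened type list, and a hypothetical helper named `find_violations`.

```python
import re

# Shortened, illustrative type list; the full list from the post can be substituted.
TYPES = "Boolean|Byte|Char|Date|Decimal|Double|Integer|Long|Object|Short|Single|String"

# (?!\1) is a negative lookahead: the text after "As " must NOT start with
# whatever group 1 captured from the variable name.
VIOLATION = re.compile(r"_Scalar_+(" + TYPES + r")_[A-Za-z0-9]+ As (?!\1)\w+")


def find_violations(lines):
    """Return the lines whose declared type differs from the type in the name."""
    return [line for line in lines if VIOLATION.search(line)]


sample = [
    "Dim L_V_Scalar_Integer_Item As Integer",
    "Dim L_V_Scalar_Integer_Item As String",
]
print(find_violations(sample))
```

Note that the lookahead only checks a prefix, so a declared type that merely starts with the expected name (say `IntegerList`) would not be flagged. Writing `(?!\1\b)` instead should tighten that, if it matters for your code base.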
However, I am still trying to fix the problem that I describe in the following link:
http://forums.devshed.com/regex-programming-147/nested-pattern-matching-visual-studio-2015-a-973751.html[^]
KR,
Keith
|
|
|
|
|
I think you got the answer already at that site.
The expression, as given in the answer,
Dim L_V_Scalar_([^_\s]+)_([^\s]+) As (?>(\w+\.)*)(?!\1)\w+
should work for you.
|
|
|
|
|
Thank you George. Yes it worked.
|
|
|
|
|
I thought it would be simpler, but I wasn't able to come up with a regular expression for the following.
I want to extract the words that are not in a particular pattern.
Say, in the sentence below:
clicking the (Run Match) button (or F5) to see what (happens).
I want to extract all the words that are not inside brackets (). So the output will be
clicking
the
button
to
see
what
Below is the expression I defined. It is not working. Can anyone point out the mistake in the expression?
(?!\(\w+\))
modified 1-Mar-15 0:33am.
|
|
|
|
|
You haven't mentioned which language you're using. Different languages have different implementations of Regular Expressions, which support different features.
The following should work in .NET:
(?<!\()\b\w+\b(?!\))
This uses zero-width negative assertions[^] to ensure that the word doesn't start with an opening parenthesis, or end with a closing parenthesis.
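A quick way to check this against the sample sentence, sketched in Python rather than .NET (Python's `re` module accepts the same lookbehind and lookahead syntax for this pattern). The function names are made up for illustration. The second function is an alternative that strips the parenthesized groups first, which matters when a group holds three or more words, because the lookarounds only inspect the character directly next to each word.

```python
import re

SENTENCE = "clicking the (Run Match) button (or F5) to see what (happens)."

# Lookaround version: reject a word directly after "(" or directly before ")".
LOOKAROUND = re.compile(r"(?<!\()\b\w+\b(?!\))")


def words_by_lookaround(text):
    """Collect words that neither follow "(" nor precede ")"."""
    return LOOKAROUND.findall(text)


def words_by_stripping(text):
    """Drop every (...) group first, then collect the remaining words."""
    return re.findall(r"\w+", re.sub(r"\([^)]*\)", " ", text))


print(words_by_lookaround(SENTENCE))
print(words_by_stripping(SENTENCE))
# A middle word inside the brackets slips through the lookaround version.
print(words_by_lookaround("a (b c d) e"))
print(words_by_stripping("a (b c d) e"))
```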
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
How about if we just drop all the words in (parens) and display what's left?

#!/usr/bin/perl
use strict;
use warnings;

my(@a,@b,$i,$j,$k,$s,$t);
my(@out,$ins,$outs);

$ins="clicking the (Run Match) button (or F5) to see what (happens). ";
print "\n";
$outs=$ins;
$outs=~s/\(.+?\)//g; # Drop every (...) group, shortest match first.
$outs=undupespace($outs);
print "$outs\n";

exit; # Exit main pgm.
###################################################################
sub undupespace
# Remove dupe spaces. Max 1 consecutive space.
{my($l)=@_;

$l=~s/ {2,}/ /g;
return $l; # undupespace

}

Output:
clicking the button to see what .
|
|
|
|
|
Hi,
Now I need a pattern to detect last name possibilities. I think this pattern will be slightly more complicated. Names that I see in the database are like:
Jones
Jones-Smith
Jones Smith (no hyphen)
O'Leary
Van Allen (no hyphen)
Vander Ark (no hyphen)
I think that this pattern will work but I would like public opinion to make sure I am getting this right:
^[a-zA-Z\-\s']+$
Can you think of any last names where this will not work? In testing it seems to work out alright.
Thanks,
Rob
|
|
|
|
|
Don't forget the characters that include diacritical marks.
E.g., ö Å ç
A positive attitude may not solve every problem, but it will annoy enough people to be worth the effort.
|
|
|
|
|
Is there a way to check for that without having to list every Unicode character? I didn't see any accented names in our database but that certainly doesn't mean it can't happen in the future.
I'd prefer to not include all Unicode characters. Just the ones with a high likelihood of showing up. I imagine that it could only be characters that would be accepted by Active Directory.
|
|
|
|
|
At least with the .NET Regex
http://msdn.microsoft.com/en-us/library/20bw873z(v=vs.110).aspx#CategoryOrBlock[^]
(I don't know about others)
you can specify the Unicode character category (for "Letter") so your regex would be:
^[\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Lm}\-\s']+$
possibly even just
^[\p{L}\-\s']+$
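Outside .NET the `\p{L}` shorthand is not always available. Python's built-in `re` module, for example, does not accept it. As a rough sketch of the same check without a regex, the function below tests each character's Unicode general category directly. The name `is_valid_last_name` and the allowed-punctuation set are assumptions for illustration, and Python's `str.isspace()` may not cover exactly the same characters as `\s` in .NET.

```python
import unicodedata

# Punctuation allowed besides letters and whitespace (taken from the pattern above).
EXTRA_ALLOWED = {"-", "'"}


def is_valid_last_name(name):
    """True if name is non-empty and holds only letters, hyphens, apostrophes or whitespace."""
    if not name:
        return False
    return all(
        # Categories Lu, Ll, Lt, Lm and Lo all start with "L", like \p{L}.
        unicodedata.category(ch).startswith("L")
        or ch in EXTRA_ALLOWED
        or ch.isspace()
        for ch in name
    )


for candidate in ["Jones-Smith", "O'Leary", "M\u00fcller", "john1"]:
    print(candidate, is_valid_last_name(candidate))
```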
A positive attitude may not solve every problem, but it will annoy enough people to be worth the effort.
|
|
|
|
|
After looking at that link, a person could go crazy trying to catch every possibility. Looks like regex can be very thorough!
Thanks for the help!
|
|
|
|
|
Yes!!
There's a reason the "Mastering Regular Expressions" book[^] is 496 pages!!!
A positive attitude may not solve every problem, but it will annoy enough people to be worth the effort.
|
|
|
|
|
How would you allow a period only at the end of the string, for the case where a name ends in Jr. or Sr.? A period wouldn't normally appear in any other position in a last name. I'm going with the pattern below so far. I'm double checking names in Active Directory but I'm reasonably sure you can't use diacritical characters. I need to research that to be certain.
^[a-zA-Z\-\s']+$
|
|
|
|
|
^[a-zA-Z\-\s']+\.$
Add the \. right before the $
A positive attitude may not solve every problem, but it will annoy enough people to be worth the effort.
|
|
|
|
|
That works perfect. I'm really starting to get the hang of this.
|
|
|
|
|
Check out the Expresso[^] tool (free) to explore regular expressions!
A positive attitude may not solve every problem, but it will annoy enough people to be worth the effort.
|
|
|
|
|
Right, that is actually the tool I'm using. I bumped into it a couple of years ago but this is the first time I ever used regex.
|
|
|
|
|
Well, this pattern was working yesterday on a different computer at work. I installed Expresso on my personal computer so I could work on my project over the weekend and now the pattern is not working.
^[a-zA-Z\-\s']+\.$
john1 = no matches
The pattern should match the number one because numbers are not allowed but the results are blank when I run this pattern. I could have sworn that this was working yesterday.
EDIT:
I did some further testing and discovered that the \. is breaking the pattern. If there is no period at the end, then count = 0. This pattern seems to require the period at the end, and then it works correctly. The period should be allowed 0 or 1 times at the end of the string.
So the pattern below is working the way I want it to in Expresso but not when I use it in an HTA using vbscript to do the pattern matching. Vbscript is throwing an error at the line where the pattern is executed.
^[a-zA-Z\-\s']+?\.$
Not sure how to make a pattern that works in Expresso to also work with vbscript.
SOLUTION:
^[a-zA-Z\-\s']+?\.$ This pattern works when testing in Expresso but doesn't work with vbscript although this may work when used with other languages.
^[a-zA-Z\-\s']+\.{0,1}$ This is the pattern that behaves the same way as the pattern above but also works with vbscript.
MATCHES:
Jones
Jones-Smith
Jones Smith (no hyphen)
O'Leary
Van Allen (no hyphen)
Vander Ark (no hyphen)
Jones Sr.
Although this doesn't address diacritical characters, a few conversations with colleagues resulted in the decision that the risk is very low that they will be used in Active Directory. We currently have only 3 techs making entries into AD so informing them of how this pattern works will reduce the risk even further. I have worked for my organization for 14 years and no diacritical characters have been used until now so I feel pretty safe in not testing for them. It may not be the ultimate approach such as selling a product to the public but it does meet the needs of the specifications that were given to me.
Thank you! - I'd like to give a shout out to everyone who helped me out with this project! I really appreciate all of you taking the time to steer me in the right direction! I would go as far as to say that CodeProject could be just as valuable as sitting in any classroom. You may not get a certification here but the knowledge gained is invaluable. I was able to gain a solid understanding of regex in a matter of a few hours. I watched several videos but I would say this forum helped out the most because it specifically dealt with the solution that I was attempting to resolve.
modified 12-Oct-14 10:25am.
|
|
|
|
|
robwm1 wrote: ^[a-zA-Z\-\s']+?\.$
This was so.... close.
When I suggested the \. I forgot the conditional aspect of the dot at the end. (Sorry.)
Just move the ? to be after the \.
^[a-zA-Z\-\s']+\.?$
the ? means exactly the same thing as {0,1}
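To see the difference between the variants side by side, here is a small sketch in Python (its `re` module treats these particular patterns the same way as far as I know; the labels and the helper name `matching_patterns` are just for illustration). The point is that `+?` makes the `+` lazy and leaves the dot required, while `\.?` and `\.{0,1}` make the dot itself optional.

```python
import re

CHARS = r"[a-zA-Z\-\s']"

PATTERNS = {
    "required period": re.compile("^" + CHARS + r"+\.$"),
    "lazy plus, required period": re.compile("^" + CHARS + r"+?\.$"),
    "optional period (?)": re.compile("^" + CHARS + r"+\.?$"),
    "optional period ({0,1})": re.compile("^" + CHARS + r"+\.{0,1}$"),
}


def matching_patterns(name):
    """Return the labels of the patterns that accept the given name."""
    return [label for label, pattern in PATTERNS.items() if pattern.search(name)]


for name in ["Jones", "Jones Sr.", "john1"]:
    print(name, "->", matching_patterns(name))
```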
A positive attitude may not solve every problem, but it will annoy enough people to be worth the effort.
|
|
|
|
|
I never thought to move the ? to the end. You're right though, it is the same result as {0,1}.
Thanks again!
|
|
|
|