|
Java and Perl provide the following:
\b
That represents a word 'boundary'; however, you should read up on it to ensure that it is what you really want.
Also keep in mind that the Java and Perl engines backtrack: when one way of matching fails, they go back and try the alternatives, and they only give up once every possibility has been exhausted. That can result in a lot of processing, in bad cases running for days or effectively never finishing, although your current formats should not do that.
Anchoring the pattern to something reduces the amount of searching.
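As a small illustration of what the boundary does, here is a sketch in Python (used only as a stand-in; for this case its re module treats \b the same way as Java and Perl). The sample text and the variable names are made up for the example.

```python
import re

text = "ref 123456789012, tfn 123456782"

# Unanchored: matches the first nine digits of the longer number as well.
loose = re.findall(r"\d{8,9}", text)

# With \b on both sides the digits must form a whole "word".
strict = re.findall(r"\b\d{8,9}\b", text)

print(loose)   # ['123456789', '123456782']
print(strict)  # ['123456782']
```

Whether a word boundary is the right anchor still depends on the surrounding content, for example a number glued to letters would also be rejected.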
|
|
|
|
|
Member 13555386 wrote: TFN includes a check digit for detecting erroneous numbers. ... Can it be done?
No.
Although if I were using Perl, one can create a "regex" that calls a method as part of the regular expression check itself.
But I still wouldn't suggest doing that.
There is no real standard for "regular expressions", so first you would need to define exactly which regular expression engine you are using.
If you are using Perl or Java then there is a boundary match, which might or might not be appropriate for your actual content.
Member 13555386 wrote: (\d{8,9})|(\d\d\d[ ]\d\d\d[ ]\d\d\d)|(\d\d\d[-]\d\d\d[-]\d\d\d)
The following matches either a run of 8-9 digits, or nine digits in groups of three separated by spaces or dashes. Note that, unlike your original, it also accepts mixed separators such as 123 456-789.
(\d{8,9})|(\d\d\d[- ]\d\d\d[- ]\d\d\d)
You could make one of those digits optional by adding a '?' after it. I do not know which one that should be.
Other suggestions really require knowing which regex engine you are using (rather than going through every possibility).
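Since the check digit cannot sensibly be verified inside the pattern, one option is to let the regex find candidates and do the arithmetic in ordinary code. The sketch below uses Python as a stand-in. The function names are hypothetical, and the weights are an assumption: they are the commonly published ones for 9-digit TFNs (weighted sum divisible by 11) and should be checked against the official specification before being relied on. 8-digit numbers are deliberately not handled.

```python
import re

# Format check only: 8-9 digits, or 3-3-3 digits split by spaces or dashes.
# \b keeps the pattern from matching inside a longer run of digits.
TFN_FORMAT = re.compile(r"\b(?:\d{8,9}|\d{3}[- ]\d{3}[- ]\d{3})\b")

# Assumption: commonly published weights for 9-digit TFNs.
WEIGHTS_9 = (1, 4, 3, 7, 5, 8, 6, 9, 10)


def find_candidates(text):
    """Return every substring that has the shape of a TFN."""
    return TFN_FORMAT.findall(text)


def has_valid_check_digit(candidate):
    """Check a 9-digit candidate: the weighted digit sum must be divisible by 11."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if len(digits) != 9:
        return False  # 8-digit numbers use different weights, not covered here
    return sum(d * w for d, w in zip(digits, WEIGHTS_9)) % 11 == 0


sample = "TFN 123 456 782 or 123456783, not 1234567890123"
for candidate in find_candidates(sample):
    print(candidate, has_valid_check_digit(candidate))
```

The regex stays simple and only answers "does this look like a TFN", while the part that needs arithmetic lives where arithmetic is easy.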
|
|
|
|
|
See my response to George Jonsson above
|
|
|
|
|
Hi all,
I've been fighting with this for hours now, but can't get it to work. I want to extract some information from an ini file, which basically looks like this:
[CameraDefinition.1]
Title=Linke Seite
Guid={0ae3f864-da10-4e5a-977c-b9bba47d6f7a}
Description=Ansicht nach links
Origin=Center
It's a standard Windows text file; the sections are separated by two new lines (\r\n\r\n). My regex currently looks like this:
"\[CameraDefinition\.(?<camnumber>\d+)\][.|\s]*Guid=(?<guid>\{[0-9A-F\-]*\})"
While the first (CamNumber) and the last part (Guid) return correct results as 'partial match', the critical part seems to be the expression for "everything between the top and the Guid" ([.|\s]* above), which might span several lines.
I'd be happy if one of you could help me solve this... Thank you in advance!
Regards
Mick
|
|
|
|
|
Probably easier just to read it line by line and look for the keyword you are interested in.
|
|
|
|
|
Hi Richard, thank you for your response. It's exactly what I did in the meantime. Still, I hope someone knows an answer, so that this seemingly simple task won't annoy me again.
|
|
|
|
|
It's not clear exactly what you are trying to extract. But either way, my suggestion is much easier.
|
|
|
|
|
The format of entries in an ini file is Key=Value. That is, find the position of the first = sign; the substring on its left is the key and the substring on its right is the value.
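The line-by-line approach could look roughly like this in Python (a stand-in language; the function name and the returned dictionary are choices made for this sketch, not anything from the original posts). It remembers which [CameraDefinition.N] section it is in and splits each entry at the first = sign.

```python
def read_camera_guids(text):
    """Map camera number -> Guid value by walking the file line by line."""
    guids = {}
    current = None  # camera number of the section we are in, if any
    prefix = "CameraDefinition."
    for line in text.splitlines():  # handles \r\n as well as \n
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            current = name[len(prefix):] if name.startswith(prefix) else None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")  # split at the first '='
            if key.strip() == "Guid":
                guids[current] = value.strip()
    return guids
```

Because each line is handled on its own, it does not matter how many other entries sit between the section header and the Guid line.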
|
|
|
|
|
Hello and thank you both!
In order to solve the regex issue, I successfully tried to read the first three lines of each block:
\[CameraDefinition\.(?<CamNumber>\d+)\]\r\nTitle\=(?<Title>.*)\r\nGuid\=(?<Guid>\{[0-9A-Fa-f\-]*\})
Considering the results, this works well, is fast and gives me the title as an additional field. In case I notice any errors, I'll go back to reading line by line and also consider your hint regarding general ini-file structure.
Thank you again, and have a nice day!
EDIT: Changing the critical middle part to the following helped...
\[CameraDefinition\.(?<CamNumber>\d+)\]\r\n(.*\r\n)Guid\=(?<Guid>\{[0-9A-Fa-f\-]*\})\r\n{1,}
This way I have the correct matches as well. But there is still no capture if there are more lines between the top line and the GUID definition.
modified 8-May-17 6:45am.
|
|
|
|
|
Sonhospa wrote: no capture if there are more lines between ...
Exactly why a regex is not the best way to do this.
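For anyone who still wants a regex here, two observations may help. In the first attempt, [.|\s]* is a character class, and inside a class the dot and the pipe are literal characters, so it can only skip dots, pipes and whitespace, never a line such as Title=.... In the later attempt, (.*\r\n) skips exactly one line. A possible way around both is to repeat a "one ordinary line" group lazily, as in the sketch below. It is written for Python, where named groups are spelled (?P<name>...) rather than (?<name>...); the function name and the sample data are made up.

```python
import re

SECTION = re.compile(
    r"\[CameraDefinition\.(?P<CamNumber>\d+)\]\r\n"
    r"(?:(?!\[)[^\r\n]+\r\n)*?"  # lazily skip non-empty lines that do not start a section
    r"Guid=(?P<Guid>\{[0-9A-Fa-f\-]*\})"
)


def camera_guids(text):
    """Return (camera number, guid) pairs for every section that has a Guid."""
    return [(m["CamNumber"], m["Guid"]) for m in SECTION.finditer(text)]


sample = (
    "[CameraDefinition.1]\r\nTitle=Linke Seite\r\n"
    "Guid={0ae3f864-da10-4e5a-977c-b9bba47d6f7a}\r\nOrigin=Center\r\n\r\n"
    "[CameraDefinition.2]\r\nTitle=Rechts\r\nDescription=x\r\nGuid={1234abcd}\r\n"
)
print(camera_guids(sample))
```

The skipped lines must be non-empty, so the blank line that separates sections stops the match from running into the next section when a Guid is missing.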
|
|
|
|
|
I'm using an application called Panorama X (see provue.com) which has a function, RegExReplace, which, in a block of text, replaces every occurrence of a string defined by a regular expression with a different string. For example,
RegExReplace(TextBlock,"^[;]","//")
will search a block of text called TextBlock for all lines beginning with a semi-colon and will replace the semi-colon with two slashes.
In some lines, the ";" is preceded by a number of spaces but I still want to change it. The only way I've been able to do this is to copy the characters before the ";" and do this:
RegExReplace(TextBlock,"^[ ]+[;]",<preceding spaces> +"//")
which is a little clumsy.
Is there a better way to do it? I'm very new to RegEx.
michael
|
|
|
|
|
Look in the documentation of your regex engine for "capture groups". You can use a capture group to grab any leading spaces, then reference it in your replacement string. That bit will probably be called "backreference".
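To show the idea, here is a sketch in Python rather than Panorama X. How the replacement string refers to a captured group differs between engines (\1, $1 and similar), so the exact spelling for RegExReplace is something to look up in its documentation; only the principle is shown here, and the sample text is invented.

```python
import re

text = "; top-level comment\n    ; indented comment\nvalue = 1 ; trailing\n"

# ^( *) captures the leading spaces (possibly none); \1 puts them back
# in front of the two slashes. MULTILINE makes ^ match at every line start.
result = re.sub(r"^( *);", r"\1//", text, flags=re.MULTILINE)

print(result)
```

A semicolon that is not at the start of a line (after optional spaces) is left alone, because the pattern is anchored with ^.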
If you are doing anything serious with regex, I can heartily recommend Expresso.
Cheers,
Peter
Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012
|
|
|
|
|
I'll check out capture groups - thank you.
And thanks for the thought on Expresso but I'm on a Mac.
michael
|
|
|
|
|
Hello,
Using Visual Studio 2015, the regex on the first line below finds the Dim statement on the second line:
Dim L_V_Scalar_(Integer)_Item As \1
Dim L_V_Scalar_Integer_Item As Integer
Is there a way to combine a back reference with a ^ "not" operator to find violations of my naming convention such as the following?
Dim L_V_Scalar_Integer_Item As String
I tried the following regular expressions, but they do not work:
Dim L_V_Scalar_(Integer)_Item As ^\1
Dim L_V_Scalar_(Integer)_Item As [^\1]
Many thanks.
Keith
|
|
|
|
|
There is most likely a more elegant way to do this, but this expression should work.
[EDIT] It is the second group that should be back-referenced, hence \2
Dim\s+(?<variable_name>\w_\w_\w+_(?<expected_type>\w+)_\w+)\s+As\s+(?:(?<matched_type>\2)|(?<mismatched_type>\w+))
The line
Dim L_V_Scalar_Integer_Item As Integer
will give the result
expected_type   matched_type   mismatched_type
Integer         Integer
The line
Dim L_V_Scalar_Integer_Item As String
will give the result
expected_type   matched_type   mismatched_type
Integer                        String
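Example in C#:
using System;
using System.Text.RegularExpressions;

// Verbatim string (@"...") so the backslashes reach the regex engine unchanged.
Match m = Regex.Match("Dim L_V_Scalar_Integer_Item As String",
    @"Dim\s+(?<variable_name>\w_\w_\w+_(?<expected_type>\w+)_\w+)\s+As\s+(?:(?<matched_type>\2)|(?<mismatched_type>\w+))");
if (m.Success)
{
    // matched_type is empty when the declared type differs from the one in the name.
    if (String.IsNullOrEmpty(m.Groups["matched_type"].Value))
    {
        string errorText = String.Format(@"The type '{0}' is not allowed for variable name '{1}'.
The correct type is '{2}'",
            m.Groups["mismatched_type"].Value,
            m.Groups["variable_name"].Value,
            m.Groups["expected_type"].Value);
        Console.WriteLine(errorText);
    }
}
else
{
    // The line is not a Dim statement in the expected naming format.
}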
modified 18-Apr-16 1:30am.
|
|
|
|
|
Hello George,
That's very helpful.
Is there a way to get a result running this interactively in Visual Studio rather than via code? I am basically trying to find non-conformances and fix them manually, one by one.
Many thanks.
Keith
|
|
|
|
|
I suppose you could use the Code Analysis part of Visual Studio and add your own rule set.
However, whether you can edit the rules depends on the version of VS you have.
Quote: In Visual Studio Ultimate, Visual Studio Premium, and Visual Studio Professional, you can create and modify a custom rule set to meet specific project needs associated with code analysis.
See Using Rule Sets to Group Code Analysis Rules.
Another way could be to create your own Custom Tool that runs through your code and shows you the violations in a dialog box or something.
This involves quite a bit of work and some understanding of COM programming.
See Implementing Single-File Generators for more information.
This CodeProject article might also be helpful: Writing a Single-File Generator.
|
|
|
|
|
Thanks for the ideas George. I have managed to achieve the desired result in the Visual Studio Find And Replace dialog, with the following regular expression:
_Scalar_+(Boolean|Byte|Char|Date|Decimal|Double|Integer|Long|Object|SByte|Short|Single|String|UInteger|ULong|User\-Defined|UShort)_[A-Za-z0-9]+ As (?!\1)\w+
However, I am still trying to fix the problem that I describe at the following link:
http://forums.devshed.com/regex-programming-147/nested-pattern-matching-visual-studio-2015-a-973751.html
KR,
Keith
|
|
|
|
|
I think you got the answer already at that site.
The expression, as given in the answer,
Dim L_V_Scalar_([^_\s]+)_([^\s]+) As (?>(\w+\.)*)(?!\1)\w+
should work for you.
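The core of that expression is the negative lookahead (?!\1): "the declared type must not start with whatever group 1 captured from the name". A reduced sketch in Python (a stand-in for the Visual Studio dialog) is shown below. It leaves out the (?>(\w+\.)*) part that skips namespace-qualified prefixes, and the list of sample lines is invented for the example.

```python
import re

# Group 1 is the type embedded in the variable name; (?!\1) rejects a
# declared type that begins with the same text.
VIOLATION = re.compile(r"Dim L_V_Scalar_([^_\s]+)_([^\s]+) As (?!\1)\w+")

lines = [
    "Dim L_V_Scalar_Integer_Item As Integer",
    "Dim L_V_Scalar_Integer_Item As String",
    "Dim L_V_Scalar_String_Name As String",
]
violations = [line for line in lines if VIOLATION.search(line)]

print(violations)  # ['Dim L_V_Scalar_Integer_Item As String']
```

One limitation: because the lookahead only compares the beginning of the declared type, a declaration such as As IntegerList would not be reported for a name containing Integer.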
|
|
|
|
|
Thank you George. Yes it worked.
|
|
|
|
|
I thought it would be simpler, but I wasn't able to come up with a regular expression for the following.
I want to extract the words which are not in a particular pattern.
Say, in the sentence below:
clicking the (Run Match) button (or F5) to see what (happens).
I want to extract all the words which are not inside brackets (). So the output will be:
clicking
the
button
to
see
what
Below is the expression which I defined. It is not working. Can anyone point out the mistake in the expression?
(?!\(\w+\))
modified 1-Mar-15 0:33am.
|
|
|
|
|
You haven't mentioned which language you're using. Different languages have different implementations of Regular Expressions, which support different features.
The following should work in .NET:
(?<!\()\b\w+\b(?!\))
This uses zero-width negative assertions to ensure that the word isn't immediately preceded by an opening parenthesis or immediately followed by a closing parenthesis.
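The same pattern can be tried out in Python, which supports these assertions as well. The sketch below also shows a limitation: the assertions only look at the characters right next to each word, so in a bracketed group of three or more words the middle ones slip through. The second function, which removes the bracketed groups first, is an alternative added for comparison; both function names are made up.

```python
import re

sentence = "clicking the (Run Match) button (or F5) to see what (happens)."

LOOKAROUND = re.compile(r"(?<!\()\b\w+\b(?!\))")


def words_by_lookaround(text):
    """Words not directly preceded by '(' and not directly followed by ')'."""
    return LOOKAROUND.findall(text)


def words_outside_parens(text):
    """Drop every (...) group first, then collect the remaining words."""
    stripped = re.sub(r"\([^)]*\)", " ", text)
    return re.findall(r"\w+", stripped)


print(words_by_lookaround(sentence))
print(words_by_lookaround("a (b c d) e"))   # ['a', 'c', 'e'] - 'c' leaks through
print(words_outside_parens("a (b c d) e"))  # ['a', 'e']
```

For the sample sentence both functions give the requested six words, since none of its bracketed groups has more than two words.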
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
How about if we just drop all the words in (parens) and display what's left?
#!/usr/bin/perl
use strict;
use warnings;

my ($ins, $outs);

$ins = "clicking the (Run Match) button (or F5) to see what (happens). ";
print "\n";
$outs = $ins;
$outs =~ s/\(.+?\)//g;    # Drop every parenthesised group (non-greedy).
$outs = undupespace($outs);
print "$outs\n";

exit; # Exit main pgm.
###################################################################
sub undupespace
# Remove dupe spaces. Max 1 consecutive space.
{
    my ($l) = @_;

    $l =~ s/ {2,}/ /g;
    return $l;
} # undupespace

Output:
clicking the button to see what .
|
|
|
|
|
Hi,
Now I need a pattern to detect last name possibilities. I think this pattern will be slightly more complicated. Names that I see in the database are like:
Jones
Jones-Smith
Jones Smith (no hyphen)
O'Leary
Van Allen (no hyphen)
Vander Ark (no hyphen)
I think that this pattern will work but I would like public opinion to make sure I am getting this right:
^[a-zA-Z\-\s']+$
Can you think of any last names where this will not work? In testing it seems to work out alright.
Thanks,
Rob
|
|
|
|
|
Don't forget the characters that include diacritical marks.
E.g., ö Å ç
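One way to allow such letters is a Unicode-aware character class instead of a-zA-Z. The sketch below uses Python, where [^\W\d_] means "a word character that is neither a digit nor an underscore", which amounts to a letter in any script. The pattern is also somewhat stricter than the original: separators are only allowed between runs of letters. The function name and the extra test names are made up, and other engines spell this differently (for example \p{L} in .NET, Java and Perl).

```python
import re

# The original pattern: ASCII letters, hyphen, whitespace and apostrophe only.
ASCII_ONLY = re.compile(r"[a-zA-Z\-\s']+")

# Runs of letters (any script) joined by a single hyphen, space or apostrophe.
UNICODE_NAME = re.compile(r"[^\W\d_]+(?:[-\s'][^\W\d_]+)*")


def is_last_name(name, pattern=UNICODE_NAME):
    """True if the whole string matches the given name pattern."""
    return pattern.fullmatch(name) is not None


for name in ["Jones", "Jones-Smith", "O'Leary", "Van Allen", "Müller", "Åberg"]:
    print(name, is_last_name(name, ASCII_ONLY), is_last_name(name))
```

Names typed with a typographic apostrophe or stored as base letter plus combining mark would still need extra handling, so this is a starting point rather than a complete answer.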
A positive attitude may not solve every problem, but it will annoy enough people to be worth the effort.
|
|
|
|