|
I am using a software integration platform to connect to a Google Calendar and read appointment events. Each event has a short description and a phone number in the Summary, and I need a regular expression to extract those phone numbers.
The problem is that each person entering an event formats the phone number differently. Some phone numbers have spaces or a country prefix, and in some cases there are other (irrelevant) numbers in the Summary as well. I don’t mind keeping the country prefix if they enter it, but I do need to detect which number is the phone number.
So, here’s some information on the numbers:
- If the user entered a prefix there will be a 357 or 00357 or +357 in the summary.
- After the prefix (if it exists) each phone number will start with 94 or 95 or 96 or 97 or 99.
- Then 6 more numbers will follow.
- There could also be spaces between the numbers
Eventually a correct phone number will be
99123456 or 35799123456 or +35799123456
99 can also be 94 or 95 or 96 or 97
I don’t know if this is too complicated or if it’s even possible to create a regex to correctly extract the phone numbers. Any help is appreciated.
Thanks
|
|
|
|
|
It's a good few years since I wrote RegExps and, even when I did, I could never remember whether ^ was start and $ was end or if it was the other way round.
My totally untested guess (assuming spaces are only allowed between components and/or at the start or end) is
^\s*((\+|00)?357\s*)?9[4-79]\s*\d{6}\s*$
or allowing spaces between any numbers (and / or at the start or end) is
^\s*((\+|(0\s*){2})?3\s*5\s*7\s*)?9\s*[4-79]\s*(\d\s*){6}$
Note: I'm allowing all whitespace. To permit only spaces and exclude other whitespace, change each \s to a single space character.
I look forward to seeing what someone who actually knows RegExps comes up with to see if I am even remotely close.
Edit: I've just re-read the question - you want to extract the text, not just validate it. What I have written above just validates. A simple way to extract would be to enclose everything between ^ and $ in parentheses to get a group, extract the group, and then drop all spaces, e.g. in JavaScript
theGroupThatYouHaveFound.replace(/\s/g, ''); . An even simpler way would be to use the validation RegExp and then remove spaces from the original string (no messing around with groups).
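Both patterns above are anchored with ^ and $, so they can only match when the Summary holds nothing but the phone number. As a rough sketch in Python (the integration platform's regex dialect is not stated in the question, and the names PHONE and extract_phones are made up for this example), the anchors can be swapped for digit lookarounds so that a number is found inside longer text and longer digit runs are ignored:

```python
import re

# Optional +357 / 00357 / 357 prefix, then 94/95/96/97/99 and six more digits.
# Whitespace is tolerated between any two digits. The lookarounds reject
# candidates that are glued to other digits (for example a 10-digit reference).
PHONE = re.compile(
    r"(?<!\d)"                               # not preceded by a digit
    r"(?:(?:\+|0\s*0\s*)?3\s*5\s*7\s*)?"     # optional country prefix
    r"9\s*[4-79]"                            # 94, 95, 96, 97 or 99
    r"(?:\s*\d){6}"                          # six more digits
    r"(?!\d)"                                # not followed by a digit
)

def extract_phones(summary):
    """Return every phone number found in summary, with whitespace removed."""
    return [re.sub(r"\s+", "", m.group()) for m in PHONE.finditer(summary)]

print(extract_phones("Maria haircut, tel 99 123456, room 12"))
print(extract_phones("call +357 96 654 321 at 10"))
```

Whether lookbehind is available depends on the regex engine the platform uses, so this may need adjusting there.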
modified 7-Apr-22 12:58pm.
|
|
|
|
|
Dear friends,
I am new to this forum. I need your expert advice on deriving a regular expression to achieve the following. I am using EditPlus's search and replace. I could use Notepad++ as well, if required.
I have some files containing several thousand lines of content. I want to match the lines in the content except the ones that start with a number and a period (1. or 2. .......... 5000.) or with one of the first 4 letters of the alphabet and a period (a. or b. or c. or d.). Any help is greatly appreciated. Thanks in advance.
Diamond Dallas
|
|
|
|
|
Old post but... I think this should work (for anyone who has a similar need). The number part is written as [1-9][0-9]* so that numbers containing a zero (10., 200., 5000.) are skipped as well; change it to [0-9]+ if you also want to avoid lines that begin with "0.". Also note that "aa." (or similar) will be matched/returned, since the request was to specifically avoid lines that start with a SINGLE letter (a, b, c, d) followed by a period. But this is easy to adjust as necessary if multiple letters followed by a period should also be skipped/avoided.
^(?![1-9][0-9]*\.|[a-d]\.).+$
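A quick way to try a negative-lookahead pattern like this outside the editor is a few lines of Python. This is only a sketch: the sample text is invented, and the number part is written as [1-9][0-9]* so that labels containing a zero, such as "10.", are excluded too.

```python
import re

# Match whole lines that do NOT start with "<number>." or with a./b./c./d.
KEEP = re.compile(r"^(?![1-9][0-9]*\.|[a-d]\.).+$", re.MULTILINE)

sample = "\n".join([
    "1. first numbered line",
    "plain line",
    "10. tenth numbered line",
    "a. lettered line",
    "aa. two letters, so it is kept",
    "5000. last numbered line",
    "e. not one of a-d, so it is kept",
])

kept = KEEP.findall(sample)
for line in kept:
    print(line)
```

EditPlus and Notepad++ use their own regex engines, so lookahead support should be checked there before relying on the pattern.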
|
|
|
|
|
Hello, I have a problem with a regex. My data has the format (12-12)(15.15)(H) Message, and this format repeats. I am reading the data from a file and would like to know whether there is a way to read up to the point where the (12-12) pattern repeats. My regex works when a record is followed by another one, as in (12-12)(15.15)(H) Message (13-13)(14.14)(L) Messages, but it does not work when there is just one record. My messages also contain new lines.
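The asker's own pattern is not shown, so the following is only a guess at the cause: a pattern that ends a message at "the next header" has nothing to stop on when there is only one record. A sketch in Python (RECORD and parse_records are names invented here, and the header layout is inferred from the single example) that accepts either the next header or the end of the input as the stopping point:

```python
import re

RECORD = re.compile(
    r"\((\d+-\d+)\)\((\d+\.\d+)\)\(([A-Z])\)"  # header such as (12-12)(15.15)(H)
    r"\s*(.*?)"                                 # message, as short as possible
    r"(?=\(\d+-\d+\)\(|\Z)",                    # stop before next header or at end
    re.DOTALL,                                  # let . match line breaks too
)

def parse_records(text):
    """Return (first, second, flag, message) tuples for each record in text."""
    return [(a, b, c, msg.strip()) for a, b, c, msg in RECORD.findall(text)]

print(parse_records("(12-12)(15.15)(H) Only one"))
print(parse_records("(12-12)(15.15)(H) Message\nline two (13-13)(14.14)(L) Messages"))
```

A message that itself contains text shaped like a header would be split early, so the lookahead may need tightening for real data.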
|
|
|
|
|
Hi, I was trying to make a regex that matches only if x is present a power of 2 number of times (n=2). Put another way, the regex should match if the length of the string is a power of 2, where x is the only character in the string.
So for example,
xx (2), xxxx (4), xxxxxxxx (8) should match but xxx (3), xxxxxx (6) should not match.
Besides this, I also have a question about the regex I was trying to make:
^((xx)*){2}$
It is meant to match xx and xxxx but not xxxxxx, yet it matches the latter too. Even with the debugger, I cannot understand why the last 2 x of the 6 x match the inner group when we have {2} outside.
|
|
|
|
|
A regular expression is the wrong tool for this. Unless this is a homework assignment or coding challenge, there will be far simpler ways to test whether the length of a string is a power of 2.
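As an illustration of one of those simpler ways, here is a sketch in Python (the function name is made up, and it assumes the smallest accepted length is 2, as in the question's examples). It also shows why ^((xx)*){2}$ accepts six x: {2} repeats the sub-pattern, not the text it matched the first time, so the first pass can take all six x and the second pass can match nothing, because * allows zero repetitions.

```python
import re

def is_power_of_two_run(s):
    """True if s consists only of 'x' and its length is 2, 4, 8, 16, ..."""
    n = len(s)
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    return n >= 2 and set(s) == {"x"} and (n & (n - 1)) == 0

# The pattern from the question: the two passes need not match the same text.
LOOSE = re.compile(r"^((xx)*){2}$")

# A backreference forces the second half to equal the first half, but that
# accepts lengths 4, 8, 12, ... which is still not "powers of two".
DOUBLED = re.compile(r"^((xx)+)\1$")

for n in (2, 3, 4, 6, 8):
    print(n, is_power_of_two_run("x" * n), bool(LOOSE.match("x" * n)))
```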
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
I know I know. Yes, it is a challenge!
|
|
|
|
|
Hello!
Can someone give me some advice on how to build a regex for the following conditions: each word must contain exactly one capital letter and no digits.
Examples that should pass: Apple aPple giFt
Examples that should be rejected: street agentORANGE sequence1122
|
|
|
|
|
Basically, don't use a regex: they are a powerful tool, but they are text matching tools, not syntactical analysers.
Instead, use your presentation language to do that for you.
For example, in C# it is simple:
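using System.Linq;

// True only if every space-separated word has exactly one capital and no digits.
private bool HasOneCapital(string s)
{
    string[] words = s.Split(' ');
    foreach (string word in words)
    {
        if (word.Count(c => char.IsUpper(c)) != 1) return false;
        if (word.Any(char.IsDigit)) return false;
    }
    return true;
}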
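For readers who do need the rule as a pattern, here is a sketch in Python. It assumes plain ASCII letters, and WORD and has_one_capital are names chosen for this example. The idea is that a valid word is any number of lowercase letters, exactly one capital, then any number of lowercase letters; since digits appear nowhere in the pattern, they are rejected automatically.

```python
import re

# lowercase*, one capital, lowercase* : nothing else is allowed in a word.
WORD = re.compile(r"[a-z]*[A-Z][a-z]*")

def has_one_capital(text):
    """True if every whitespace-separated word matches WORD in full."""
    return all(WORD.fullmatch(word) for word in text.split())

print(has_one_capital("Apple aPple giFt"))
print(has_one_capital("street agentORANGE sequence1122"))
```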
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
"Common sense is so rare these days, it should be classified as a super power" - Random T-shirt
AntiTwitter: @DalekDave is now a follower!
modified 6-Mar-22 11:38am.
|
|
|
|
|
I understand you. But it's a test assignment for a position at Yandex. I answered 5 of the 6 questions easily (I had some difficulty with SQL, though). The RegEx one is hard for me because I have never used them -- I have only seen them in Linux.
|
|
|
|
|
If your position would use regexes, then go away and learn them: if you just get a solution from the internet and then get the interview, you will not be able to explain "your" solution when they ask about it and you will fail at that point as it's obvious you cheated ...
|
|
|
|
|
Should it not be:
private bool HasOneCapital(string s)
{
    string[] words = s.Split(' ');
    foreach (string word in words)
    {
        if (word.Count(c => char.IsUpper(c)) != 1) return false;
    }
    return true;
}
(Closing parenthesis after the "IsUpper(c)"?)
The difficult we do right away...
...the impossible takes slightly longer.
|
|
|
|
|
Yes, it should ...
Fixed.
|
|
|
|
|
Hello,
I have not worked with regex for a long time, and I'm sure I forgot something simple.
In the list below
12334p4p
2345stk
5643stk43
1234db43
1234a1
1234a
I want to capture the first 4 items because the logic to NOT capture is:
Start with 1 to 8 digits
Should have 1 to 3 letters AND 2 or more digit again
I do not want to get the last 2 entries because they finish with a single letter OR a single letter and a single digit. In other words, if an entry finishes with 1 letter only OR with 1 letter and 1 digit, I do not want to return it.
Thank you guys, I'm sure this one is easy !
modified 22-Feb-22 10:08am.
|
|
|
|
|
Did you just order a regex?
Bastard Programmer from Hell
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
That's about as basic as regex can get.
Your spec:
. start of string
. 1-8 digits
. 1-3 alpha
. 2-n digits
. end of string
Translate to regex notation (depending on your regex engine, details may vary slightly)
^[0-9]{1,8}[A-Za-z]{1,3}[0-9]{2,}$
Wrap that in your delimiter of choice, and go...
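To see what the pattern does with the sample list from the question, it can be run against each entry. A small Python sketch (the variable names are arbitrary):

```python
import re

PATTERN = re.compile(r"^[0-9]{1,8}[A-Za-z]{1,3}[0-9]{2,}$")

samples = ["12334p4p", "2345stk", "5643stk43", "1234db43", "1234a1", "1234a"]

# Keep only the entries the pattern accepts.
matched = [s for s in samples if PATTERN.match(s)]
print(matched)  # ['5643stk43', '1234db43']
```

Only the two entries that end in at least two digits are accepted; 12334p4p and 2345stk do not fit the spec as written.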
Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012
modified 22-Feb-22 5:03am.
|
|
|
|
|
Should {1-8} actually be {1,8} ?
|
|
|
|
|
Finger velocity outruns brain function. Yet again...
Fixed, thanks.
|
|
|
|
|
I have never been much good at regex, but the detail in your answer makes it easy to understand. 
|
|
|
|
|
Thanks. Like many things in coding, writing out your spec unambiguously is most of the battle. Translating it into a programming language or regex or whatever is comparatively routine.
|
|
|
|
|
Not that simple, your solution is the first thing I tried, it does not capture the 2 first entries.
Your brain went too fast 😀
|
|
|
|
|
... and I notice you have changed the question. All bets are off until you can describe exactly what you are looking for.
|
|
|
|
|
Hello, and hoping this isn't too difficult, as I know once we get the regex, these things work beautifully. I'm not even a novice, I can never figure these complex ones out. But here's the deal.
I have a capture program that captures with timestamp and a filename, which is great. Here is the format they come out as, as an example:
2022.01.30.Sun, 06h36m03s- filename 1.png
2022.01.30.Sun, 06h36m10s- filename 2.png
2022.01.30.Sun, 06h36m16s- filename 3.png
2022.01.30.Sun, 06h36m22s- filename 4.png
The change that is needed is for the timestamp portion above, to be changed from this:
06h36m03s
06h36m10s
to this, if under 9 files:
06h36.1
06h36.2
or to this if over 9 files:
06h36.01
06h36.02
06h36.03
06h36.04
06h36.05
06h36.06
06h36.07
06h36.08
06h36.09
06h36.10
06h36.11
So removing the seconds and just putting in a ".1" or ".01", etc., as needed.
I used to just go to my renamer's forum but it's been dead quite a while now. Hoping someone here can kindly help out.
Thank you in advance!
|
|
|
|
|
A regular expression to match the seconds is trivial: m\d+s .
The problem is that a regular expression works on a single string. There is no way for a pure regex solution to group the files by the start of the name and then apply a sequential number, because the regex only sees one filename at a time, and doesn't maintain any state between invocations.
If you want to do this in code, it should be fairly easy, but you'd need to specify which language you were using.
If you want to use a renamer tool, you'll need to consult the documentation to see if it can do this sort of thing, and if so, how.
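Since no language was specified, here is a sketch of the code route in Python. It makes several assumptions that the question does not confirm: names always contain HHhMMmSSs, the counter restarts for each distinct date-and-minute prefix, and the number is zero-padded to the width of the largest counter in that group. STAMP and renumber are names invented for this example, and the function only builds a mapping of old to new names without touching any files.

```python
import re
from collections import defaultdict

# prefix = everything up to and including the minutes, e.g. "2022.01.30.Sun, 06h36"
STAMP = re.compile(r"^(?P<prefix>.*?\d{2}h\d{2})m\d{2}s(?P<rest>.*)$")

def renumber(names):
    """Map each old name to a new name with the seconds replaced by a counter."""
    groups = defaultdict(list)
    # Zero-padded seconds mean a plain sort is chronological within a minute.
    for name in sorted(names):
        m = STAMP.match(name)
        if m:  # names without a timestamp are left out of the mapping
            groups[m.group("prefix")].append((name, m.group("rest")))

    mapping = {}
    for prefix, members in groups.items():
        width = len(str(len(members)))  # 1 digit up to 9 files, 2 up to 99, ...
        for i, (old, rest) in enumerate(members, start=1):
            mapping[old] = f"{prefix}.{i:0{width}d}{rest}"
    return mapping

names = [
    "2022.01.30.Sun, 06h36m03s- filename 1.png",
    "2022.01.30.Sun, 06h36m10s- filename 2.png",
]
for old, new in renumber(names).items():
    print(old, "->", new)
```

The mapping could then be fed to os.rename, ideally after reviewing it, since a rename is not easily undone.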
|
|
|
|