
Counting lines in a string

28 Feb 2012, CPOL
Great analysis!

I found out that Regex can be accelerated by a factor of about two.

Instead of
new Regex(@"\n", RegexOptions.Compiled|RegexOptions.Multiline);

you can speed it up by using:
new Regex(@"^.*?$", RegexOptions.Compiled|RegexOptions.Multiline);

But admittedly, nothing beats the native methods (IndexOf).

My statement above is wrong: I actually compared "$" (and not "\n") against "^.*?".
The measurements show that "\n" is the fastest of all the Regex patterns, while "$" is the slowest (5 times slower than "\n"!).
That's a real surprise to me.
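
The kind of measurement described above can be sketched roughly as follows. This is only an illustration, not the benchmark behind the numbers in this post: the class name, the method name, and the test data are all assumptions. Note that the patterns do not necessarily return the same count, since "$" and "^.*?$" can also match at the very end of the text, so compare the counts before comparing the timings.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexLineCountSketch
{
    // Hypothetical helper: counts all matches of a pattern in the text.
    public static int CountMatches(string pattern, string text)
    {
        var regex = new Regex(pattern, RegexOptions.Compiled | RegexOptions.Multiline);
        // Reading Count forces the lazy MatchCollection to find every match.
        return regex.Matches(text).Count;
    }

    public static void Main()
    {
        // Assumed test data: 100,000 short lines, each terminated by '\n'.
        var builder = new StringBuilder();
        for (int i = 0; i < 100000; i++)
            builder.Append("some line of text\n");
        string text = builder.ToString();

        foreach (string pattern in new[] { @"\n", @"$", @"^.*?$" })
        {
            var watch = Stopwatch.StartNew();
            int count = CountMatches(pattern, text);
            watch.Stop();
            Console.WriteLine("{0,-6} matches: {1}  time: {2} ms",
                pattern, count, watch.ElapsedMilliseconds);
        }
    }
}
```

The timing here includes constructing and compiling the Regex; for a fairer comparison, you may want to create the Regex objects once, outside the measured section.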

The comparison:

Regex | Match [ms] for 2,500,000 lines | RegexOptions

For comparison: IndexOf('\n') takes only 237 ms.
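
For reference, an IndexOf-based count might look like the sketch below. The implementation that was actually measured is not shown here, so the class and method names are assumptions and the details may differ.

```csharp
using System;

public static class IndexOfLineCountSketch
{
    // Hypothetical helper: counts '\n' characters by jumping from one
    // occurrence to the next with String.IndexOf(char, int).
    public static int CountNewlines(string text)
    {
        int count = 0;
        int position = 0;
        while ((position = text.IndexOf('\n', position)) != -1)
        {
            count++;
            position++; // continue the search just after the newline found
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountNewlines("first\nsecond\nthird\n")); // prints 3
    }
}
```

This counts newline characters, not lines: a text whose last line has no trailing '\n' has one more line than the value returned.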



This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By
Founder eXternSoft GmbH
Switzerland
I feel comfortable on a variety of systems (UNIX, Windows, cross-compiled embedded systems, etc.) in a variety of languages, environments, and tools.
I have a particular affinity for computer language analysis and testing, as well as quality management.

More information about what I do for a living can be found at my LinkedIn Profile and on my company's web page (German only).

Comments and Discussions

Re: So Regex("^.*?$") is not faster than Regex("\n"), as you ori...
Ronald M. Martin, 29-Feb-12 8:06
Re: Ah, I see your initial question. "*" is greedy match (match ...
Andreas Gieriet, 28-Feb-12 17:48
Re: Let me rephrase my question. Assuming that your syntax (@"^....
Ronald M. Martin, 28-Feb-12 4:50
I don't understand the use of the question mark (?) in this ...
Ronald M. Martin, 27-Feb-12 18:27
Re: I simple measured a difference of a factor of about two. No ...
Andreas Gieriet, 27-Feb-12 22:21
I simply measured a difference of a factor of about two.
No idea why.
Evidently the Regex behaves differently when you search for *all* occurrences of a single character (in this case '\n') than when you search for all lines (^.*?$).

As I said, I have no clue why; I found it only by chance.
Can anyone please explain why LinesCount2 is so slow? I thou...
millerize, 26-Feb-12 7:54
