
Counting lines in a string

28 Feb 2012, CPOL
Great analysis!

I found that the Regex approach can be sped up by a factor of about two.

Instead of

    new Regex(@"\n", RegexOptions.Compiled | RegexOptions.Multiline);

you can use:

    new Regex(@"^.*?$", RegexOptions.Compiled | RegexOptions.Multiline);

But admittedly, nothing beats the native methods (IndexOf).
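
For readers who have not seen the IndexOf approach, here is a minimal sketch of what such a counter might look like. The class and method names (LineCounter, CountNewlines) are hypothetical and not taken from the original article; the sketch only assumes that "counting lines" means counting '\n' characters.

```csharp
using System;

public static class LineCounter
{
    // Hypothetical helper: counts '\n' characters by calling
    // String.IndexOf(char, int) repeatedly from the last hit onward.
    public static int CountNewlines(string text)
    {
        int count = 0;
        int pos = 0;
        while ((pos = text.IndexOf('\n', pos)) >= 0)
        {
            count++;
            pos++; // resume the search just past the newline that was found
        }
        return count;
    }
}
```

Note that a last line without a trailing '\n' is not counted by this sketch; whether to add one in that case depends on how a "line" is defined.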

[EDIT]
My statement above is wrong: I had actually compared "$" (and not "\n") against "^.*?$".
The measurements show that "\n" is the fastest of all the Regex patterns tested, while "$" is the slowest (roughly 5 times slower than "\n"!).
That is a real surprise to me.

The comparison:

Regex    Match time [ms] for 2,500,000 lines    RegexOptions
\n        1847                                  Compiled|Singleline
\n        1851                                  Compiled|Multiline
^.*$      2282                                  Compiled|Multiline
^.*?$     5327                                  Compiled|Multiline
$        10100                                  Compiled|Multiline


For comparison, IndexOf('\n') takes only 237 ms, roughly 8 times faster than the fastest Regex variant.

[/EDIT]
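
A comparison like the one in the edit above could be set up roughly as follows. This is a sketch under stated assumptions, not the code used for the published numbers: the names RegexLineBenchmark, BuildText, TimeMatches, and Run are hypothetical, the test text is synthetic, and timings will vary with machine and runtime version.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexLineBenchmark
{
    // Builds a synthetic text of `lines` short lines, each ending in '\n'.
    public static string BuildText(int lines)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < lines; i++)
            sb.Append("line ").Append(i).Append('\n');
        return sb.ToString();
    }

    // Times how long it takes to enumerate all matches of `pattern`.
    // Reading MatchCollection.Count forces every match to be evaluated.
    public static long TimeMatches(string pattern, RegexOptions options,
                                   string text, out int matchCount)
    {
        var regex = new Regex(pattern, options);
        var watch = Stopwatch.StartNew();
        matchCount = regex.Matches(text).Count;
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    // Runs the Multiline patterns from the table over 2,500,000 lines.
    public static void Run()
    {
        string text = BuildText(2500000);
        var options = RegexOptions.Compiled | RegexOptions.Multiline;
        foreach (string pattern in new[] { @"\n", @"^.*$", @"^.*?$", @"$" })
        {
            int count;
            long ms = TimeMatches(pattern, options, text, out count);
            Console.WriteLine("{0,-6} {1,8} ms  {2} matches", pattern, ms, count);
        }
    }
}
```

Calling RegexLineBenchmark.Run() prints one line per pattern. The match counts are printed alongside the timings because the patterns do not necessarily report the same number of matches for the same text, which is worth checking before comparing speeds.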

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Founder eXternSoft GmbH
Switzerland
I feel comfortable on a variety of systems (UNIX, Windows, cross-compiled embedded systems, etc.) in a variety of languages, environments, and tools.
I have a particular affinity for computer language analysis, testing, and quality management.

More information about what I do for a living can be found at my LinkedIn Profile and on my company's web page (German only).

Comments and Discussions

 
Re: So Regex("^.*?$") is not faster than Regex("\n"), as you ori...
Ronald M. Martin, 29-Feb-12 7:06
Re: Ah, I see your initial question. "*" is greedy match (match ...
Andreas Gieriet, 28-Feb-12 16:48
Re: Let me rephrase my question. Assuming that your syntax (@"^....
Ronald M. Martin, 28-Feb-12 3:50
I don't understand the use of the question mark (?) in this ...
Ronald M. Martin, 27-Feb-12 17:27
I don't understand the use of the question mark (?) in this speedup. The question mark is supposed to represent 0 or 1 occurrences of the previous element. In this case, I take the previous element to be the dot (.), representing any character, quantified by the star (*), representing 0 or more occurrences (of any character). I don't know what adding another quantifier (the question mark) means in this context or if it's even allowed. If it is allowed and actually helps, please explain what it does, since I'm clearly missing something.
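
As background to this question: in .NET regular expressions, a ? placed directly after another quantifier (as in *?) does not mean "0 or 1"; it makes that quantifier lazy, so it matches as few characters as possible instead of as many as possible. The small sketch below illustrates the difference; the class name LazyQuantifierDemo and the sample input are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class LazyQuantifierDemo
{
    // Returns the text of the first match of `pattern` in `input`.
    public static string FirstMatch(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }
}

// Greedy: .* grabs as much as possible, then backtracks to the last '>'.
//   LazyQuantifierDemo.FirstMatch("<.*>", "<a><b>")   returns "<a><b>"
// Lazy: .*? grabs as little as possible and stops at the first '>'.
//   LazyQuantifierDemo.FirstMatch("<.*?>", "<a><b>")  returns "<a>"
```

For the patterns discussed in this tip, "^.*$" and "^.*?$" with RegexOptions.Multiline match the same text, because the dot cannot cross a '\n' and the match is anchored at both ends of the line. The two forms differ only in how the engine gets there, which is consistent with the different timings measured above.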
Re: I simply measured a difference of a factor of about two. No ...
Andreas Gieriet, 27-Feb-12 21:21
Can anyone please explain why LinesCount2 is so slow? I thou...
millerize, 26-Feb-12 6:54
