Alternative
Tip/Trick

Don't count spaces when counting words.

25 Oct 2011, CPOL

I also use a regular expression to count words; it returns the same word count as MS Word. I wrap the regular expression in a String extension method to make it easy to use.


C#
using System.Text.RegularExpressions;

public static class StringExtensions
{
  /// <summary>
  /// Pattern matching one word: a run of characters that are not whitespace,
  /// '!', '?', '¡', '¿', a hyphen, or an en dash
  /// </summary>
  private const string WordCountRegex = @"[^\s!?¡¿\-\–]+";

  /// <summary>
  /// Shared, compiled Regex instance, built once and reused by every call
  /// </summary>
  private static readonly Regex regexWordCounts = new Regex(WordCountRegex, 
             RegexOptions.Compiled | RegexOptions.Multiline);
  
  /// <summary>
  /// Returns the number of words in a given <paramref name="sentence" />
  /// </summary>
  /// <param name="sentence">Text in which to count words</param>
  /// <returns>Number of words, or zero if the text is null or matching failed</returns>
  public static int WordCounts(this string sentence)
  {
    // Regex.Matches throws ArgumentNullException on null input; treat null as zero words
    if (sentence == null)
    {
      return 0;
    }

    try
    {
      MatchCollection matchCollection = regexWordCounts.Matches(sentence);
      return matchCollection.Count;
    }
    catch
    {
      return 0;
    }
  }
}

Taking the samples from the original tip, this gives the following:


C#
string input = 
  "The total number of words       \t        this sentence is 10.";
int wordCounts = input.WordCounts(); //Returns 9

input = "Mr O'Brien-Smith arrived at 8.30 and spent \t $1,000.99";
wordCounts = input.WordCounts(); // Returns 9 ("O'Brien-Smith" is split at the hyphen)
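
To see why the second sample comes out at 9, it can help to list the tokens the pattern actually matches. The following is only a minimal sketch and not part of the original tip: the WordCountDemo class and its Tokens helper are hypothetical names introduced here for illustration, and the pattern is copied from WordCountRegex above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordCountDemo
{
    // Same pattern as WordCountRegex in the tip: a word is any run of characters
    // that are not whitespace, '!', '?', '¡', '¿', a hyphen, or an en dash.
    private static readonly Regex WordRegex =
        new Regex(@"[^\s!?¡¿\-\–]+", RegexOptions.Compiled);

    // Hypothetical helper (an assumption, not in the original tip):
    // returns the matched tokens instead of only their count.
    public static string[] Tokens(string text)
    {
        return WordRegex.Matches(text)
                        .Cast<Match>()
                        .Select(m => m.Value)
                        .ToArray();
    }

    public static void Main()
    {
        string input = "Mr O'Brien-Smith arrived at 8.30 and spent \t $1,000.99";

        // Prints: Mr | O'Brien | Smith | arrived | at | 8.30 | and | spent | $1,000.99
        Console.WriteLine(string.Join(" | ", Tokens(input)));
    }
}
```

The hyphen is in the excluded set, so "O'Brien-Smith" yields two tokens, while the apostrophe, period, comma, and dollar sign are not excluded, so "8.30" and "$1,000.99" each count as a single word.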

Hope this helps.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
France