A Flexible String Trim Method

6 Apr 2018 | CPOL | 2 min read
A Trim method for strings that provides flexibility without requiring the use of large character arrays

Introduction

This tip presents a technique for implementing a Trim method for strings that provides added flexibility, yet should avoid the performance degradation involved in searching large character arrays.

Background

The basic Trim, TrimStart, and TrimEnd methods of .NET's String class are fine for many uses, but may become unwieldy when an application requires more flexibility.

Without parameters (or when a null or empty array is provided), these methods remove "any leading and[/or] trailing characters that produce a return value of true when they are passed to the Char.IsWhiteSpace method". That should perform pretty well.

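What has caught me out several times is that some characters I think of as whitespace, NULL in particular, are not considered whitespace by Char.IsWhiteSpace; they are only control characters. LINEFEED and CARRIAGE RETURN, by contrast, are control characters that Char.IsWhiteSpace also treats as whitespace, so the parameterless methods do remove them.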
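The classification is easy to check directly. The following is a minimal sketch (the class name WhitespaceCheck and the sample string are only for illustration) that prints how Char.IsWhiteSpace and Char.IsControl treat a few characters, and what the parameterless Trim does with a string that has NULLs and a line break at its ends.

```csharp
using System;

public static class WhitespaceCheck
{
  // Describes how Char.IsWhiteSpace and Char.IsControl classify a character.
  public static string Classify ( char c )
  {
    return String.Format
    (
      "U+{0:X4} whitespace={1} control={2}"
    ,
      (int) c
    ,
      Char.IsWhiteSpace ( c )
    ,
      Char.IsControl ( c )
    ) ;
  }

  public static void Main ()
  {
    foreach ( char c in new char[] { ' ' , '\t' , '\n' , '\r' , '\0' } )
    {
      Console.WriteLine ( Classify ( c ) ) ;
    }

    // The parameterless Trim removes the trailing line break,
    // but stops at the NULLs because they are not whitespace.
    Console.WriteLine ( "\0 text \0\r\n".Trim().Length ) ; // 8
  }
}
```

Tab, line feed, and carriage return report as both whitespace and control; NULL reports as control only, which is why it survives the parameterless Trim.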

With the provided methods, if you want to trim other characters, you have to make a character array that contains all the characters you don't want. If your needs are simple, a very small array might do, e.g., new char[] { ' ' , '\t' , '\n' , '\r' , '\0' }.
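For reference, this is roughly what the array-based approach looks like with the built-in methods. It is only a sketch; the class name BuiltInTrimDemo, the wrapper methods, and the sample string are made up for illustration.

```csharp
using System;

public static class BuiltInTrimDemo
{
  // The small array from the text: space, tab, line feed, carriage return, NULL.
  private static readonly char[] unwanted = { ' ' , '\t' , '\n' , '\r' , '\0' } ;

  // Thin wrappers around the three built-in methods.
  public static string TrimBoth  ( string s ) { return s.Trim      ( unwanted ) ; }
  public static string TrimLeft  ( string s ) { return s.TrimStart ( unwanted ) ; }
  public static string TrimRight ( string s ) { return s.TrimEnd   ( unwanted ) ; }

  public static void Main ()
  {
    string raw = "\0\0 text\t\r\n\0" ; // 11 characters

    Console.WriteLine ( "[" + TrimBoth ( raw ) + "]" ) ; // [text]
    Console.WriteLine ( TrimLeft  ( raw ).Length ) ;     // 8
    Console.WriteLine ( TrimRight ( raw ).Length ) ;     // 7
  }
}
```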

Even if the Trim method needs to search this array (as I suspect it does) for each character it examines until it finds a "good" character, this should be pretty quick. But as the array grows (for example, if you want to include all of the whitespace characters, all of the control characters, and who knows what else), a linear search should be expected to slow down. If you do use an array, it makes sense to put the more frequently encountered characters at the beginning.
An option may be to use a HashSet if there are many characters to omit.
I have not made a concerted effort to test the performance of any of these options.
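For anyone who wants to measure it, a rough timing harness could look like the sketch below. Everything in it (the class name LookupTiming, the helpers, the sizes, the round count) is an assumption made for illustration, and a Stopwatch loop like this is not a rigorous benchmark: JIT warm-up and runtime optimizations can skew the numbers, so no results are claimed here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LookupTiming
{
  // Applies the membership test to one character 'rounds' times and returns
  // the elapsed milliseconds. 'hits' is handed back so the work is observable.
  public static long Time ( Func<char,bool> isUnwanted , char probe , int rounds , out int hits )
  {
    hits = 0 ;
    Stopwatch watch = Stopwatch.StartNew() ;
    for ( int i = 0 ; i < rounds ; i++ ) if ( isUnwanted ( probe ) ) hits++ ;
    watch.Stop() ;
    return watch.ElapsedMilliseconds ;
  }

  // The first 'size' characters; a stand-in for a growing array of unwanted characters.
  public static char[] MakeArray ( int size )
  {
    char[] result = new char [ size ] ;
    for ( int i = 0 ; i < size ; i++ ) result [ i ] = (char) i ;
    return result ;
  }

  public static void Main ()
  {
    const int rounds = 1000000 ;

    foreach ( int size in new int[] { 5 , 50 , 500 } )
    {
      char[] array = MakeArray ( size ) ;
      HashSet<char> set = new HashSet<char> ( array ) ;
      char last = array [ size - 1 ] ; // worst case for a linear scan
      int hits ;

      long arrayMs = Time ( c => Array.IndexOf ( array , c ) >= 0 , last , rounds , out hits ) ;
      long setMs   = Time ( set.Contains , last , rounds , out hits ) ;

      Console.WriteLine ( "size {0}: array {1} ms, HashSet {2} ms" , size , arrayMs , setMs ) ;
    }
  }
}
```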

The Code

Personally, I also think that having separate methods for trimming the strings differently is pretty silly, so this code includes an alternative to that.

This enumeration is used to allow the caller to specify which ends of the string to trim.

C#
[System.FlagsAttribute]
[System.ComponentModel.DescriptionAttribute("Specifies the ends of a string.")]
public enum StringEnd
{ 
  [System.ComponentModel.DescriptionAttribute("Neither end.")]
  None   = 0
, 
  [System.ComponentModel.DescriptionAttribute("The end with the lower indices.")]
  Little = 1
, 
  [System.ComponentModel.DescriptionAttribute("The end with the higher indices.")]
  Big    = 2 
, 
  [System.ComponentModel.DescriptionAttribute("Both ends.")]
  Both   = 3
}

And here is the Trim method. It's pretty simple. It requires that the caller pass in a reference to a method that returns true if the provided character is to be omitted, or false otherwise.

C#
public static partial class LibExt
{
  public delegate bool Unwanted ( char C ) ;

  public static string
  Trim
  (
    this string Victim
  ,
    StringEnd   WhichEnd
  ,
    Unwanted    Unwanted
  )
  {
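    if ( Victim == null ) throw new System.ArgumentNullException ( nameof ( Victim ) ) ;
    if ( Unwanted == null ) throw new System.ArgumentNullException ( nameof ( Unwanted ) ) ;

    /* Index of the first character to keep. */
    int offset = 0 ;

    if ( ( WhichEnd & StringEnd.Little ) == StringEnd.Little )
    {
      while ( ( offset < Victim.Length ) && Unwanted ( Victim [ offset ] ) ) offset++ ;
    }

    /* Index just past the last character to keep. */
    int length = Victim.Length ;

    if ( ( WhichEnd & StringEnd.Big ) == StringEnd.Big )
    {
      while ( ( length > offset ) && Unwanted ( Victim [ length - 1 ] ) ) length-- ;
    }

    return ( Victim.Substring ( offset , length - offset ) ) ;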
  }
} 
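To see what each StringEnd value does, here is a small sketch. Condensed copies of the enumeration and the method are repeated only so that the example compiles on its own; the TrimDemo class, the sample string, and the isDash predicate are made up for illustration.

```csharp
using System;

public enum StringEnd { None = 0 , Little = 1 , Big = 2 , Both = 3 }

public static partial class LibExt
{
  public delegate bool Unwanted ( char C ) ;

  // Condensed copy of the Trim method shown above.
  public static string Trim ( this string Victim , StringEnd WhichEnd , Unwanted Unwanted )
  {
    int offset = 0 ;

    if ( ( WhichEnd & StringEnd.Little ) == StringEnd.Little )
      while ( ( offset < Victim.Length ) && Unwanted ( Victim [ offset ] ) ) offset++ ;

    int length = Victim.Length ;

    if ( ( WhichEnd & StringEnd.Big ) == StringEnd.Big )
      while ( ( length > offset ) && Unwanted ( Victim [ length - 1 ] ) ) length-- ;

    return Victim.Substring ( offset , length - offset ) ;
  }
}

public static class TrimDemo
{
  public static void Main ()
  {
    string raw = "--text--" ;
    LibExt.Unwanted isDash = c => c == '-' ; // hypothetical predicate

    foreach ( StringEnd end in new StringEnd[] { StringEnd.None , StringEnd.Little , StringEnd.Big , StringEnd.Both } )
    {
      Console.WriteLine ( end + ": [" + raw.Trim ( end , isDash ) + "]" ) ;
    }
  }
}
```

This prints None: [--text--], Little: [text--], Big: [--text], and Both: [text]. A string made up entirely of unwanted characters comes back empty when either end is trimmed.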

Using the Code

One of the classes I'm working on this week needs to trim all whitespace and control characters (primarily SPACES and NULLs) from both ends of several strings, so I chose to do that like this:

C#
result = result.Trim 
( 
  PIEBALD.Lib.LibExt.Trim.StringEnd.Both 
, 
  delegate 
  ( 
    char C
  )
  {
    return ( System.Char.IsWhiteSpace ( C ) || System.Char.IsControl ( C ) ) ;
  }
) ;

Again, I put the test for whitespace characters before the test for control characters because whitespace is likely to be encountered more often.

I could also do that with a HashSet:

C#
private static readonly System.Collections.Generic.HashSet<char> unwanted = 
  new System.Collections.Generic.HashSet<char> 
  ( new char[] { ' ' , '\t' , '\n' , '\r' , '\0' } ) ; /* This set would have all the characters, not just these */

C#
result = result.Trim 
( 
  PIEBALD.Lib.LibExt.Trim.StringEnd.Both 
, 
  delegate 
  ( 
    char C
  )
  {
    return ( unwanted.Contains ( C ) ) ;
  }
) ;

A small HashSet probably doesn't perform as well as a small array, but as the number of characters grows, it may become a good option.
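If the set really is meant to hold every whitespace and control character, one way to fill it without typing them all out is to let the framework's own tests decide. This is a sketch under that assumption; the class name UnwantedSet and its members are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class UnwantedSet
{
  // Built once: every char value that is whitespace or a control character.
  public static readonly HashSet<char> Characters = Build() ;

  private static HashSet<char> Build ()
  {
    HashSet<char> set = new HashSet<char>() ;

    // An int counter is used so the loop can terminate after char.MaxValue.
    for ( int i = char.MinValue ; i <= char.MaxValue ; i++ )
    {
      char c = (char) i ;
      if ( char.IsWhiteSpace ( c ) || char.IsControl ( c ) ) set.Add ( c ) ;
    }

    return set ;
  }

  public static void Main ()
  {
    Console.WriteLine ( Characters.Contains ( '\0' ) ) ; // True
    Console.WriteLine ( Characters.Contains ( ' ' ) ) ;  // True
    Console.WriteLine ( Characters.Contains ( 'x' ) ) ;  // False
  }
}
```

Because HashSet<char>.Contains takes a char and returns a bool, the method group UnwantedSet.Characters.Contains should be usable wherever the anonymous delegate is passed to Trim above.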

History

  • 2018-04-06: First submitted

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
United States
BSCS 1992 Wentworth Institute of Technology

Originally from the Boston (MA) area. Lived in SoCal for a while. Now in the Phoenix (AZ) area.

OpenVMS enthusiast, ISO 8601 evangelist, photographer, opinionated SOB, acknowledged pedant and contrarian
