Performance considerations for strings in C#

8 May 2005
The effects of string handling on performance.

Introduction

How you handle strings in your code can have surprising effects on performance. In this article, I shall look at two common issues that arise when working with strings: the creation of temporary strings and string concatenation.

Background

There comes a time in every project when you have to start looking at coding standards. Using FxCop is a good place to start. My favourite set of FxCop rules is the 'Performance' set.

So there I was, checking my project against FxCop and seeing lots of issues with strings. I must admit something: I have always had problems with C#'s immutable strings. When I see myString.ToUpper(), I tend to forget that it does not change the contents of myString; because strings are immutable in C#, it returns an entirely new string instead.
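
As a small illustration of that point, the sketch below calls ToUpper() and returns both the original and the result. The class and method names (ImmutabilityDemo, UpperCaseDemo) are made up for this example and are not part of the article's test code.

```csharp
using System;

static class ImmutabilityDemo
{
    // Hypothetical helper: returns the original string and the result of
    // ToUpper() side by side so the two can be compared.
    public static string[] UpperCaseDemo(string myString)
    {
        string upper = myString.ToUpper();        // ToUpper() returns a string holding the upper-cased text
        return new string[] { myString, upper };  // myString itself is unchanged
    }
}
```

Calling ImmutabilityDemo.UpperCaseDemo("hello") gives back "hello" and "HELLO": the original text is untouched, and the upper-cased text is a separate string.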

I proceeded to fix my code to remove FxCop's warnings and then I noticed something - my code was faster. I decided to investigate and ended up writing the test code that I present here.

Using the code

The test code is very simple: a console application calls four test methods. Each method runs a string-processing routine 1000 times, so that the execution time is long enough for performance differences to show up.
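
The article's measurements come from a profiler, and the original harness is not reproduced here. If you would rather time a routine directly, a minimal sketch might look like the following. It assumes .NET 3.5 or later (for Stopwatch and the non-generic Action delegate), and the names TimingSketch and TimeRepeated are hypothetical.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Hypothetical helper: runs 'routine' the given number of times and
    // returns the total elapsed time in milliseconds.
    public static long TimeRepeated(Action routine, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            routine();
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

A call such as TimingSketch.TimeRepeated(() => GoodCompare("Hello", "HELLO"), 1000) would time one of the routines shown later. Wall-clock timings like this are noisy, so treat them as a rough guide rather than a substitute for a profiler.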

The four test methods fall into two pairs. The first pair compares two ways of performing a case-insensitive string comparison; the second pair compares two ways of concatenating strings in a loop.

String Comparison and Temporary String Creation

The first test routine is a bad case-insensitive string comparison. The routine for the comparison is:

static bool BadCompare(string stringA, string stringB) 
{
    return (stringA.ToUpper() == stringB.ToUpper());
}

For this code, FxCop shows the following advice:

"StringCompareTest.BadCompare(String, String):Boolean calls 
   String.op_Equality(String, String):Boolean after converting 'stack1', a local, 
   to upper or lowercase. If possible, eliminate the string creation and call the 
   overload of String.Compare that performs a case-insensitive comparison."

What this means is that each call to ToUpper() creates a temporary string, which has to be allocated and later cleaned up by the garbage collector. This takes extra time and uses more memory. The String.Compare method is more efficient because it compares the two strings without creating these copies.

The second test routine uses String.Compare:

static bool GoodCompare(string stringA, string stringB)
{
    return (string.Compare(stringA, stringB, true, 
         System.Globalization.CultureInfo.CurrentCulture) == 0);
}

This method prevents the creation of unnecessary temporary strings.

According to nprof, the Good Comparison takes 1.69% of the total execution time of the code, while the Bad Comparison takes 5.50% of the total execution time.

So the String.Compare method is over three times as fast as the ToUpper method. If you have code that is performing a lot of string comparisons (especially in a loop) then using String.Compare can make a big difference.
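
On .NET 2.0 and later, another way to express the same idea is the String.Equals overload that takes a StringComparison value. The sketch below was not part of the measurements above, and the class and method names are invented for illustration. Note that the ordinal variant ignores culture rules, so it is not an exact drop-in for a culture-sensitive comparison.

```csharp
using System;

static class CompareSketch
{
    // Case-insensitive equality using the current culture's rules,
    // without creating upper-cased copies of either string.
    public static bool EqualsIgnoreCaseCulture(string a, string b)
    {
        return string.Equals(a, b, StringComparison.CurrentCultureIgnoreCase);
    }

    // Ordinal variant: ignores culture-specific rules. Often a reasonable
    // choice for identifiers, keys and similar non-linguistic text.
    public static bool EqualsIgnoreCaseOrdinal(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```

Both methods accept null arguments without throwing, which the ToUpper() version does not.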

String Concatenation inside a loop

The final pair of test routines considers string concatenation within a loop.

The 'bad' test routine is as follows:

static string BadConcatenate(string[] items)
{
    string strRet = string.Empty;

    foreach(string item in items)
    {
        strRet += item;
    }

    return strRet;
}

When FxCop sees this code, it is so outraged that it even marks the broken rule in red! FxCop says the following:

"Change StringCompareTest.BadConcatenate(String[]):String to use StringBuilder 
  instead of String.Concat or +="

The 'good' test routine was written as follows:

static string GoodConcatenate(string[] items)
{
    System.Text.StringBuilder builder = new System.Text.StringBuilder();

    foreach(string item in items)
    {
        builder.Append(item);
    }

    return builder.ToString();
}

This is an almost archetypal example given for the use of the System.Text.StringBuilder class. The issue with the bad example is the creation of more temporary strings. Because strings are immutable, the concatenation operator (+=) actually creates a new string out of the two originals and then points the variable at the new string, leaving the old one for the garbage collector.
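
The effect of += on an immutable string can be seen with a second variable that refers to the original string. This is only an illustrative sketch; ConcatDemo and AliasKeepsOldValue are hypothetical names.

```csharp
using System;

static class ConcatDemo
{
    // Returns true if 'alias' still holds the old text after += is applied
    // to 'text', showing that += built a new string rather than modifying
    // the existing one.
    public static bool AliasKeepsOldValue()
    {
        string text = "abc";
        string alias = text;   // both variables refer to the same string object
        text += "def";         // a new string is created; 'text' now refers to it
        return alias == "abc"
            && text == "abcdef"
            && !object.ReferenceEquals(text, alias);
    }
}
```

In a loop, every pass through += repeats this: a new, longer string is allocated and the previous one becomes garbage.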

However, when we look at performance, according to nprof, we find that the 'Bad' concatenation takes 5.67% of the total execution time, while the 'Good' concatenation takes 22.09%. I'll run that by you again:

Using StringBuilder took almost four times longer than simple string concatenation!

Why?

The answer is partly in the design of the test: the concatenation routines only concatenate ten short strings. The StringBuilder class is more complex than a simple immutable string, so creating one StringBuilder costs more than doing ten simple string concatenations.
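
When the strings are already in an array, as they are in these test routines, one further option is String.Concat, which accepts the whole array in a single call. This was not measured in the article, so the sketch below (with the hypothetical names ConcatAllSketch and ConcatAll) is offered only as something to include in your own tests.

```csharp
using System;

static class ConcatAllSketch
{
    // Concatenates every element of 'items' in one call, with no explicit
    // loop, no += and no StringBuilder in the calling code.
    // Null elements are treated as empty strings by String.Concat.
    public static string ConcatAll(string[] items)
    {
        return string.Concat(items);
    }
}
```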

I repeated the test with differing numbers of string concatenations, and found the following results:

Figure: Chart of concatenation method effect on relative performance.

Note: The values shown here are the % of the total execution time taken by the test routines. The 'Good Concatenation' test is not actually getting faster, but takes less relative time than the 'Bad Concatenation' routine.

So, it would seem that the StringBuilder class is only really faster if you are concatenating more than about 600 strings.
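
The crossover point will depend on your runtime, hardware and string lengths, so it is worth measuring for yourself. The following is a rough sketch of such a measurement, not the code used to produce the chart. It assumes .NET 2.0 or later for Stopwatch, and the class name, method names and the fixed "item" test string are all assumptions made for this example.

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class CrossoverSketch
{
    // Same shape as the article's BadConcatenate routine.
    static string ConcatWithPlus(string[] items)
    {
        string result = string.Empty;
        foreach (string item in items) { result += item; }
        return result;
    }

    // Same shape as the article's GoodConcatenate routine.
    static string ConcatWithBuilder(string[] items)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string item in items) { builder.Append(item); }
        return builder.ToString();
    }

    // Times both routines for one item count.
    // Returns { ticks for +=, ticks for StringBuilder }.
    public static long[] Measure(int itemCount, int repetitions)
    {
        string[] items = new string[itemCount];
        for (int i = 0; i < itemCount; i++) { items[i] = "item"; }

        Stopwatch watch = Stopwatch.StartNew();
        for (int r = 0; r < repetitions; r++) { ConcatWithPlus(items); }
        long plusTicks = watch.ElapsedTicks;

        watch = Stopwatch.StartNew();
        for (int r = 0; r < repetitions; r++) { ConcatWithBuilder(items); }
        long builderTicks = watch.ElapsedTicks;

        return new long[] { plusTicks, builderTicks };
    }
}
```

Calling CrossoverSketch.Measure for a range of item counts (for example 10, 100, 600 and 1000) and comparing the two values for each count gives a rough picture of where StringBuilder starts to win on your machine.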

Of course, the other reason for the use of the StringBuilder class is memory allocation. Using the CLRProfiler produced the following memory use timeline for concatenation of 100 simple strings:

Figure: Memory usage timeline.

The section marked 'A' shows the effect of the bad string concatenation routine on memory allocation and de-allocation. The maximum allocated memory is increasing rapidly and there is a high number of garbage collections occurring (roughly 215 collections for this section).

The section immediately following the 'A' section shows the memory profile for the good string concatenation routine. The maximum allocated memory is increasing less rapidly and there are far fewer garbage collections being made (roughly 60 collections for this section).

So using the StringBuilder class may not be faster in some cases, but it is kinder to the garbage collector.
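
If you do not have a memory profiler to hand, GC.CollectionCount (available from .NET 2.0) gives a crude way to compare how many collections two routines trigger. The sketch below is only an approximation of what CLRProfiler reports, and the names GcCountSketch and Gen0CollectionsDuring are hypothetical.

```csharp
using System;

static class GcCountSketch
{
    // Returns the number of generation 0 garbage collections that occurred
    // while 'work' was running. Other threads can also trigger collections,
    // so treat the result as a rough indication only.
    public static int Gen0CollectionsDuring(Action work)
    {
        int before = GC.CollectionCount(0);
        work();
        return GC.CollectionCount(0) - before;
    }
}
```

Passing each concatenation routine to this helper, with the same input and the same number of repetitions, lets you compare the two counts side by side.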

Conclusions

Use the String.Compare method for case-insensitive string comparison. In my tests it was over three times as fast as comparing the results of ToUpper(). Nice and simple.

Use the StringBuilder class for speed increases only if you are concatenating more than about 600 strings within a loop. The caveat is that the length of the strings you are manipulating may also affect the speed tradeoff, as may the effects on the garbage collector, so you should really run your own tests on your specific code.

Points of Interest

I was surprised at what a difference using the correct string manipulation methods made to code in the real world (although we do perform a lot of string comparisons and concatenations in my current project).

FxCop's performance rules are a good starting point for finding potentially slow code, and they can direct you to some easy fixes. Both of the issues discussed here are marked by FxCop as 'non-breaking', which means that the changes should not break any code that depends on the code being changed. This should be a no-brainer: a non-breaking change that improves performance should always be made.

History

  • April 2005 - First draft of the article.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.