How do I check for duplicates in a string array in C#? I also want to show a message if a duplicate is present.
Updated 18-Nov-20 6:15am

You can use the Distinct method. Something like:


C#
// Requires: using System.Linq;
string[] myStrings = { /* ... */ };

if (myStrings.Distinct().Count() != myStrings.Length)
{
    // At least one value occurs more than once: show the message here.
}
 
 
Comments
Ian A Davidson 15-Apr-13 4:03am    
Good answer. +5.
Himanshu Yadav 20-Sep-13 22:08pm    
But how do I convert a string like "we are not a employee but we come in employee table" into an array without using the Split() method? Please help me out.
Shmuel Zang 15-Apr-13 4:27am    
Thank you Ian.
farrukh azad 3-Sep-22 8:35am    
Nice, it works for me.
There are several options. One method is:
  • First, sort the array.
  • Then, for each item: if it is equal to the next one, you have a duplicate (and you can show the message).
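The two steps above could be sketched roughly as follows. This is only an illustration: the class name SortedDuplicateCheck and the method name HasDuplicates are made up for the example, and it assumes an ordinal (case-sensitive) comparison is what you want. It sorts a copy so the caller's array keeps its order.

```csharp
using System;

public static class SortedDuplicateCheck
{
    // Hypothetical helper: returns true if any two elements are equal (ordinal comparison).
    public static bool HasDuplicates(string[] items)
    {
        // Sort a copy so the original array is left untouched.
        string[] sorted = (string[])items.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);

        // After sorting, equal strings sit next to each other.
        for (int i = 0; i < sorted.Length - 1; i++)
        {
            if (string.Equals(sorted[i], sorted[i + 1], StringComparison.Ordinal))
            {
                return true;
            }
        }
        return false;
    }

    public static void Main()
    {
        string[] sample = { "First", "Second", "Third", "First" };
        if (HasDuplicates(sample))
        {
            Console.WriteLine("The array contains a duplicate.");
        }
    }
}
```

Because of the sort, this approach costs O(n log n) comparisons, but it needs no extra collection beyond the copied array.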
 
 
You can use Distinct(), as Shmuel mentioned.
If you also want to find the duplicated items, you can use LINQ:
C#
string[] array = { "First", "Second", "Third", "First", "Third" };
bool hasDuplicates = CheckforDuplicates(array);

C#
// Requires: using System.Linq;
public bool CheckforDuplicates(string[] array)
{
    // Group equal strings and keep the key of every group with more than one member.
    var duplicates = array
        .GroupBy(p => p)
        .Where(g => g.Count() > 1)
        .Select(g => g.Key);

    return duplicates.Any();
}

Here the variable duplicates contains the repeated items, and the method returns true if there is at least one.
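If the goal is to show the repeated items in the message rather than just a true/false result, one possible variation returns the list itself. The names DuplicateFinder and GetDuplicates are hypothetical and only serve the illustration; the query is the same GroupBy approach used in this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateFinder
{
    // Hypothetical helper: returns each value that occurs more than once, listed once.
    public static List<string> GetDuplicates(string[] array)
    {
        return array
            .GroupBy(p => p)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        string[] array = { "First", "Second", "Third", "First", "Third" };
        List<string> duplicates = GetDuplicates(array);
        if (duplicates.Count > 0)
        {
            // Show the message, including which items are repeated.
            Console.WriteLine("Duplicates found: " + string.Join(", ", duplicates));
        }
    }
}
```

For the sample array this reports First and Third, in the order in which they first appear.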
 
 
Comments
Korathu 2 15-Apr-13 4:27am    
None worked ...
Naz_Firdouse 15-Apr-13 5:13am    
What happened?
If the variable is true, it means there is a duplicate value.
What is the issue here?
Similar to Naz's Solution:
C#
// Requires: using System.Collections.Generic; using System.Linq;
private static int CountDuplicates(string[] array)
{
  HashSet<string> stringSet = new HashSet<string>();
  int numDups = 0;
  foreach (var item in array)
  {
    if (stringSet.Contains(item))
    {
      ++numDups;
    }
    else
    {
      stringSet.Add(item);
    }
  }
  return numDups;
}

// this returns an IEnumerable with both the strings and the repeat counts (might be useful for reporting...)
private static IEnumerable<KeyValuePair<string, int>> FindDuplicates(string[] array)
{
  Dictionary<string, int> stringSet = new Dictionary<string, int>();
  foreach (var item in array)
  {
    int count;
    if (stringSet.TryGetValue(item, out count))
    {
      stringSet[item] = count + 1;
    }
    else
    {
      stringSet[item] = 1;
    }
  }
  return stringSet.Where(p => p.Value > 1);
}

Both methods run in O(n) time in the length of the array, because HashSet and Dictionary lookups are O(1) on average.
 
 
Comments
aarnone2 3-Mar-18 9:15am    
Actually this is order (N log N)

The stringSet.Contains operation is of log N complexity
Matt T Heffron 3-Mar-18 12:52pm    
Nope. HashSet<T>.Contains() is O(1).
https://msdn.microsoft.com/en-us/library/bb356440(v=vs.110).aspx
C#
// Requires: using System.Collections.Generic;
// Checks for repeated characters within a single string.
static bool hasDuplicates(string s)
{
    // Fast path: a counting table covers characters with values 0..255.
    int[] countOfThisCharacter = new int[256];
    bool outsideTable = false;

    for (int i = 0; i < s.Length; i++)
    {
        if (s[i] > 255) { outsideTable = true; break; }
        if (countOfThisCharacter[s[i]] > 0) { return true; }
        countOfThisCharacter[s[i]]++;
    }

    if (outsideTable)
    {
        // Fallback: start over with a hash set, which handles any char value.
        // Add returns false when the character was already seen.
        HashSet<char> seen = new HashSet<char>();
        for (int i = 0; i < s.Length; i++)
        {
            if (!seen.Add(s[i])) { return true; }
        }
    }

    return false;
}

// This is O(N) for strings whose characters all have values in the range 0-255.
// The fallback for other strings is also O(N) on average, since hash lookups are O(1).
 
 
Comments
Dave Kreskowiak 3-Mar-18 9:53am    
Asked and ANSWERED OVER FOUR YEARS AGO.
aarnone2 3-Mar-18 10:15am    
Yeah but this answer is better. It's the only O(N) way to do this.
Dave Kreskowiak 3-Mar-18 12:48pm    
Yeah, two problems.

First, you're checking for duplicated characters within a string, NOT if there are duplicate strings within an array of strings. I'll grant that the original question leaves a bit of room for interpretation as to which situation he/she was talking about.

Second, answering a really old question pops it back up to the top of the QA queue. That's something we don't like because it results in more answers to really old questions, keeping it popping up to the top of the queue.
Matt T Heffron 3-Mar-18 13:19pm    
Since the OP says "string array", there doesn't seem to be any ambiguity about the problem statement. So, as you said, this solution isn't for this problem.
Dave Kreskowiak 3-Mar-18 13:44pm    
Considering the answers given cover both cases, there DOES seem to be an ambiguity. Most answers cover the string[] case though, not the string.chars[] case.
While using LINQ is the best answer, here's an alternative:

C#
string text = "1,2,3,4,4,5";
string[] parts = text.Split(',');
HashSet<string> hset = new HashSet<string>();
foreach(string item in parts)
{
    hset.Add(item);
}
if (hset.Count < parts.Length)
{
   // there's a duplicate string
}


A HashSet collection does not allow duplicates, but at the same time, it will not throw an exception when you try to add a duplicate.
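HashSet<T>.Add also reports what happened: it returns false when the item was already in the set. A minimal sketch that uses the return value to stop at the first duplicate, instead of comparing counts afterwards, might look like this (the class and method names are made up for the example):

```csharp
using System;
using System.Collections.Generic;

public static class HashSetAddCheck
{
    // Hypothetical helper: returns true as soon as Add reports an item that is already present.
    public static bool HasDuplicates(IEnumerable<string> items)
    {
        var seen = new HashSet<string>();
        foreach (string item in items)
        {
            if (!seen.Add(item))
            {
                return true; // Add returned false, so this item was seen before.
            }
        }
        return false;
    }

    public static void Main()
    {
        string[] parts = "1,2,3,4,4,5".Split(',');
        if (HasDuplicates(parts))
        {
            Console.WriteLine("There's a duplicate string.");
        }
    }
}
```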

If you want to know WHICH string(s) is/are a duplicate, you could change the code to be something like this:
C#
string text = "1,2,3,4,4,5";
string[] parts = text.Split(',');
HashSet<string> hset = new HashSet<string>();
List<string> duplicates = new List<string>();
foreach(string item in parts)
{
    int oldCount = hset.Count;
    hset.Add(item);
    if (hset.Count == oldCount)
    {
        if (!duplicates.Contains(item))
        {
            duplicates.Add(item);
        }
    }
}
if (duplicates.Count > 0)
{
   // list the duplicates 
}


Keep in mind that the code above is case-sensitive, so "Text" and "text" will be seen as different values. If that matters, and if it won't have side effects elsewhere in your code, you can call ToUpper or ToLower on the original text variable before processing it for duplicates.
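Another option, which leaves the original text untouched, is to give the HashSet a case-insensitive comparer. The sketch below assumes an ordinal, culture-independent comparison is acceptable; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveCheck
{
    // Hypothetical helper: treats "Text" and "text" as the same value.
    public static bool HasDuplicatesIgnoreCase(IEnumerable<string> items)
    {
        // The comparer passed to the constructor decides when two strings count as equal.
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string item in items)
        {
            if (!seen.Add(item))
            {
                return true;
            }
        }
        return false;
    }

    public static void Main()
    {
        string[] parts = "Text,other,text".Split(',');
        if (HasDuplicatesIgnoreCase(parts))
        {
            Console.WriteLine("There's a duplicate string (ignoring case).");
        }
    }
}
```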

Lastly, punctuation in a sentence can hide duplicate words, so you'd probably want to remove punctuation characters from the original string, or strip them from each item before adding it to the hset or duplicates collection (in the example above).
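As a rough sketch of the second approach, the code below strips punctuation from each word before it goes into the collections. It assumes words are separated by spaces and that char.IsPunctuation matches what you consider punctuation; StripPunctuation and FindDuplicateWords are names invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordDuplicates
{
    // Hypothetical helper: removes punctuation characters from a single word.
    public static string StripPunctuation(string word)
    {
        return new string(word.Where(c => !char.IsPunctuation(c)).ToArray());
    }

    // Hypothetical helper: returns each repeated word once, in order of detection.
    public static List<string> FindDuplicateWords(string sentence)
    {
        var seen = new HashSet<string>();
        var duplicates = new List<string>();
        string[] rawWords = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string raw in rawWords)
        {
            string word = StripPunctuation(raw);
            if (word.Length == 0)
            {
                continue; // The token was punctuation only.
            }
            if (!seen.Add(word) && !duplicates.Contains(word))
            {
                duplicates.Add(word);
            }
        }
        return duplicates;
    }

    public static void Main()
    {
        List<string> duplicates = FindDuplicateWords("stop, then stop. go");
        if (duplicates.Count > 0)
        {
            Console.WriteLine("Duplicate words: " + string.Join(", ", duplicates));
        }
    }
}
```

Without the stripping step, "stop," and "stop." would be stored as two different strings and the repeat would go unnoticed.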
 
 
Comments
Richard MacCutchan 3-Mar-18 11:44am    
Uh oh John, check the date of the question.
#realJSOP 3-Mar-18 11:48am    
I don't care. :)
Richard MacCutchan 3-Mar-18 11:48am    
Rebel.
#realJSOP 3-Mar-18 11:53am    
To show that I'm willing to work within the system, I posted a suggs/bugs message:

https://www.codeproject.com/Messages/5496460/Q-A-Suggestion.aspx

:)
#realJSOP 3-Mar-18 11:56am    
I think I'm gonna go back to the first question ever posted in Q/A and answer it.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


