Hi,

Is there an algorithm that converts high ASCII codes to low ASCII codes? For example, I want to convert ù to u, û to u, ò to o, Ä to A, etc.

I ask because I don't feel like hard-coding each special char to its equivalent.

grtz,
Wannes
Comments
Richard MacCutchan 23-Jul-10 10:44am    
No, there is no algorithm. If you look at the ASCII character tables you may notice that these characters occupy somewhat random locations so they do not map directly to their non-accented characters.
Toli Cuturicu 23-Jul-10 13:59pm    
Yes, there is an algorithm. Just look at my answer.

I found this using Google:
C#
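// Needs these namespaces; place the method inside a class of your own.
using System.Globalization;
using System.Text;

public static string RemoveDiacritics(string s)
{
    // FormD splits each accented letter into a base letter plus combining marks.
    string normalizedString = s.Normalize(NormalizationForm.FormD);
    StringBuilder stringBuilder = new StringBuilder();
    for (int i = 0; i < normalizedString.Length; i++)
    {
        char c = normalizedString[i];
        // Keep everything except the combining (non-spacing) marks.
        if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            stringBuilder.Append(c);
    }
    return stringBuilder.ToString();
}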
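
To see why filtering on NonSpacingMark works, it can help to look at what FormD actually produces. The following is only a small sketch: the class name and the Describe helper are made up for illustration, and the accented letters are written as \u escapes so the source file encoding does not matter.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DecompositionDemo
{
    // Hypothetical helper: lists every char of the FormD form
    // together with its Unicode category.
    public static string Describe(string s)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s.Normalize(NormalizationForm.FormD))
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.AppendFormat("U+{0:X4}:{1}", (int)c, CharUnicodeInfo.GetUnicodeCategory(c));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // U+00F9 is ù: it splits into 'u' plus a combining grave accent.
        Console.WriteLine(Describe("\u00F9")); // U+0075:LowercaseLetter U+0300:NonSpacingMark
        // U+00F8 is ø: it has no decomposition, so it stays as it is.
        Console.WriteLine(Describe("\u00F8")); // U+00F8:LowercaseLetter
    }
}
```

So the approach covers letters that Unicode defines as base letter plus accent. Letters without such a decomposition, ø or ß for example, come out unchanged and would still need separate handling.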
 
 
If you need a very generic approach (without the use of framework functions etc.), take a look at the code table you're using and get your char's numeric value, then add/subtract the appropriate number and finally get the char back from the resulting value.

In ASCII, A is 65 while a is 97. So just add/subtract 32 from the char's numeric value and you're good to go.
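
As a rough sketch of that arithmetic in C# (the class and method names are invented for this example), restricted to the plain letters A-Z and a-z where the offset of 32 really holds:

```csharp
using System;

public static class AsciiCaseDemo
{
    // Hypothetical helper: 'a' (97) - 32 = 'A' (65).
    public static char ToAsciiUpper(char c)
    {
        return (c >= 'a' && c <= 'z') ? (char)(c - 32) : c;
    }

    // Hypothetical helper: 'A' (65) + 32 = 'a' (97).
    public static char ToAsciiLower(char c)
    {
        return (c >= 'A' && c <= 'Z') ? (char)(c + 32) : c;
    }

    // 32 is the single bit 0x20, so clearing that bit gives the same result.
    public static char ToAsciiUpperMasked(char c)
    {
        return (c >= 'a' && c <= 'z') ? (char)(c & ~0x20) : c;
    }

    public static void Main()
    {
        Console.WriteLine(ToAsciiUpper('a'));       // A
        Console.WriteLine(ToAsciiLower('A'));       // a
        Console.WriteLine(ToAsciiUpperMasked('u')); // U
        // U+00F9 (ù) is outside a-z, so it is returned untouched (249).
        Console.WriteLine((int)ToAsciiUpper('\u00F9')); // 249
    }
}
```

Note that this only changes case. A fixed offset like this does not strip accents, because accented letters do not sit at a constant distance from their plain counterparts.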
 
Comments
Wannes Geysen 23-Jul-10 10:33am    
But the offset isn't the same for each char. ù = 151 and û = 150, and I want to convert both of these chars to u = 117. In the first case I have to subtract 34, and in the second case only 33. And that is the whole point of my question.
Sauro Viti 23-Jul-10 10:34am    
Note that 32 is represented in binary as 00100000, so you could convert to uppercase/lowercase using the bitwise & and | operators... Whoever originally designed the ASCII code did it in a clever way!
Toli Cuturicu 23-Jul-10 13:52pm    
Reason for my vote of 1
absurdity
michaelschmitt 13-Aug-10 7:55am    
Absurdity? You disqualified yourself. It was a valid answer, even if this is too complex for you or not the best/easiest way to do it.
I see that you want to replace accented characters with the corresponding plain ones. To do that, you have to treat each of them separately.

You can use String.ToLower() and String.ToUpper() for the case conversion and treat the accented characters separately, or you can use the Dictionary<TKey, TValue> class.
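
A minimal sketch of the Dictionary route could look like the following. The class name, the Replace method and the table contents are only illustrative, and the table is deliberately incomplete; a real one would need an entry for every character you care about.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LookupDemo
{
    // Illustrative table only. Values are strings so that one
    // character can map to several (ß -> "ss").
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { '\u00F9', "u" },  // ù
        { '\u00FB', "u" },  // û
        { '\u00F2', "o" },  // ò
        { '\u00C4', "A" },  // Ä
        { '\u00F8', "o" },  // ø
        { '\u00DF', "ss" }  // ß
    };

    // Hypothetical helper: replaces every char found in the table
    // and copies all other chars unchanged.
    public static string Replace(string s)
    {
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            string mapped;
            if (Map.TryGetValue(c, out mapped))
                sb.Append(mapped);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // "Stra\u00DFe" is "Straße"
        Console.WriteLine(Replace("Stra\u00DFe")); // Strasse
    }
}
```

The table has to be maintained by hand, but it also handles characters such as ø or ß that have no base letter plus accent form.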
 
 
Comments
Wannes Geysen 23-Jul-10 10:45am    
That is exactly what I'm afraid of. I'd rather not produce a long hard-coded list of special chars with their equivalent chars.
Toli Cuturicu 23-Jul-10 14:00pm    
Reason for my vote of 1
Not true
The .NET framework already has such a method: you can use String.ToLower() and String.ToUpper().
 
 
Comments
Wannes Geysen 23-Jul-10 10:27am    
That does not work. The result still contains the complex chars instead of the low ASCII chars.
Johnny J. 23-Jul-10 10:33am    
Reason for my vote of 1
Misunderstanding...
Wannes Geysen 23-Jul-10 10:35am    
Reason for my vote of 1
does not work
Toli Cuturicu 25-Jul-10 9:12am    
Reason for my vote of 1
not true

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


