Converting chars to ASCII in C#

Question

Hi guys,

I'm converting an old VB application into C#, and part of the system requires me to convert certain special characters to their ASCII equivalent.

In VB, the code is:

VB

sValue = Asc("œ")  'which gives 156

sValue = Asc("°")  'which gives 176

sValue = Asc("£")  'which gives 163

These are the correct values according to http://www.ascii-code.com/.

But when doing the same conversion in C#, the first of these values gives a strange answer.

Here is the code:

C#

As ints:

int i1 = (int)Convert.ToChar("œ");    // which gives 339

int i2 = (int)Convert.ToChar("°");    // which gives 176

int i3 = (int)Convert.ToChar("£");    // which gives 163


As bytes:

byte i1 = (byte)Convert.ToChar("œ");    // which gives 83

byte i2 = (byte)Convert.ToChar("°");    // which gives 176

byte i3 = (byte)Convert.ToChar("£");    // which gives 163

What gives?! :( I suspect it has something to do with the sign bit, but I can't see what.

Many thanks

Posted 27-Dec-12 5:35am

Nick Fisher (Consultant)

Updated 24-Sep-21 5:08am

Comments

Sergey Alexandrovich Kryukov 27-Dec-12 22:23pm

Who told you it should be ASCII? ASCII won't work for you...
—SA

5 solutions

Solution 2

Use the GetBytes method of the Encoding.ASCII encoding to convert the characters to ASCII.
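
A minimal sketch of what that call returns for the characters in the question. The class and method names (AsciiEncodingDemo, ToAsciiBytes) are made up for illustration. It assumes the stock Encoding.ASCII object, whose default encoder fallback replaces anything outside the 7-bit range with '?' (byte 63), so it does not reproduce the values that VB's Asc returned.

```csharp
using System;
using System.Text;

public static class AsciiEncodingDemo
{
    // Hypothetical helper: encodes text with the built-in 7-bit ASCII encoding.
    public static byte[] ToAsciiBytes(string text)
    {
        return Encoding.ASCII.GetBytes(text);
    }

    public static void Main()
    {
        // "A" is inside the 7-bit range, so it keeps its code (65).
        Console.WriteLine(ToAsciiBytes("A")[0]);

        // œ (U+0153), ° (U+00B0) and £ (U+00A3) are outside the 7-bit range;
        // the default replacement fallback turns each of them into '?' (63).
        foreach (string s in new[] { "\u0153", "\u00B0", "\u00A3" })
        {
            Console.WriteLine(ToAsciiBytes(s)[0]);
        }
    }
}
```

The characters are written as \u escapes only so that the example does not depend on the encoding of the source file.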

Best regards
Espen Harlinn

Posted 27-Dec-12 5:50am

Espen Harlinn

Comments

Andreas Gieriet 27-Dec-12 21:24pm

Hello Espen,
this would remove diacritics by mapping the windows code page 1252 characters to 7-bit ASCII instead of converting to unicode encoding. See also Solution #3 and #4.
Cheers
Andi

Espen Harlinn 28-Dec-12 7:39am

OP asked for ASCII, repeatedly ...

And as you wrote in your answer - you're doing a conversion to code page 1252, which is what OP actually needed, but it wasn't what he asked for.

Andreas Gieriet 28-Dec-12 8:06am

Hello Espen,
I focused more on his example code and felt that asking for ASCII is wrong...
It's interesting though, that converting to ASCII results in removing diacritics (œ --> o) - that was new to me.
Cheers
Andi

Sergey Alexandrovich Kryukov 27-Dec-12 22:21pm

Sorry, but this won't work in this case. You probably answered formally, but did not look at the characters themselves. Please see the correct solution #4 and my comments.
(I did not vote this time.)
—SA

Solution 1

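C# uses Unicode rather than ASCII to represent characters and strings. A char is a 16-bit UTF-16 code unit, so casting it to int gives its Unicode value (339 for œ), not a code-page byte.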
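
A small sketch of that point, using a made-up helper name (CodeUnit). It shows that the number obtained from a char is its UTF-16 code unit. The values for ° and £ happen to agree with the VB results because the first 256 Unicode code points coincide with Latin-1, and Windows-1252 matches Latin-1 in the range 160-255; œ sits at U+0153, outside that range.

```csharp
using System;

public static class CharCodeDemo
{
    // Hypothetical helper: the implicit char -> int conversion yields
    // the UTF-16 code unit, exactly like the (int) cast in the question.
    public static int CodeUnit(char c)
    {
        return c;
    }

    public static void Main()
    {
        Console.WriteLine(CodeUnit('\u0153')); // œ: 0x0153 = 339
        Console.WriteLine(CodeUnit('\u00B0')); // °: 0x00B0 = 176
        Console.WriteLine(CodeUnit('\u00A3')); // £: 0x00A3 = 163
        Console.WriteLine(sizeof(char));       // a char occupies 2 bytes
    }
}
```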

Posted 27-Dec-12 5:44am

Richard MacCutchan

Solution 5

byte i1 = (byte)Convert.ToChar("œ");

C# uses Unicode, and the Unicode value of 'œ' is 339 regardless of whether you then cast it to int or to byte.

A byte is unsigned in C# and its range is 0-255, so it cannot hold 339. Because the cast is not checked for overflow by default, only the low 8 bits of the value are kept:
Number of byte values = 2^8 = 256
On overflow: 339 - 256 = 83
So 83 is what ends up in the byte.

You can make the overflow detectable with the checked keyword:
byte i1 = checked((byte)Convert.ToChar("œ"));
With checked, the cast throws a System.OverflowException at run time, which you can handle like any other exception.
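
The two behaviours can be sketched side by side as follows. The class and method names (NarrowingDemo, Truncate, TryNarrow) are invented for this illustration; unchecked and checked are the standard C# keywords.

```csharp
using System;

public static class NarrowingDemo
{
    // Keeps only the low 8 bits of the UTF-16 code unit (value mod 256).
    public static byte Truncate(char c)
    {
        return unchecked((byte)c);
    }

    // Returns true and the byte if the value fits into 0-255;
    // returns false if the checked cast overflows.
    public static bool TryNarrow(char c, out byte result)
    {
        try
        {
            result = checked((byte)c);
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Truncate('\u0153'));            // 339 - 256 = 83

        byte b;
        Console.WriteLine(TryNarrow('\u0153', out b));    // False: 339 does not fit
        Console.WriteLine(TryNarrow('\u00A3', out b));    // True: 163 fits
    }
}
```

Either way, the byte obtained from a plain cast is derived from the Unicode value, so it only matches the VB result for characters whose Unicode value and code-page value coincide.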

Posted 23-Mar-15 10:40am

Member 11549104

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Thomas Daniels · Accepted Answer · 2012-12-27T05:54:00

Solution 3

Richard is right. To get the same bytes in C# as the bytes in VB, use this:

C#

byte i1 = Encoding.Default.GetBytes("œ")[0];

The GetBytes method returns a byte array; indexing it with [0], as in Encoding.Default.GetBytes("œ")[0], gives you the first byte of that array.
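
A hedged variant that does not depend on the machine's default code page: it asks for Windows-1252 explicitly. The names Cp1252Demo and AscLike are made up for this sketch. It assumes a modern .NET runtime (.NET Core or .NET 5+), where Encoding.Default is UTF-8 and code page 1252 is only available after registering CodePagesEncodingProvider; on the classic .NET Framework that registration step is not needed and the provider type may not be present.

```csharp
using System;
using System.Text;

public static class Cp1252Demo
{
    static Cp1252Demo()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be made available through this provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: returns the Windows-1252 byte for a character,
    // which is what VB's Asc produced in the question.
    public static byte AscLike(char c)
    {
        Encoding cp1252 = Encoding.GetEncoding(1252);
        return cp1252.GetBytes(new[] { c })[0];
    }

    public static void Main()
    {
        Console.WriteLine(AscLike('\u0153')); // œ -> 156
        Console.WriteLine(AscLike('\u00B0')); // ° -> 176
        Console.WriteLine(AscLike('\u00A3')); // £ -> 163
    }
}
```

Naming the code page makes the result the same on every machine, whereas Encoding.Default follows whatever the runtime and system settings provide.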

Hope this helps.

Posted 27-Dec-12 5:54am

Thomas Daniels

Comments

Nick Fisher (Consultant) 28-Dec-12 5:45am

Yes, this works now. Many thanks. Nick

Thomas Daniels 28-Dec-12 7:45am

You're welcome!

Deki syahputra 19-Apr-15 21:38pm

Many Thanks Bro.

Andreas Gieriet · Accepted Answer · 2012-12-27T15:17:00

Solution 4

Hello Nick,

What you refer to as ASCII is *not* ASCII (see http://en.wikipedia.org/wiki/ASCII).
Only the 7-bit ASCII character encoding is unambiguously defined.

There exist several 8-bit extensions to the original 7-bit encoding.

Your page appears to list œ as being part of Latin-1. But reading it carefully, the page says:

[...] The extended ASCII codes (character code 128-255)
There are several different variations of the 8-bit ASCII table. The table below is according to ISO 8859-1, also called ISO Latin-1. Codes 129-159 contain the Microsoft® Windows Latin-1 extended characters. [...]

Microsoft decided some years ago to "modify" the standard to fit their needs. See http://www.cs.tut.fi/~jkorpela/chars.html or, more specifically, http://www.cs.tut.fi/~jkorpela/chars.html#win.

Standard Latin-1 does *not* contain œ. It is included in Latin-9 (also known as ISO/IEC 8859-15); see also "ISO Latin 9 as compared with ISO Latin 1" and http://en.wikipedia.org/wiki/ISO/IEC_8859-15.

Now, how to solve your issue?
Neither Latin-1 nor Latin-9 gives the values that VB's Asc returns on your Windows system.
You need Encoding.GetEncoding(1252), which on a system whose default ANSI code page is 1252 gives the same result as Encoding.Default (as ProgramFOX described in Solution #3).
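
To make the difference between the three encodings concrete, here is a rough sketch that tries to encode œ in each of them. The names CodePageComparison and Encode are invented for illustration. It assumes .NET Core or .NET 5+ (hence the CodePagesEncodingProvider registration) and uses the .NET code page numbers 28591 for ISO 8859-1 and 28605 for ISO 8859-15. The exception fallback is requested explicitly so that an unrepresentable character is reported instead of being silently replaced.

```csharp
using System;
using System.Text;

public static class CodePageComparison
{
    static CodePageComparison()
    {
        // Assumption: modern .NET, where legacy code pages need this provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: returns the single byte for c in the given code page,
    // or null if the character does not exist in that code page.
    public static int? Encode(int codePage, char c)
    {
        Encoding enc = Encoding.GetEncoding(
            codePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            return enc.GetBytes(new[] { c })[0];
        }
        catch (EncoderFallbackException)
        {
            return null;
        }
    }

    public static void Main()
    {
        char oe = '\u0153'; // œ

        // Latin-1 has no œ, so the result is null (prints an empty line).
        Console.WriteLine(Encode(28591, oe));

        // Latin-9 places œ at 0xBD = 189.
        Console.WriteLine(Encode(28605, oe));

        // Windows-1252 places œ at 0x9C = 156, the value VB's Asc returned.
        Console.WriteLine(Encode(1252, oe));
    }
}
```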

Cheers
Andi
