Hi, I didn't mention that I'm Iraqi, so of course I write some Arabic in my application.
My application is a chat server and client. As you know, before sending over TCP we have to convert the string to an array of bytes.

C#
byte[] DataBytes = new byte[Message.Length + ("<EOP>").Length + 1];
for (int x = 0; x < Message.Length; x++)
{
    DataBytes[x] = Convert.ToByte(Message[x]);
}

DataBytes[Message.Length + 1] = Convert.ToByte('<');
DataBytes[Message.Length + 2] = Convert.ToByte('E');
DataBytes[Message.Length + 3] = Convert.ToByte('O');
DataBytes[Message.Length + 4] = Convert.ToByte('P');
DataBytes[Message.Length + 5] = Convert.ToByte('>');

clientSocket.Send(DataBytes);


When I type an English word (like: Hello) and send it, sending succeeds.
But when I type an Arabic word (like: هلو) and send it, an error occurs telling me that "Value was either too large or too small for an unsigned byte".

What does this mean?
And how do I solve it?
Updated 2-Jun-18 9:38am
Comments
Sergey Alexandrovich Kryukov 10-Jan-13 16:37pm    
Do you understand that no Arabic character can fit in just one byte? It's not just that your code makes no sense; the whole idea is wrong.
First of all, get some idea of what Unicode is...
—SA
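
To make the comment above concrete, here is a minimal sketch of why `Convert.ToByte` fails on the Arabic word from the question. The class and method names (`CharRangeDemo`, `FitsInByte`) are hypothetical and not part of the original post. It assumes only that a .NET `char` is a 16-bit UTF-16 code unit, so the Arabic letters have values well above 255.

```csharp
using System;

public static class CharRangeDemo
{
    // Returns true when the char's numeric value fits in one unsigned byte (0..255).
    public static bool FitsInByte(char c)
    {
        return c <= byte.MaxValue;
    }

    public static void Main()
    {
        // 'H' followed by the three letters of the Arabic word from the question,
        // written as escapes: U+0647, U+0644, U+0648.
        foreach (char c in "H\u0647\u0644\u0648")
        {
            Console.WriteLine("U+{0:X4} fits in a byte: {1}", (int)c, FitsInByte(c));
        }

        try
        {
            // U+0647 is 1607 in decimal, which is larger than byte.MaxValue (255).
            Convert.ToByte('\u0647');
        }
        catch (OverflowException ex)
        {
            // This is the exception type behind the error quoted in the question.
            Console.WriteLine(ex.Message);
        }
    }
}
```

English letters pass because their values are below 128; the per-character `Convert.ToByte` loop only appears to work for them.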

Try this:

using System.Text;

byte[] bytes = Encoding.UTF32.GetBytes(iString);
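
As a rough sketch of how this could fit the chat code from the question, the message and the `<EOP>` terminator can be encoded together and decoded on the other side with the same encoding. The names `MessageFraming`, `Encode`, and `Decode` are hypothetical. The example uses UTF-8 as an assumption; `Encoding.UTF32` from the answer above should work the same way, as long as sender and receiver agree. It also assumes the receiver already holds exactly one complete message, which a real TCP receive loop does not guarantee.

```csharp
using System;
using System.Text;

public static class MessageFraming
{
    // Terminator taken from the question's code.
    public const string Terminator = "<EOP>";

    // Encodes the message followed by the terminator in a single call.
    public static byte[] Encode(string message, Encoding encoding)
    {
        return encoding.GetBytes(message + Terminator);
    }

    // Decodes the bytes and strips the terminator.
    // Assumption: 'data' holds exactly one complete message.
    public static string Decode(byte[] data, Encoding encoding)
    {
        string text = encoding.GetString(data);
        int end = text.IndexOf(Terminator, StringComparison.Ordinal);
        return end >= 0 ? text.Substring(0, end) : text;
    }

    public static void Main()
    {
        Encoding encoding = Encoding.UTF8; // assumption: both sides use UTF-8
        string message = "\u0647\u0644\u0648"; // the Arabic word from the question

        byte[] wire = Encode(message, encoding);
        // In the question's code, clientSocket.Send(wire) would go here.
        Console.WriteLine(wire.Length); // 3 letters * 2 bytes + 5 terminator bytes = 11

        Console.WriteLine(Decode(wire, encoding) == message);
    }
}
```

Encoding the terminator together with the text also avoids the index arithmetic in the original loop, which leaves one unused zero byte between the message and `<EOP>`.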
 
Comments
[no name] 11-Jan-13 3:41am    
Yes, it works.
Thank you all, guys.
You can try this:
C#
string s = ... // Your string source here
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(s);
But I can't guarantee it. If it doesn't work, try a different encoding.
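
A short sketch of what to expect from the ASCII suggestion with Arabic text. The names `AsciiLossDemo` and `RoundTrip` are hypothetical. ASCII covers only the 7-bit range, and by default `Encoding.ASCII` replaces every character it cannot represent with `?`, which is consistent with the "???" reported in the comments on this answer.

```csharp
using System;
using System.Text;

public static class AsciiLossDemo
{
    // Encodes the text to bytes and decodes it again with the same encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        return encoding.GetString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        string arabic = "\u0647\u0644\u0648"; // the Arabic word from the question

        // English text survives ASCII unchanged.
        Console.WriteLine(RoundTrip("Hello", Encoding.ASCII)); // Hello

        // Each Arabic letter is replaced with '?' during encoding.
        Console.WriteLine(RoundTrip(arabic, Encoding.ASCII)); // ???

        // A Unicode encoding such as UTF-8 preserves the text.
        Console.WriteLine(RoundTrip(arabic, Encoding.UTF8) == arabic); // True
    }
}
```

The loss happens on the sending side, so reversing the process on the receiver cannot bring the original letters back.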
 
 
Comments
[no name] 10-Jan-13 15:21pm    
Buddy, I have tried it, but when the server receives it, it looks like this: "???" instead of this: "هلو".
OriginalGriff 10-Jan-13 15:29pm    
Did you use the reverse process to convert it back? Remember, bytes are only 8-bit quantities; Unicode characters are (generally) 16 or 32 bits, but a character can be spread over several "codepoints" (Wiki can help you if you want to understand this).
So the characters have to be "translated" to bytes and then back again to reassemble the original input.
[no name] 10-Jan-13 15:49pm    
I have reversed the process on the other side, but the result is "???" instead of the word above.
OriginalGriff 10-Jan-13 15:54pm    
What code did you use to reverse the process?
Did you try another encoding?
Sergey Alexandrovich Kryukov 10-Jan-13 16:44pm    
Not exactly. A character can of course be spread across several "words" (code units), such as 8-bit ones (in UTF-8) or 16-bit ones (UTF-16 surrogate pairs), but it is still one code point. That term refers to an abstract ordinal number, unrelated to machine representation, that corresponds one-to-one to a character (itself an abstract cultural notion, unrelated to glyph graphics and so on).

(Also, I'm not talking about combining diacritics, where a combination of characters, and hence of code points, is involved. As far as I remember, this is not the case for Arabic characters.)

—SA
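
The distinction drawn in this comment can be checked with a small sketch. The names `CodePointDemo`, `CountCodePoints`, and `Report` are hypothetical, and the emoji U+1F600 is picked only as an example of a character outside the 16-bit range. The sketch counts code points in a string and compares that with the number of UTF-16 code units and the byte counts in UTF-8 and UTF-32.

```csharp
using System;
using System.Text;

public static class CodePointDemo
{
    // Counts Unicode code points, treating each valid surrogate pair as one.
    public static int CountCodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsSurrogatePair(s, i))
            {
                i++; // skip the low surrogate of the pair
            }
            count++;
        }
        return count;
    }

    private static void Report(string s)
    {
        Console.WriteLine(
            "code points: {0}, UTF-16 units: {1}, UTF-8 bytes: {2}, UTF-32 bytes: {3}",
            CountCodePoints(s),
            s.Length,
            Encoding.UTF8.GetByteCount(s),
            Encoding.UTF32.GetByteCount(s));
    }

    public static void Main()
    {
        // The Arabic word from the question: 3 code points, 3 units, 6 bytes, 12 bytes.
        Report("\u0647\u0644\u0648");

        // One code point above U+FFFF: 1 code point, 2 units (surrogate pair), 4 bytes, 4 bytes.
        Report(char.ConvertFromUtf32(0x1F600));
    }
}
```

In both cases the number of code points stays the same whichever encoding is chosen; only the number of code units or bytes changes.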

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


