I want to know how I can change a string's encoding.

Thanks in advance
Regards
Updated 15-May-11 6:38am
Comments
Fabio V Silva 10-May-11 16:36pm    
What have you tried so far?
Oshtri Deka 10-May-11 16:38pm    
Have you tried with the Encoding class?
Ángel Manuel García Carmona 10-May-11 16:44pm    
I have read some info about that, but the problem is that I can't understand it well. How do I do it?
Fabio V Silva 10-May-11 17:03pm    
What part are you not understanding?
Do you have any code snippet where you can show which part is not working for you?
Ángel Manuel García Carmona 10-May-11 17:05pm    
I don't have any code; I'm trying to find a website that explains how to do this, but I can't find one.

Essentially, the process is to convert the original string into an array of bytes and then convert the array of bytes back into a string using the encoding you want. To do this you use the Encoding class; there is an example of what you are doing here[^] (near the bottom), except that it takes ASCII and outputs UTF-8. The principle is the same though.
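As a rough illustration of the byte-array round trip described above, here is a minimal sketch. The class name `EncodingDemo` and the helper `Transcode` are made up for this example; `Encoding.UTF8.GetBytes`, `Encoding.Convert` and `Encoding.UTF32.GetString` are the framework calls. It assumes the source text is UTF-8 and the target is UTF-32, which may not match your case.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Hypothetical helper: re-encodes a byte array from one encoding to another.
    public static byte[] Transcode(byte[] input, Encoding from, Encoding to)
    {
        return Encoding.Convert(from, to, input);
    }

    static void Main()
    {
        // "\u00C1" is the accented capital A, written as an escape so the
        // example does not depend on how the source file is saved.
        string text = "\u00C1ngel Manuel";

        // A .NET string is always UTF-16 in memory; an encoding only
        // matters once the text is turned into bytes.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        byte[] utf32Bytes = Transcode(utf8Bytes, Encoding.UTF8, Encoding.UTF32);

        Console.WriteLine(utf8Bytes.Length);   // 13: the accented A takes two bytes
        Console.WriteLine(utf32Bytes.Length);  // 48: four bytes for each of 12 characters
        Console.WriteLine(Encoding.UTF32.GetString(utf32Bytes) == text); // True
    }
}
```

The point to take away is that the text stays the same while only the byte representation changes, which is why decoding the converted bytes with the matching encoding gives back the original string.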

Note that to use this on a stream you will need to buffer, but the process is much simpler, as you already have the encoded byte array from the input buffer and the Encoding.Convert method returns a byte array for your output buffer.
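For the stream case, one possible sketch is below. Note that it swaps in a different technique from the manual byte buffers described above: it lets `StreamReader` and `StreamWriter` do the decoding and encoding, because converting raw byte chunks one at a time can cut a multi-byte character in half at a buffer boundary. The class name `StreamTranscoder`, the method `Transcode` and the 4096-character buffer size are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamTranscoder
{
    // Hypothetical helper: reads text from `input` decoded as `from`
    // and writes it to `output` encoded as `to`.
    public static void Transcode(Stream input, Encoding from, Stream output, Encoding to)
    {
        var reader = new StreamReader(input, from);
        var writer = new StreamWriter(output, to);
        char[] buffer = new char[4096];
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            writer.Write(buffer, 0, read);
        }
        // Flush rather than dispose, so the caller's streams stay open.
        writer.Flush();
    }

    static void Main()
    {
        byte[] utf8 = Encoding.UTF8.GetBytes("\u00C1ngel Manuel");
        // UnicodeEncoding(false, false) is little-endian UTF-16 without a
        // byte order mark, so the output holds only the character data.
        var utf16NoBom = new UnicodeEncoding(false, false);
        using (var input = new MemoryStream(utf8))
        using (var output = new MemoryStream())
        {
            Transcode(input, Encoding.UTF8, output, utf16NoBom);
            Console.WriteLine(output.ToArray().Length); // 24: two bytes for each of 12 characters
        }
    }
}
```

If you pass a static instance such as `Encoding.UTF8` or `Encoding.Unicode` as the target, `StreamWriter` will normally write a byte order mark at the start of the output, which may or may not be what you want.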

[Edit]
By looking at the comments (below), it looks like the OP is having problems with HTML encoding. This can be decoded using this[^].
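The method behind the link above is not recoverable here, so the following is only a sketch of two framework calls that might fit. It assumes the input looks like the `%C3%81ngel%20Manuel` text quoted in the comments (percent-encoded UTF-8, handled by `Uri.UnescapeDataString`), and also shows `WebUtility.HtmlDecode` for the case where the text contains HTML character references instead. The class and method names are made up for the example.

```csharp
using System;
using System.Net;

static class DecodeDemo
{
    // Percent-encoded text such as "%C3%81ngel%20Manuel".
    public static string DecodeUrlText(string escaped)
    {
        return Uri.UnescapeDataString(escaped);
    }

    // Text containing HTML character references such as "&#193;".
    public static string DecodeHtmlText(string html)
    {
        return WebUtility.HtmlDecode(html);
    }

    static void Main()
    {
        string expected = "\u00C1ngel Manuel";
        Console.WriteLine(DecodeUrlText("%C3%81ngel%20Manuel") == expected); // True
        Console.WriteLine(DecodeHtmlText("&#193;ngel Manuel") == expected);  // True
    }
}
```

Which of the two applies depends on where the string comes from: percent escapes usually come from a URL, character references from HTML markup.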
 
Comments
yesotaso 10-May-11 17:28pm    
Is your original string UTF8?
string target = Encoding.UTF32.GetString(Encoding.UTF8.GetBytes(uri.AbsoluteUri.ToString()));
[Edit] hmm test results show strange stuff. just ignore this :)
BobJanova 11-May-11 5:29am    
Yes, that is trying to read UTF8 bytes directly as UTF32, without changing the format. Just putting a 'French' sticker on a book cover doesn't convert the words ;)
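To make the 'sticker on the book cover' point concrete, here is a small sketch comparing the relabelling approach from the comment above with an actual conversion. The class and method names are hypothetical, and the sample text "abcd" is chosen only because its four UTF-8 bytes happen to fill exactly one UTF-32 code unit.

```csharp
using System;
using System.Text;

static class MislabelDemo
{
    // Reads UTF-8 bytes as if they were already UTF-32 (the mistake).
    public static string Relabel(string text)
    {
        return Encoding.UTF32.GetString(Encoding.UTF8.GetBytes(text));
    }

    // Converts the UTF-8 bytes to UTF-32 first, then decodes them as UTF-32.
    public static string ConvertProperly(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        byte[] utf32 = Encoding.Convert(Encoding.UTF8, Encoding.UTF32, utf8);
        return Encoding.UTF32.GetString(utf32);
    }

    static void Main()
    {
        Console.WriteLine(Relabel("abcd") == "abcd");         // False
        Console.WriteLine(ConvertProperly("abcd") == "abcd"); // True
    }
}
```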
Keith Barrow 10-May-11 17:58pm    
Is the real problem that you need to convert /%C3%81ngel%20Manuel to Angel Manuel (with an accent on the A)? If so, I've updated my answer with a link that will help you.
Keith Barrow 10-May-11 17:39pm    
It will be the same in both charsets if you output it to the console etc. UTF-8 is an 8-bit encoding; all UTF-16/32 do [pretty much] is extend the number of bits used, to allow them to carry extra characters. Your newly encoded string will have the same hex values, but take up more memory IIRC.
 
 
Comments
Keith Barrow 10-May-11 17:32pm    
Your first example is not useful: the OP is using C#, and these methods are for C++ and involve padding the 8-bit UTF-8 to 16-bit UTF-16.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


