C / C++ / MFC

 
Re: Converting char* to unicode big-endian
Jose Lamas Rios, 5-Aug-05 6:19
It's much better now :)

Note that you are allocating two buffers. You are returning one of them as the result. The other should be deleted before returning, to avoid leaking memory. Add the following line before the return:

delete[] input;

You should note that the callers of this function are responsible for deleting the result...

Besides that, it seems to be already doing what it's supposed to do. Now, if you want to do it really right, keep reading...

First, I recommend you read the following article by Joel Spolsky: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Now that you have read it (if not, go and read it, I'll wait right here), you surely understand why a name like UnicodeCharToBigEndian[...] doesn't make much sense for what this function does (i.e., it's not receiving a Unicode string, but a multibyte string).

A better name would be MultiByteToBigEndianWideChar...

One additional problem with your function is that it returns a buffer that must be released by the caller. This is a problematic approach because as a user of the function, I couldn't deduce that from the function signature alone.

If all I see is a function declared in a header file as

WCHAR* MultiByteToBigEndianWideChar(char* message);

I have no way to know whether I should delete the returned buffer or not, or in case I'm expected to delete it, whether I should use delete[], free, or anything else. I'd have to rely on some documentation or on having access to the implementation itself in order to find out.

To avoid this ambiguity, it's common practice for this kind of function to have the caller supply the buffer and its size as parameters. When the buffer is supplied by the caller, the caller already knows whether it was allocated on the stack or on the heap, and how to release it in the latter case. That's exactly the approach followed by MultiByteToWideChar and many other Win32 API functions that return strings.
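To make the ownership point concrete, here is a minimal portable sketch. WidenAscii is a made-up function (not a Win32 API) that only handles plain ASCII; it is there just to show that, with a caller-supplied buffer, the caller can pick stack or heap storage and never has to guess how to release it.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical function: widens a null-terminated ASCII string into a
// caller-supplied buffer. Returns the number of characters written,
// including the terminator, or 0 if the buffer is missing or too small.
int WidenAscii(const char* src, wchar_t* dst, int cchDst)
{
    int needed = static_cast<int>(std::strlen(src)) + 1;
    if (dst == nullptr || cchDst < needed)
        return 0;
    for (int i = 0; i < needed; ++i)
        dst[i] = static_cast<wchar_t>(static_cast<unsigned char>(src[i]));
    return needed;
}

// The caller owns the buffer in both cases, so release is never in doubt.
int UseStackBuffer()
{
    wchar_t buffer[16];               // on the stack, released automatically
    return WidenAscii("hello", buffer, 16);
}

int UseHeapBuffer()
{
    std::vector<wchar_t> buffer(16);  // on the heap, released by the vector
    return WidenAscii("hello", buffer.data(), static_cast<int>(buffer.size()));
}
```

Nothing in the signature of WidenAscii needs documentation about delete[] versus free, because the function never allocates anything.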

In fact, you might start with a function that takes exactly the same parameters as MultiByteToWideChar:

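void WideCharToBigEndian(LPWSTR lp, int cchWideChar); // defined further down

int MultiByteToBigEndianWideChar(
  UINT CodePage,         // code page
  DWORD dwFlags,         // character-type options
  LPCSTR lpMultiByteStr, // string to map
  int cbMultiByte,       // number of bytes in string
  LPWSTR lpWideCharStr,  // wide-character buffer
  int cchWideChar        // size of buffer
)
{
   // A one-character buffer only has room for the BOM. Passing a size of 0
   // on to MultiByteToWideChar would turn the call into a size query and
   // the swap below would then run past the end of the buffer.
   if (cchWideChar == 1)
   {
      SetLastError(ERROR_INSUFFICIENT_BUFFER);
      return 0;
   }

   int nResult = MultiByteToWideChar(
                   CodePage,
                   dwFlags,
                   lpMultiByteStr,
                   cbMultiByte,
                   cchWideChar ? lpWideCharStr + 1 : NULL,
                   cchWideChar ? cchWideChar - 1 : 0
                 );

   if (nResult == 0)
      return 0; // conversion failed; GetLastError() has the reason

   if (cchWideChar != 0)
   {
      lpWideCharStr[0] = 0xFEFF;                       // BOM, swapped below
      WideCharToBigEndian(lpWideCharStr, nResult + 1); // BOM + converted text
   }

   return nResult + 1; // characters written (or required), BOM included
}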

When cchWideChar is 0, it calls MultiByteToWideChar with 0 to query the required size, and returns that result plus 1, because this function needs to add a byte order mark (BOM) at the start of the buffer.

When cchWideChar is not 0, it calls MultiByteToWideChar to make the conversion to Unicode, leaving the first position available for the BOM, then puts the BOM at the first position, and calls WideCharToBigEndian, which simply does the byte swapping in the same buffer it receives.

void WideCharToBigEndian(LPWSTR lp, int cchWideChar)
{
   for(int i = 0; i < cchWideChar; i++)
      lp[i] = ((lp[i] & 0xFF) << 8) | ((lp[i] & 0xFF00) >> 8);
}
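
If you want to check the swap on its own, here is a minimal portable sketch. It assumes 16-bit code units stored as std::uint16_t instead of WCHAR, and the names Swap16 and SwapBuffer are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: swaps the two bytes of one 16-bit code unit.
inline std::uint16_t Swap16(std::uint16_t v)
{
    return static_cast<std::uint16_t>(((v & 0x00FFu) << 8) | ((v & 0xFF00u) >> 8));
}

// Hypothetical portable counterpart of WideCharToBigEndian:
// swaps every code unit of the buffer in place.
inline void SwapBuffer(std::uint16_t* p, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        p[i] = Swap16(p[i]);
}
```

On a little-endian machine the value 0xFEFF sits in memory as the bytes FF FE. After the swap the value is 0xFFFE, stored as FE FF, which is the byte sequence a big-endian BOM should have. Swapping twice gives back the original buffer.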

Now your function can be used exactly like MultiByteToWideChar, including the option to query the required buffer size, and with the same flexibility (i.e., all the options for code page, flags, etc.). You can refer anyone to the documentation for MultiByteToWideChar and simply state that you add a BOM and convert the Unicode string to big-endian.
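
The query-then-convert usage could look roughly like this. Since MultiByteToWideChar is Windows-only, the sketch uses a made-up stand-in, AsciiToBigEndianUtf16, that follows the same size convention but only handles plain ASCII; ConvertWithQuery is likewise a hypothetical caller. It also assumes the swapped values are what you want to write out, as on a little-endian host.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for MultiByteToBigEndianWideChar, ASCII input only.
// cchDst == 0 : returns the required size in 16-bit units (BOM + text + terminator).
// otherwise   : fills dst and returns the units written, or 0 if dst is too small.
int AsciiToBigEndianUtf16(const char* src, std::uint16_t* dst, int cchDst)
{
    int needed = static_cast<int>(std::strlen(src)) + 2; // BOM + terminator
    if (cchDst == 0)
        return needed;
    if (dst == nullptr || cchDst < needed)
        return 0;

    dst[0] = 0xFEFF;                                     // BOM goes first
    for (int i = 0; i + 1 < needed; ++i)
        dst[i + 1] = static_cast<unsigned char>(src[i]); // copies the terminator too
    for (int i = 0; i < needed; ++i)                     // swap bytes in place
        dst[i] = static_cast<std::uint16_t>((dst[i] << 8) | (dst[i] >> 8));
    return needed;
}

// Hypothetical caller: the first call asks for the size, the second converts.
std::vector<std::uint16_t> ConvertWithQuery(const char* src)
{
    int cch = AsciiToBigEndianUtf16(src, nullptr, 0);
    std::vector<std::uint16_t> buffer(cch);
    AsciiToBigEndianUtf16(src, buffer.data(), cch);
    return buffer;
}
```

The caller allocates between the two calls, so it is the caller that decides where the buffer lives and how it gets released.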

Now suppose you want to provide a simplified version in which some of the options are filled with default values:

int MultiByteToBigEndianWideChar(
  LPCSTR lpMultiByteStr, // string to map
  LPWSTR lpWideCharStr,  // wide-character buffer
  int cchWideChar        // size of buffer
)
{
   return MultiByteToBigEndianWideChar(
                   CP_ACP,
                   MB_COMPOSITE, // by the way, are you sure you want this flag as default?
                   lpMultiByteStr,
                   -1,           // -1: the input string is null-terminated
                   lpWideCharStr,
                   cchWideChar
                 );
}


There, now you only need to compile and test. Good luck!

--
jlr
http://jlamas.blogspot.com/