Geoff and Dean,

After looking at the C RTL sources for both Borland and MSVC, I can confirm that wcstombs() returns (size_t) -1 on errors with either compiler. The Borland documentation states "If an invalid multibyte character is encountered, wcstombs returns (size_t) -1. Otherwise, the function returns the number of bytes modified, not including the terminating code, if any."
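For what it's worth, here is a rough sketch of checking for that return value in portable C. This is my own illustration, not code from either RTL: the helper name wide_to_mb is made up, and sizing the buffer as wcslen(src) * MB_CUR_MAX + 1 is an assumption that should hold for stateless encodings in the current locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not part of any RTL): converts a wide string to a
 * newly allocated multibyte string in the current locale's encoding.
 * Returns NULL on allocation failure or conversion error; the caller
 * frees the result with free(). */
char *wide_to_mb(const wchar_t *src)
{
    size_t cap;
    size_t n;
    char *dst;

    assert(src != NULL);

    /* Assumption: each wide character needs at most MB_CUR_MAX bytes
     * in the current locale, plus one byte for the terminator. */
    cap = wcslen(src) * MB_CUR_MAX + 1;
    dst = malloc(cap);
    if (dst == NULL)
        return NULL;

    n = wcstombs(dst, src, cap);

    /* (size_t)-1 signals a wide character that cannot be converted.
     * n >= cap would mean the terminator did not fit, so treat that
     * as a failure too rather than return an unterminated buffer. */
    if (n == (size_t)-1 || n >= cap) {
        free(dst);
        return NULL;
    }
    return dst;
}
```

Calling wide_to_mb(L"hello") in the default "C" locale should hand back "hello"; a wide character with no representation in the locale's encoding should make it return NULL instead.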
Regarding the 2-or-more-byte issue: both implementations of wcstombs rely on a compile-time constant for the maximum number of bytes that a multibyte character can contain.

Borland mbyte1.c:
#define MB_MAX_CHARLEN 2 // current maximum MBCS character length

MSVC limits.h:
#define MB_LEN_MAX 2 /* max. # bytes in multibyte char */

Additionally, both implementations call a Windows API to determine the correct string length. That API takes into account the current code page and how to deal with Unicode characters that don't translate directly into multibyte. Look up "WideCharToMultiByte" in the Platform SDK documentation for all the details. I don't know of a standard C library equivalent to WideCharToMultiByte.

HTH,
Don

At 01:36 AM 9/28/2001 -0700, you wrote:
>On 9/28/01 12:50 AM, "Dean Roddey" <[EMAIL PROTECTED]> wrote:
>
> > No, definitely not 2 bytes. UTF-8 can take up to 6 bytes to hold a single
> > Unicode character, and others can take 3 or 4 and whatnot. You really need
> > to know what the target is going to take. And you can't really afford to do
> > a worst case. If they are about to transcode a large amount of text,
> > allocating 6 bytes per source Unicode char would be really piggy. Those
> > other platforms have to have a function to do this calculation, since its
> > fundamental to doing transcoding.
>
>Except that wcstombs would never transcode to UTF-8...if I understand it
>correctly. It transcodes to whatever encoding makes sense in the current
>locale, so the question is, can a "multi-byte" string ever require more than
>2 bytes per character? I know in my case it cannot because I'm always
>dealing with iso_8859-1, which is always 1 byte per character. I took my
>assumption above from this line in the wcstombs documentation at msdn:
>
>"If there are two bytes in the multibyte output string for every wide
>character in the input string, the result is guaranteed to fit."
>
>http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/HTML/_crt_wcstombs.asp
>
>
>Which applies at least to the MSVC++ implementation. Metrowerk's
>implementation is actually simple-minded (it copies the low order bytes of
>each wchar_t into a new char array) so as I said, for my purposes, my
>assumption should be fine...
>
>Is there a way in the standard c library to determine the necessary length?
>
>Thanks,
>
>Geoff
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, e-mail: [EMAIL PROTECTED]
