What is this UTF-8 BOM stuff? I've never heard of such a thing. Given the
form of UTF-8, why would it need a BOM? It's a byte-oriented encoding, so
there are no components of it larger than a byte, and hence no byte order
to mark.

--------------------------
Dean Roddey
The CIDLib C++ Frameworks
Charmed Quark Software
[EMAIL PROTECTED]
http://www.charmedquark.com

"You young, and you gotcha health. Whatchoo wanna job fer?"


----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, July 31, 2000 12:00 PM
Subject: cvs commit: xml-xerces/c/src/internal XMLReader.cpp


> aruna1      00/07/31 12:00:50
>
>   Modified:    c/src/internal XMLReader.cpp
>   Log:
>   Fixed BOM in UTF-8 files
>
>   Revision  Changes    Path
>   1.20      +15 -2     xml-xerces/c/src/internal/XMLReader.cpp
>
>   Index: XMLReader.cpp
>   ===================================================================
>   RCS file: /home/cvs/xml-xerces/c/src/internal/XMLReader.cpp,v
>   retrieving revision 1.19
>   retrieving revision 1.20
>   diff -u -r1.19 -r1.20
>   --- XMLReader.cpp 2000/07/25 22:33:05 1.19
>   +++ XMLReader.cpp 2000/07/31 19:00:48 1.20
>   @@ -55,7 +55,7 @@
>     */
>
>    /*
>   - * $Id: XMLReader.cpp,v 1.19 2000/07/25 22:33:05 aruna1 Exp $
>   + * $Id: XMLReader.cpp,v 1.20 2000/07/31 19:00:48 aruna1 Exp $
>     */
>
>



>    // ---------------------------------------------------------------------------
>   @@ -1331,11 +1331,24 @@
>                break;
>            }
>
>   -        case XMLRecognizer::US_ASCII :
>            case XMLRecognizer::UTF_8 :
>            {
>   +            // If there's a utf-8 BOM  (0xEF 0xBB 0xBF), skip past it.
>   +            //   Don't move to char buf - no one wants to see it.
>   +            //   Note: this causes any encoding= declaration to override
>   +            //         the BOM's attempt to say that the encoding is utf-8.
>   +
>                // Look at the raw buffer as short chars
>                const char* asChars = (const char*)fRawByteBuf;
>   +
>   +            if (fRawBytesAvail > XMLRecognizer::fgUTF8BOMLen &&
>   +                XMLString::compareNString(  asChars
>   +                                            , XMLRecognizer::fgUTF8BOM
>   +                                            , XMLRecognizer::fgUTF8BOMLen) == 0)
>   +            {
>   +                fRawBufIndex += XMLRecognizer::fgUTF8BOMLen;
>   +                asChars      += XMLRecognizer::fgUTF8BOMLen;
>   +            }
>
>                //
>                //  First check that there are enough bytes to even see the
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
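
For reference, the UTF-8 "BOM" is just U+FEFF encoded as the three bytes
0xEF 0xBB 0xBF at the very start of the stream. It says nothing about byte
order; it only acts as an encoding signature. Below is a minimal standalone
sketch of the kind of check the quoted patch performs. The helper name
utf8BomLength and the constant kUtf8Bom are made up for illustration and are
not part of Xerces; the real code uses XMLRecognizer::fgUTF8BOM and
XMLString::compareNString as shown in the diff.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a Xerces API): returns how many leading bytes
// should be skipped. That is 3 when the buffer starts with the UTF-8
// signature 0xEF 0xBB 0xBF, and 0 otherwise.
std::size_t utf8BomLength(const unsigned char* buf, std::size_t avail)
{
    static const unsigned char kUtf8Bom[] = { 0xEF, 0xBB, 0xBF };

    // Only compare when enough bytes are available to hold the signature.
    if (avail >= sizeof(kUtf8Bom) &&
        std::memcmp(buf, kUtf8Bom, sizeof(kUtf8Bom)) == 0)
    {
        return sizeof(kUtf8Bom);
    }
    return 0;
}
```

A caller would advance its read index by the returned length before decoding,
much as the patch bumps fRawBufIndex and asChars. Note that the patch compares
with a strict ">", so it skips the signature only when at least one more byte
follows it, whereas this sketch uses ">=".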
