Jeremy Quinn wrote:
> 
> I have a bunch of text files in UTF-8 encoding which have a 'byte-order
> mark' in them.
> 
> XXE refuses to open them: "line 1, column 0, character not allowed".
> 
> I don't have this problem with Apache Cocoon Servlet opening and
> parsing them, so I did not expect this problem.
> 
> It is handy having BOMs in the documents, because it helps certain
> Applications (MacOSX) to detect that UTF-8 is being used automatically.
> 
> Is there a workaround?

I don't see the usefulness of a UTF-8 BOM when the encoding can be
specified in the XML declaration. However, the XML 1.0 Second Edition
Specification Errata -- http://www.w3.org/XML/xml-V10-2e-errata --
allow one. Therefore, this can be considered a bug.

There is no workaround other than removing the BOM. I'm afraid you'll
have to wait until this bug is fixed (next release, 2.1, in one or two
weeks).
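
In the meantime, removing the BOM from a batch of files can be scripted.
What follows is only a minimal sketch in Python, not part of XXE; the
function names are made up for illustration, and it assumes the files
really are UTF-8 and small enough to read into memory.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path: Path) -> bool:
    """Rewrite the file without its BOM; return True if it was changed."""
    raw = path.read_bytes()
    cleaned = strip_utf8_bom(raw)
    if len(cleaned) == len(raw):
        return False  # no BOM found, leave the file untouched
    path.write_bytes(cleaned)
    return True
```

Calling strip_bom_in_file() on each path in a directory (for example
via Path.glob("*.xml")) would clean the whole set. Keep a backup first,
since the files are rewritten in place.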
