Jeremy Quinn wrote:
>
> I have a bunch of text files in UTF-8 encoding which have a 'byte-order
> mark' in them.
>
> XXE refuses to open them: "line 1, column 0, character not allowed".
>
> I don't have this problem with Apache Cocoon Servlet opening and
> parsing them, so I did not expect this problem.
>
> It is handy having BOMs in the documents, because it helps certain
> Applications (MacOSX) to detect that UTF-8 is being used automatically.
>
> Is there a workaround?

I don't see the usefulness of a UTF-8 BOM when the encoding can be specified in the XML declaration. However, the XML 1.0 Second Edition Specification Errata (http://www.w3.org/XML/xml-V10-2e-errata) allow a BOM in UTF-8 documents, so this can be considered a bug in XXE. There is no workaround other than removing the BOM. I'm afraid you'll have to wait until this bug is fixed (next release, 2.1, in one or two weeks).
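
If you want to remove the BOMs in the meantime, it can be scripted. The following is only a minimal sketch in Python, assuming the files are small enough to read into memory and really are UTF-8; the function names `strip_utf8_bom` and `strip_bom_in_place` are made up for this example and have nothing to do with XXE.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_place(path) -> bool:
    """Rewrite the file without its BOM. Return True if the file was changed."""
    p = Path(path)
    data = p.read_bytes()
    stripped = strip_utf8_bom(data)
    if len(stripped) == len(data):
        # No BOM found: leave the file untouched.
        return False
    p.write_bytes(stripped)
    return True
```

Calling `strip_bom_in_place("doc.xml")` on each file (the file name here is a placeholder) should leave files without a BOM unchanged. Work on copies if you still need the BOM for the MacOSX applications.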

