On Wednesday, Jan 15, 2003, at 10:54 Europe/London, Hussein Shafie wrote:

> Jeremy Quinn wrote:
>>
>> I have a bunch of text files in UTF-8 encoding which have a
>> 'byte-order mark' in them.
>>
>> XXE refuses to open them: "line 1, column 0, character not allowed".
>>
>> I don't have this problem with Apache Cocoon Servlet opening and
>> parsing them, so I did not expect this problem.
>>
>> It is handy having BOMs in the documents, because it helps certain
>> Applications (MacOSX) to detect that UTF-8 is being used
>> automatically.
>>
>> Is there a workaround?
>
> I don't see the usefulness of having a UTF-8 BOM when this encoding can
> be specified in the XML declaration.

It is merely a convenience for other apps that are still struggling to
deal with UTF-8 properly.

> However XML 1.0 Second Edition
> Specification Errata -- http://www.w3.org/XML/xml-V10-2e-errata --
> allows to do so. Therefore, this can be considered as a bug.

Thank you.

> There is no workaround other than removing the BOM. I'm afraid you'll
> have to wait until this bug is fixed (next release 2.1 in one or two
> weeks).

Sorry to add to your workload.

regards Jeremy
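
For anyone needing the interim workaround mentioned above (removing the BOM), here is a minimal sketch in Python. It is not part of XXE or Cocoon; the function names strip_utf8_bom and strip_bom_in_place are hypothetical, and it assumes the files really are UTF-8 and small enough to read into memory.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_place(path: Path) -> bool:
    """Rewrite the file without its BOM. Returns True if a BOM was removed."""
    original = path.read_bytes()
    if not original.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, leave the file untouched
    path.write_bytes(strip_utf8_bom(original))
    return True
```

Something like strip_bom_in_place(Path("doc.xml")) could be run over each affected file, ideally on copies first, since applications that rely on the BOM for encoding detection will lose that hint.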

