L. H. Silli wrote:
>
> I was testing your product again. And in all the tests I made, I found
> that it deletes the Byte-Order Mark (BOM).
That's right.

> It also does not use the BOM
> to interpret the encoding.

XXE uses the Xerces XML parser and, to my knowledge, Xerces uses the BOM to interpret the encoding. If this is not the case, then please report this issue as a bug.

> If the encoding is US-ASCII or an 8-bit
> legacy encoding, then it is a fatal error (per the XML 1.0 spec) to
> have the BOM before the XML declaration if the document is not UTF-8
> encoded.

This never, ever, happens with files created by our product. If you have found that this actually happens, then please report this issue as a bug.

>
> I bring this up because, again, I'm interested in producing
> HTML-compatible XHTML documents, and the BOM is the only encoding
> indicator that works in both XML and HTML.

Sorry, but we don't agree. Next week we are going to release v4.9.1, which lets you create HTML-compatible XHTML documents, and v4.9.1 does not use a BOM for that.

Excerpts from the release notes of v4.9.1:

---
XHTML files are now saved differently than in previous releases:

* If you want to omit the XML declaration (that is, <?xml version="1.x"...?>) from the save file, then add <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"/> to the head element. For the XML declaration to be omitted, the media type must be "text/html" and the charset must be "UTF-8".

  This is useful because both the XML declaration and the <!DOCTYPE> declaration have an effect on the behavior of Web browsers. See Activating Browser Modes with Doctype (http://hsivonen.iki.fi/doctype/).

* If you just want to force the encoding of a specific XHTML document to be, for example, Windows-1250 without having to tweak Options|Preferences, Save options, then add, for example, <meta http-equiv="Content-Type" content="application/xhtml+xml;charset=Windows-1250"/> to the head element.
---

>
> So, if you bring the XML editor in line with XML 1.0,
> you will also make it more suitable to produce HTML.
To our knowledge, our XML editor is in line with XML 1.0 and XML 1.1. We have simply chosen not to add a UTF-8 BOM at the beginning of the UTF-8 encoded XML files that we create.

In the case of a document encoded using UTF-8, the BOM is never really needed. When there is no <?xml encoding="XXX"?> or BOM, then the parser automatically defaults to UTF-8.

Excerpts from Extensible Markup Language (XML) 1.0 (Fifth Edition) http://www.w3.org/TR/xml/

---
In the absence of information provided by an external transport protocol (e.g. HTTP or MIME), it is a fatal error for an entity including an encoding declaration to be presented to the XML processor in an encoding other than that named in the declaration, or for an entity which begins with neither a Byte Order Mark nor an encoding declaration to use an encoding other than UTF-8. Note that since ASCII is a subset of UTF-8, ordinary ASCII entities do not strictly need an encoding declaration.
---

Now, we may be wrong and, in that case, we would be grateful if you could point us to the specification that mandates adding such a UTF-8 BOM.

--
XMLmind XML Editor Support List
xmleditor-support@xmlmind.com
http://www.xmlmind.com/mailman/listinfo/xmleditor-support
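
For readers who want to see the defaulting rule from the XML 1.0 excerpt in concrete terms, here is a minimal Python sketch. It is not how XXE or Xerces implement encoding detection. The function name `guess_xml_encoding` is hypothetical, and the sketch makes simplifying assumptions: it only recognizes UTF-8 and UTF-16 BOMs, and it only reads an encoding declaration written in an ASCII-compatible encoding.

```python
import codecs
import re

# Hypothetical helper for illustration only; not part of XXE or Xerces.
# Simplification: only UTF-8 and UTF-16 BOMs are recognized here.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

# Matches an XML declaration carrying an encoding pseudo-attribute,
# assuming the declaration itself is written in ASCII-compatible bytes.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def guess_xml_encoding(data: bytes) -> str:
    """Return the encoding suggested by a BOM or an encoding declaration.

    Falls back to UTF-8 when the entity begins with neither, which is
    the default described in the XML 1.0 excerpt.
    """
    # 1. A leading Byte Order Mark takes priority.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. Otherwise look for <?xml ... encoding="..."?> at the very start.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    # 3. Neither a BOM nor an encoding declaration: default to UTF-8.
    return "UTF-8"
```

With this sketch, a document that starts directly with `<html ...>` and has no BOM is reported as UTF-8, which is why a UTF-8 file does not need a BOM to be read correctly by an XML parser.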