John L. Clark wrote:
> Would it be possible to use the standard XML document encoding detection
> mechanism[0] for XML documents if no encoding attribute is present on
> the cfg:read directive[1]? I didn't read the documentation carefully
> enough, and the current default introduced a subtle bug in one of my
> configurations which I almost didn't catch. (I was generating UTF-8 XML
> documents with no explicit encoding, and upon a cfg:read they were being
> copied back into XXE as my system encoding.) I also think this would be
> more in tune with the specification.
"cfg:read" is designed to load any text file, not only XML. Without a <?xml version="VVV" encoding="EEE"?>, the encoding detection mechanism of XML parsers is very weak. For example, a text file encoded using platform encoding, Windows-1252 or ISO-8859-1 for example, would often be detected to be UTF-8. Technically there is no problem implementing what you suggest, except we fear that this would make "cfg:read" even more error prone that it currently is. I would rather suggest to make the "encoding" attribute of "cfg:read" mandatory. --- <cfg:read *file* = Path *encoding* = Any encoding supported by Java or "default" /> --- *mandatory attribute*

