John L. Clark wrote:
> Would it be possible to use the standard XML document encoding detection
> mechanism[0] for XML documents if no encoding attribute is present on
> the cfg:read directive[1]?  I didn't read the documentation carefully
> enough, and the current default introduced a subtle bug in one of my
> configurations which I almost didn't catch.  (I was generating UTF-8 XML
> documents with no explicit encoding, and upon a cfg:read they were being
> copied back into XXE as my system encoding.)  I also think this would be
> more in tune with the specification.
> 

"cfg:read" is designed to load any text file, not only XML.

Without an XML declaration such as <?xml version="VVV" encoding="EEE"?>, 
the encoding detection mechanism of XML parsers is very weak. For example, 
a text file in the platform encoding (Windows-1252 or ISO-8859-1, say) 
would often be detected as UTF-8.
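
To illustrate, here is a rough Python sketch of what such detection boils 
down to. It is a simplification of the non-normative autodetection appendix 
of the XML recommendation, not the code of any real parser: the name 
guess_xml_encoding is made up for this example, and UTF-32 and EBCDIC are 
left out.

```python
import re

# Byte-order marks are checked first (UTF-32 and EBCDIC omitted in this sketch).
_BOMS = [
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xff\xfe", "UTF-16LE"),
    (b"\xfe\xff", "UTF-16BE"),
]

# Matches the encoding pseudo-attribute of an XML declaration at the very start.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def guess_xml_encoding(data: bytes) -> str:
    """Hypothetical helper: guess the encoding of an XML byte stream."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    # No BOM and no declaration: nothing left to inspect, so assume UTF-8.
    return "UTF-8"


# A Windows-1252 file with no BOM and no declaration.
legacy = "<note>café</note>".encode("cp1252")
print(guess_xml_encoding(legacy))  # prints UTF-8, which is wrong here
```

With neither a byte-order mark nor a declaration there is nothing to 
inspect, so the guess falls back to UTF-8 whatever the file really contains.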

Technically, there is no problem implementing what you suggest, but we 
fear that this would make "cfg:read" even more error-prone than it 
currently is.

I would rather suggest making the "encoding" attribute of "cfg:read" 
mandatory.

---
<cfg:read
   *file* = Path
   *encoding* = Any encoding supported by Java or "default"
/>
---

*mandatory attribute*
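
As an analogy only, the same idea in Python: a reader that refuses to guess. 
The function name read_text and its handling of "default" are assumptions 
made for this sketch and say nothing about how cfg:read is implemented.

```python
import locale
from pathlib import Path


def read_text(path: str, encoding: str) -> str:
    """Hypothetical analogue of cfg:read with a mandatory encoding."""
    if not encoding:
        raise ValueError("encoding is mandatory: pass a charset name or 'default'")
    if encoding == "default":
        # "default" explicitly requests the platform encoding.
        encoding = locale.getpreferredencoding(False)
    return Path(path).read_text(encoding=encoding)
```

The caller has to state the encoding, so a UTF-8 document can no longer be 
read silently as the system encoding.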


