Dieter Maurer wrote:
Martijn Faassen wrote at 2007-1-15 15:44 +0100:

On 1/15/07, Andreas Jung <[EMAIL PROTECTED]> wrote:
ok, got it. But this problem can be solved easily by changing the encoding
within the preamble.
I would say refusing to guess and bailing out with an error message is
better in this case.

I disagree with you.

  Logically, parsing an encoded XML document consists of two
  passes: decode the encoded string into unicode and reconstruct
  the XML info elements from the serialization.

  Traditionally, these two passes are not performed one after
  the other but folded together in a single pass.
  But that tradition should not prevent us from separating out the
  (Unicode) decoding phase. And after this phase is done,
  there is no ambiguity left with the "XML declaration".
  Its encoding attribute is simply irrelevant for the second phase
  (apart from generating the PI info element).
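
To make the two-pass idea concrete, here is a minimal Python sketch. It is an illustration only, not code from this thread: the helper names decode_document and parse_text are hypothetical, the byte-level sniffing only works for ASCII-compatible encodings, and the standard library's xml.etree.ElementTree stands in for whatever parser is really in use. As far as I know, ElementTree ignores the declared encoding when it is handed already-decoded text, whereas some other parsers (lxml, for instance) refuse such input outright.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical helper pattern: finds encoding="..." in a leading XML
# declaration. Assumes an ASCII-compatible encoding, so the declaration
# itself can be read directly from the raw bytes.
_DECL_ENCODING = re.compile(
    rb'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def decode_document(raw: bytes, default: str = "utf-8") -> str:
    """Phase 1: turn the encoded byte string into Unicode text."""
    match = _DECL_ENCODING.match(raw)
    encoding = match.group(1).decode("ascii") if match else default
    return raw.decode(encoding)


def parse_text(text: str) -> ET.Element:
    """Phase 2: rebuild the element tree from already-decoded text.

    The encoding attribute in the declaration plays no role here,
    because no bytes are left to decode.
    """
    return ET.fromstring(text)


# An ISO-8859-1 encoded document with a matching declaration.
raw = (
    '<?xml version="1.0" encoding="iso-8859-1"?><doc>caf\u00e9</doc>'
).encode("iso-8859-1")

text = decode_document(raw)   # phase 1: bytes -> str
root = parse_text(text)       # phase 2: str -> elements
```

Keeping the phases in separate functions means the text could just as well come from a database or a form field that is already Unicode, in which case phase 1 is skipped.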

That's nice as far as it goes. What if, after the second phase, you need to parse the XML again? What do you do with your encoding header then? If it's irrelevant, you had better strip it out before you put it into the parser.
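
Stripping the declaration from already-decoded text could look roughly like the sketch below. The function name strip_declaration and the regular expression are hypothetical and deliberately simple. They assume the declaration, if present, sits at the very start of the text, and xml.etree.ElementTree is again only a stand-in parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: a leading "<?xml ... ?>" declaration plus any
# whitespace after it. The required whitespace after "xml" keeps other
# processing instructions such as <?xml-stylesheet ...?> untouched.
_XML_DECL = re.compile(r'^\s*<\?xml\s[^>]*\?>\s*')


def strip_declaration(text: str) -> str:
    """Drop the XML declaration from decoded text, if there is one."""
    return _XML_DECL.sub("", text, count=1)


decoded = '<?xml version="1.0" encoding="iso-8859-1"?>\n<doc>caf\u00e9</doc>'

clean = strip_declaration(decoded)   # no stale encoding claim left
root = ET.fromstring(clean)          # safe to parse again as text
```

If the text is later written back out as bytes, a fresh declaration that matches the output encoding would have to be added at that point.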



Zope3-dev mailing list
