On 03/14/2012 12:38 AM, Tab Atkins Jr. wrote:
On Tue, Mar 13, 2012 at 4:11 PM, Glenn Maynard <[email protected]> wrote:
The API on that wiki page is a reasonable start.  For the same reasons that
we discussed in a recent thread (
http://lists.w3.org/Archives/Public/public-webapps/2011JulSep/1589.html),
conversion errors should use replacement (eg. U+FFFD), not throw
exceptions.

Python throws errors by default, but both functions have an additional
argument specifying an alternate strategy.  In particular,
bytes.decode can either drop the invalid bytes, replace them with a
replacement char (which I agree should be U+FFFD), or replace them
with XML entities; str.encode can choose to drop characters the
encoding doesn't support.

For completeness, I note that Python also allows user-provided custom error handlers. I'm not suggesting we want this, but I would strongly prefer it to providing an XML-entity-encode option :)
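
To make the Python behaviour under discussion concrete, here is a minimal sketch written against Python 3 (an assumption; the thread does not say which version is meant). The handler name "hex_escape_demo" and the sample byte string are hypothetical, chosen only for illustration. As far as I know, in Python 3 the "xmlcharrefreplace" handler is documented for encoding only, so the sketch exercises it through str.encode rather than bytes.decode.

```python
import codecs

bad = b"caf\xff"  # 0xFF can never appear in well-formed UTF-8

# Default strategy ("strict"): decoding raises.
try:
    bad.decode("utf-8")
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True

# Alternate strategies selected with the extra `errors` argument.
dropped = bad.decode("utf-8", errors="ignore")    # invalid byte dropped
replaced = bad.decode("utf-8", errors="replace")  # invalid byte -> U+FFFD

# Encoding side: drop unsupported characters, or emit XML character references.
text = "caf\u00e9"
enc_dropped = text.encode("ascii", errors="ignore")
enc_xml = text.encode("ascii", errors="xmlcharrefreplace")

# User-provided handler (hypothetical name): render each bad byte as <XX>.
def hex_escape(exc):
    if isinstance(exc, UnicodeDecodeError):
        bad_bytes = exc.object[exc.start:exc.end]
        # Return the replacement text and the position to resume decoding at.
        return "".join("<%02X>" % b for b in bad_bytes), exc.end
    raise exc

codecs.register_error("hex_escape_demo", hex_escape)
custom = bad.decode("utf-8", errors="hex_escape_demo")
```

The "replace" case is the one that matches the U+FFFD behaviour proposed above; the custom handler is included only to show what the extension point looks like.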
