On Thu, Oct 18, 2012 at 1:49 AM, Anne van Kesteren <[email protected]> wrote:
> I added the API to the Encoding Standard:
>
> http://encoding.spec.whatwg.org/#api
>
> Feedback welcome. I suppose we might want to write an introduction for it
> too.

Thanks, Anne! Excellent cleanup, too.

On Thu, Oct 11, 2012 at 6:37 PM, Joshua Bell <[email protected]> wrote:
> > It sounds like there are several desirable behaviors:
> >
> > 1. ignore BOM handling entirely (BOM would be present in output, or fatal)
> > 2. if matching BOM, consume; otherwise, ignore (mismatching BOM would be
> > present in output, or fatal)
> > 3. switch encoding based on BOM (any of UTF-8, UTF-16LE, UTF-16BE)
> > 4. switch encoding based on BOM if-and-only-if "UTF-16" explicitly
> > specified, and only to one of the UTF-16 variants
>
> I went with supporting just 2 for now. 4 seems weird.

As per IRC discussion, if someone wants to implement this functionality it is fairly simple from script.

On Thu, Oct 18, 2012 at 11:24 PM, Anne van Kesteren <[email protected]> wrote:
> On Thu, Oct 18, 2012 at 4:16 PM, Glenn Maynard <[email protected]> wrote:
> > On Thu, Oct 18, 2012 at 3:54 AM, Anne van Kesteren <[email protected]> wrote:
> >> * TextDecoder.decode()'s view argument is no longer optional. Why should
> >> it be?
> >
> > It buffers the "EOF byte" when in streaming mode, eg. when the last byte of
> > the stream is a UTF-8 continuation byte, so any encode errors are triggered.
> >
> >> * TextEncoder.encode()'s input argument is no longer nullable. Again,
> >> why should it be?
> >
> > Likewise for encoding, to flush errors for trailing high surrogates.
>
> I made these arguments optional now (and named them both input). Note
> however that the way you get the EOF byte/EOF code point is by
> omitting the dictionary (whose stream member defaults to false), but I
> can see how not passing any arguments as a final call is convenient.
>
> https://github.com/whatwg/encoding/commit/39a201a5cdf43be3d49c6bac7952a0ecb225886b

Yes, purely convenience.
Otherwise you'd need to call:

decoder.decode(buffer1, {stream: true});
decoder.decode(buffer2, {stream: true});
decoder.decode(new Uint8Array());

> >> I also raised the issue of whether TextEncoder should really support
> >> utf-16/utf-16be as the encoding standard tries to deprecate non-utf-8
> >> encodings.
> >
> > The whole point of this API is to support legacy file formats that use other
> > encodings. (It's probably questionable to not support other encodings, too,
> > eg. filenames in ZIP file headers, but starting out with Unicode is fine.)
>
> I thought it was mostly about reading legacy formats, but fair enough.

Jonas did a straw poll via Twitter about whether encoding to UTF-16 was needed, and received positive feedback.
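For illustration, here is a rough sketch of the streaming pattern discussed above, with the argument-less final call used as the flush. It assumes a runtime that exposes TextDecoder as a global with an optional input argument; decodeChunks is a made-up helper name, not part of the API.

```javascript
// Sketch only: decodeChunks is a hypothetical helper, not part of the Encoding API.
// Assumes a global TextDecoder whose decode() accepts an optional input.
function decodeChunks(chunks, label = "utf-8") {
  const decoder = new TextDecoder(label);
  let text = "";
  for (const chunk of chunks) {
    // stream: true keeps an incomplete trailing byte sequence buffered
    // inside the decoder instead of reporting it as an error.
    text += decoder.decode(chunk, { stream: true });
  }
  // Final call with no arguments signals end of stream, so anything
  // still buffered is flushed (or reported as malformed).
  text += decoder.decode();
  return text;
}

// "é" is 0xC3 0xA9 in UTF-8; the two bytes are split across chunks here.
const chunks = [new Uint8Array([0x68, 0xc3]), new Uint8Array([0xa9, 0x21])];
const decoded = decodeChunks(chunks); // "hé!"
```

Without the final flush call, a stream that ends in the middle of a multi-byte sequence would never surface the error for the dangling bytes.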

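As a rough sketch of how behavior 3 (switch encoding based on BOM) might be done from script on top of behavior 2: sniff the leading bytes, pick a label, and let the decoder consume the matching BOM. The helper names sniffBOM and decodeWithBOM are invented for this example, and it assumes the runtime's TextDecoder supports the utf-8, utf-16le and utf-16be labels.

```javascript
// Sketch only: sniffBOM and decodeWithBOM are hypothetical helpers.
// Returns an encoding label based on a leading BOM, or the fallback.
function sniffBOM(bytes, fallback = "utf-8") {
  if (bytes.length >= 3 &&
      bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "utf-8";
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return "utf-16le";
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return "utf-16be";
  }
  return fallback;
}

function decodeWithBOM(bytes, fallback = "utf-8") {
  const label = sniffBOM(bytes, fallback);
  // Assumption: the decoder consumes a BOM that matches its own
  // encoding (behavior 2), so the BOM does not appear in the output.
  return new TextDecoder(label).decode(bytes);
}

// UTF-16LE BOM (FF FE) followed by "hi".
const utf16le = new Uint8Array([0xff, 0xfe, 0x68, 0x00, 0x69, 0x00]);
const sniffed = decodeWithBOM(utf16le); // "hi"
```

For a streamed input, the sniffing would have to wait until enough bytes have arrived to rule a BOM in or out, which this sketch does not attempt.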