- From: Anne van Kesteren <annevk@annevk.nl>
- Date: Thu, 11 Oct 2012 06:09:09 +0200
- To: Joshua Bell <jsbell@chromium.org>
- Cc: WHATWG <whatwg@whatwg.org>
On Wed, Oct 10, 2012 at 7:28 PM, Joshua Bell <jsbell@chromium.org> wrote:
> On Wed, Oct 10, 2012 at 6:42 AM, Anne van Kesteren <annevk@annevk.nl> wrote:
>> I also still think it's kinda yucky that this API has this gigantic
>> hack around what the rest of the platform does with respect to the
>> byte order mark. It seems really weird to not expose the same
>> encode/decode that HTML/XML/CSS/etc. use.
>
> IMHO the API needs to support use cases: (1) code that wants to follow the
> behavior of the web platform with respect to legacy content (i.e. the
> desire to self-host), and (2) code that wants to parse files that are not
> traditionally "web" data, i.e. fragments of binary files, which don't have
> legacy behavior and where BOM taking priority would be surprising to
> developers. For #2, following the behavior of APIs like ICU with respect to
> BOMs is more sensible. I believe #2 is higher priority as long as it does
> not preclude #1, and #1 can be achieved by code that inspects the stream
> before handing it off to the decoder.
>
> Practically speaking, this would mean refactoring the combined spec so that
> the current BOM handling is defined for parsing web content outside of the
> API rather than requiring the API to hack around it.

You would still get the hack because the API requires special treatment
for "utf-16". Given that per Unicode "utf-16le" and "utf-16be" outlaw
the BOM, maybe a good solution would be a flag to disable BOM handling
as seen by the decode algorithm? So the decoder gets a disableBOM flag
that defaults to false? That would only require a special case for BOM
handling on top of what there is today, which seems a fair bit
cleaner.

> I received feedback recently that the API is perhaps too terse right now
> when dealing with streaming content, and a more explicit decode(),
> decodeStream(), resetStream() might be more intelligible. Thoughts?

Either way works for me.

--
http://annevankesteren.nl/
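
To make the disableBOM idea concrete, here is a rough Python sketch of how such a flag might behave. Everything in it is an assumption for illustration: the function name decode, the disable_bom parameter, and the restriction to three labels are hypothetical and not taken from the spec or any implementation. With the flag off, a leading BOM overrides the label and is consumed; with the flag on, the label is used as given and the BOM bytes are decoded like any other input.

```python
# Hypothetical sketch of a decoder with a "disable BOM" flag.
# Names and behavior are illustrative assumptions, not spec text.

# BOM byte sequences and the Python codec each one selects.
BOMS = (
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
)

# Labels accepted by this sketch, mapped to Python codec names.
# The bare "utf-16" label is left out on purpose to keep the sketch small.
CODECS = {
    "utf-8": "utf-8",
    "utf-16le": "utf-16-le",
    "utf-16be": "utf-16-be",
}


def decode(data: bytes, label: str, disable_bom: bool = False) -> str:
    """Decode data using label; unless disable_bom is set, a BOM wins."""
    codec = CODECS[label.lower()]
    if not disable_bom:
        for bom, bom_codec in BOMS:
            if data.startswith(bom):
                # The BOM takes priority over the label and is stripped.
                codec = bom_codec
                data = data[len(bom):]
                break
    # Malformed input becomes U+FFFD instead of raising.
    return data.decode(codec, errors="replace")
```

With the default, decode(b"\xff\xfeh\x00i\x00", "utf-8") follows the BOM and yields "hi". With disable_bom=True and the label "utf-16le", the same bytes keep the leading U+FEFF in the output, which is the behavior wanted for fragments of binary files.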
Received on Thursday, 11 October 2012 04:09:39 UTC