On Fri, Jan 6, 2012 at 7:36 PM, Jarred Nicholls <jarred@webkit.org> wrote:
>
> Correction: rfc4627 doesn't describe BOM detection, it describes zero-byte
> detection. My question remains, though: what exactly are you doing? Do
> you do zero-byte detection? Do you do BOM detection? What's the order of
> precedence between zero-byte and/or BOM detection, HTTP Content-Type
> headers, and overrideMimeType if they disagree? All of this would need to
> be specified; currently none of it is.
>
>
> None of that matters if a specific codec is the be-all and end-all. If
> that's the consensus then that's it, period.
>
> WebKit shares a single text decoder globally for HTML, XML, plain text,
> etc. The XHR payload runs through it before it is passed to JSON.parse.
> Read the code if you're interested. I would need to change the text
> decoder to skip BOM detection for this one case, unless the spec added
> wording about discarding when encoding != UTF-8; then that could be
> enforced entirely in XHR with no decoder changes. I don't want to get hung
> up on explaining WebKit's specific impl. details.
>
All of the details I asked about are user-visible, not WebKit
implementation details, and would need to be specified if encodings other
than UTF-8 were allowed. I do think this should remain UTF-8 only, but if
you want to discuss allowing other encodings, these are things that would
need to be defined (which requires a clear proposal, not "read the code").
I assume it's not using the exact same decoder logic as HTML. After all,
that would allow non-Unicode encodings.
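
To make the precedence question concrete, here is a rough Python sketch of
the two detection schemes mentioned above. The zero-byte table follows
rfc4627 section 3; the function names, the short-input fallback, and the BOM
ordering are assumptions made for illustration, not anything taken from a
spec or from WebKit.

```python
import codecs

# Zero-byte patterns for the first four octets, per RFC 4627 section 3.
# True marks a 0x00 octet; any pattern not listed is treated as UTF-8.
_ZERO_PATTERNS = {
    (True, True, True, False): "utf-32-be",   # 00 00 00 xx
    (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
    (False, True, True, True): "utf-32-le",   # xx 00 00 00
    (False, True, False, True): "utf-16-le",  # xx 00 xx 00
}

# UTF-32 BOMs come first because the UTF-32LE BOM starts with the
# UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_zero_bytes(data: bytes) -> str:
    """Hypothetical helper: guess the encoding from the zero-byte pattern."""
    if len(data) < 4:
        return "utf-8"  # assumption: too short to sniff, fall back to UTF-8
    return _ZERO_PATTERNS.get(tuple(b == 0 for b in data[:4]), "utf-8")


def sniff_bom(data: bytes):
    """Hypothetical helper: return the encoding named by a leading BOM, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


# A UTF-16LE payload that starts with a BOM: the two detectors disagree.
payload = codecs.BOM_UTF16_LE + '{"a": 1}'.encode("utf-16-le")
print(sniff_zero_bytes(payload), sniff_bom(payload))  # utf-8 utf-16-le
```

The same bytes come out as UTF-8 under zero-byte detection and as UTF-16LE
under BOM detection, before Content-Type or overrideMimeType are even
considered. The sketch deliberately does not pick a winner, since that
ordering is exactly the part that is unspecified.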
--
Glenn Maynard