- From: Glenn Adams <glenn@skynav.com>
- Date: Wed, 28 Mar 2012 11:05:43 -0600
- To: Anne van Kesteren <annevk@opera.com>
- Cc: public-webapps@w3.org, Boris Zbarsky <bzbarsky@mit.edu>
- Message-ID: <CACQ=j+fKQzPxuq4VVO0CFc7Ao-vHTVKxZUOQXkBRSFhD84VCcg@mail.gmail.com>
On Wed, Mar 28, 2012 at 3:50 AM, Anne van Kesteren <annevk@opera.com> wrote:

> On Wed, 28 Mar 2012 08:52:25 +0100, Glenn Adams <glenn@skynav.com> wrote:
>
>> Well, that would define a specific, definite algorithm. Never mind that
>> it would introduce random bytes into DOMStrings that may or may not have
>> anything to do with character data.
>>
>
> That's false.
>

What is false? At present, the inflate algorithm does not make reference to any character encoding, so it just treats the data as bytes; therefore, it is *not* well defined when no character encoding is associated with the input byte sequence.

> Using iso-8859-1 is ambiguous as it is a common alias for windows-1252
> which is definitely not what we want here.

I'm not sure what you mean by ambiguous. If users/servers mislabel content as 8859-1 or if they insert non-8859-1 data into byte strings that are defined to be 8859-1, then that is a usage problem, not a spec problem.

My point about introducing random bytes has to do with whether the inflate algorithm is employed as is or in conjunction with a normative statement about how to (semantically) interpret the input byte string (to the inflate algorithm). If we declare (normatively, in the spec) that it is 8859-1 then the algorithm and spec are now well defined. However, absent a declaration of the encoding of the input byte string, the semantics of the inflate algorithm's output are not known.

I am assuming here that neither the inflate algorithm nor the (http) client is attempting to guess/sniff the encoding of the reason status string. Or are you suggesting otherwise?
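
To make the encoding question concrete, here is a minimal Python sketch; it is not taken from the spec or from this thread. It assumes that "inflate" means mapping each input byte to the code unit with the same numeric value. The helper name `inflate` and the sample reason bytes are hypothetical, invented for illustration. Under that assumption the output coincides with a strict ISO-8859-1 decode, and differs from a windows-1252 decode for bytes in the 0x80-0x9F range.

```python
def inflate(data: bytes) -> str:
    """Hypothetical helper: map each byte to the code point of the same value.

    Assumption: this mirrors the byte-for-byte "inflate" step under discussion;
    no character encoding is consulted.
    """
    return "".join(chr(b) for b in data)


# Invented sample reason phrase containing bytes >= 0x80.
reason = b"Caf\xe9 \x93OK\x94"

inflated = inflate(reason)            # byte values copied straight into the string
as_latin1 = reason.decode("latin-1")  # strict ISO-8859-1 interpretation
as_cp1252 = reason.decode("cp1252")   # windows-1252 interpretation

# Inflating and decoding as strict ISO-8859-1 give the same string ...
print(inflated == as_latin1)
# ... while windows-1252 maps 0x93/0x94 to curly quotes (U+201C/U+201D).
print(inflated == as_cp1252)
print([hex(ord(c)) for c in as_cp1252])
```

The sketch shows only that the byte-for-byte mapping is deterministic. Whether the resulting code units mean anything as characters still depends on what encoding, if any, the spec declares for the input byte string.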
Received on Wednesday, 28 March 2012 17:06:34 UTC