- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 23 Oct 2009 22:25:54 +0000 (UTC)
On Fri, 23 Oct 2009, Øistein E. Andersen wrote:
> On 23 Oct 2009, at 04:20, Ian Hickson wrote:
> > On Wed, 21 Oct 2009, Øistein E. Andersen wrote:
> > >
> > > ASCII-compatibility:
> > > The note in "2.1.5 Character encodings" seems to say that [...]
> > > ISO-2022[-*] are ASCII-compatible, whereas HZ-GB-2312 is not, and I
> > > cannot find anything in Section 2.1.5 that would explain this
> > > difference.
> >
> > HZ-GB-2312 uses the byte ASCII uses for "~" as the escape character.
> > ISO-2022-* uses the control codes. That's the difference.
>
> '~'/0x7E is not (and should not be, as far as I can tell) relevant for
> HTML5's concept of ASCII compatibility.

Good point. Moved the encoding over to the other side.

> The added note certainly helps, but it is vague (does "[m]ost of these
> encodings" mean "all the encodings mentioned above apart from UTF-32"?)
> and inaccurate (Philip Taylor's example does not rely on "bugs").
>
> Given that the set of encodings is open-ended, I still think it would be
> preferable to make the rationale (a definition of what makes an encoding
> problematic) primary and mention actual encodings as examples. This
> could give something like the following: "Encodings in which a series of
> bytes in the range 0x20..0x7E may encode characters other than the
> corresponding characters in the range U+20..U+7E represent a potential
> security vulnerability since a browser that does not support the
> encoding (or does not support the label used to declare the encoding, or
> does not use the same mechanism to detect the encoding of unlabelled
> content) might end up interpreting technically benign plain text content
> as HTML tags and JavaScript. In particular, this applies to encodings
> in which the bytes corresponding to '<script>' in ASCII may encode a
> different string. Authors should not use such encodings, which are known
> to include.... In addition, authors should not use UTF-32 ...."
>
> Alternatively, fixing the current note would help and might be
> sufficient, albeit not ideal.

I've reworded the spec based on your suggestion. Thanks!

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 23 October 2009 15:25:54 UTC
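
Illustration (not part of the original message): the proposed wording above turns on byte sequences that lie entirely in 0x20..0x7E yet decode to something other than the matching ASCII characters. The following is a minimal Python sketch of that situation, assuming Python's built-in "hz" codec as a stand-in for HZ-GB-2312. The payload and the helper name decodes_like_ascii are hypothetical, chosen only for this demonstration; this is not claimed to be Philip Taylor's example.

```python
# Sketch: the same bytes read as HZ-GB-2312 versus as plain ASCII.
# Assumption: Python's "hz" codec approximates what a browser with
# HZ-GB-2312 support would do; a browser without it is modelled by
# decoding the bytes as ASCII.

def decodes_like_ascii(raw: bytes, encoding: str) -> bool:
    """Return True if `raw` decodes to the same text under `encoding` as under ASCII."""
    return raw.decode(encoding) == raw.decode("ascii")

# Hypothetical payload: "~{" switches HZ into two-byte GB mode and "~}"
# switches back. Every byte here is in the range 0x20..0x7E.
payload = b"~{<script>~}"

as_hz = payload.decode("hz")        # four two-byte characters, no markup
as_ascii = payload.decode("ascii")  # the literal text, including "<script>"

print(repr(as_hz))
print(repr(as_ascii))
print(decodes_like_ascii(b"<script>", "utf-8"))  # UTF-8 keeps ASCII bytes as ASCII
print(decodes_like_ascii(payload, "hz"))         # HZ does not
```

A consumer that supports HZ sees harmless text, while one that falls back to an ASCII-compatible interpretation sees a script tag. That mismatch is the vulnerability the suggested wording describes.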