Ian Hickson wrote:

> The way that IE and Firefox handle bytes with values greater than 0x7F
> when a file is labelled as being encoded as ASCII differs -- IE ignores
> the 8th bit, and only looks at the first seven bits, whereas Firefox
> treats bytes in the range 0x80 to 0xFF as being encoded as Windows-1252.
> This leads to security bugs, wherein the two browsers might treat the two
> strings differently (in particular, what looks like <script></script> to
> IE might look like something quite different to Firefox).
>
> I believe the ASCII specification should have defined how to convert any
> random byte stream into characters, including bytes that aren't in the
> range 0-127. That it didn't means that every language that allows ASCII
> has to define how to handle it, which is an abstraction violation, and
> results in different specs having different rules. In many cases, the
> layers above ASCII didn't define this, and we've ended up with very real
> security problems, such as the example above.
>
> Now in the case of ASCII doing this would be trivial -- e.g. just say that
> all bytes that aren't in the range 0x00 - 0x7F must be treated as 0x3F,
> and say that producers must not use bytes that aren't in the table. But
> yes, it should be in the ASCII spec.

Your assumption seems to be that there's a single "good" way to define this
error handling. I disagree with that. For instance, for XML, sending
non-ASCII characters when the declared encoding is US-ASCII is a fatal
error, and I definitely want it to stay that way.

BR, Julian

Received on Monday, 29 December 2008 12:17:48 UTC
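[To illustrate the divergence being discussed, here is a minimal Python sketch of the three error-handling strategies: IE-style bit masking, Firefox-style Windows-1252 fallback, and the 0x3F substitution Ian proposes. It is not any browser's actual code; the function names and the payload bytes are invented for illustration.]

def decode_ie_style(data: bytes) -> str:
    # IE-style handling: ignore the 8th bit, keeping only the low seven bits.
    return "".join(chr(b & 0x7F) for b in data)

def decode_firefox_style(data: bytes) -> str:
    # Firefox-style handling: treat bytes 0x80-0xFF as Windows-1252.
    return data.decode("windows-1252")

def decode_replace_with_0x3f(data: bytes) -> str:
    # The proposed rule: any byte outside 0x00-0x7F becomes 0x3F ('?').
    return "".join(chr(b) if b <= 0x7F else "?" for b in data)

# 0xBC and 0xBE are '<' and '>' with the 8th bit set.
payload = b"\xbcscript\xbealert(1)\xbc/script\xbe"

print(decode_ie_style(payload))           # <script>alert(1)</script>   -- executable markup
print(decode_firefox_style(payload))      # ¼script¾alert(1)¼/script¾   -- harmless text
print(decode_replace_with_0x3f(payload))  # ?script?alert(1)?/script?   -- harmless text

[The divergence is exactly the security problem described above: a filter that decodes the payload like Firefox sees no markup at all, while a consumer that masks the 8th bit like IE sees a script element.]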