- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Mon, 29 Dec 2008 13:17:04 +0100
- To: Ian Hickson <ian@hixie.ch>
- CC: noah_mendelsohn@us.ibm.com, Arthur Barstow <art.barstow@nokia.com>, Bill McCoy <bmccoy@adobe.com>, Carl Cargill <cargill@adobe.com>, "eduardo.gutentag@oasis-open.org" <eduardo.gutentag@oasis-open.org>, "Henry.Story@Sun.COM" <Henry.Story@sun.com>, Jon Ferraiolo <jferrai@us.ibm.com>, Marcos Caceres <marcosscaceres@gmail.com>, Larry Masinter <masinter@adobe.com>, Michael Stahl <Michael.Stahl@sun.com>, Philippe Le Hegaret <plh@w3.org>, public-webapps <public-webapps@w3.org>, Richard Cohn <rcohn@adobe.com>, Svante Schubert <Svante.Schubert@sun.com>, Stephen Zilles <szilles@adobe.com>, "www-archive@w3.org" <www-archive@w3.org>, "www-tag@w3.org" <www-tag@w3.org>, www-tag-request@w3.org
Ian Hickson wrote:
> The way that IE and Firefox handle bytes with values greater than 0x7F
> when a file is labelled as being encoded as ASCII differs -- IE ignores
> the 8th bit, and only looks at the first seven bits, whereas Firefox
> treats bytes in the range 0x80 to 0xFF as being encoded as Windows-1252.
> This leads to security bugs, wherein the two browsers might treat the two
> strings differently (in particular, what looks like <script></script> to
> IE might look like something quite different to Firefox).
>
> I believe the ASCII specification should have defined how to convert any
> random byte stream into characters, including bytes that aren't in the
> range 0-127. That it didn't means that every language that allows ASCII
> has to define how to handle it, which is an abstraction violation, and
> results in different specs having different rules. In many cases, the
> layers above ASCII didn't define this, and we've ended up with very real
> security problems, such as the example above.
>
> Now in the case of ASCII doing this would be trivial -- e.g. just say that
> all bytes that aren't in the range 0x00 - 0x7F must be treated as 0x3F,
> and say that producers must not use bytes that aren't in the table. But
> yes, it should be in the ASCII spec.

Your assumption seems to be that there's a single "good" way to define this error handling. I disagree with that.

For instance, for XML, sending non-ASCII characters when the declared encoding is US-ASCII is a fatal error, and I definitely want it to stay that way.

BR, Julian
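For illustration only, the four error-handling behaviours mentioned in this exchange can be contrasted in a short Python sketch. The function names are hypothetical, and the code only models the behaviours as they are described in the message (it is not taken from any browser or XML parser); the sample input is an assumption chosen so that the high-bit bytes map onto `<` and `>` when the 8th bit is dropped.

```python
def decode_strip_high_bit(data: bytes) -> str:
    """Ignore the 8th bit of every byte (the IE behaviour as described)."""
    return bytes(b & 0x7F for b in data).decode("ascii")


def decode_as_windows_1252(data: bytes) -> str:
    """Treat 0x80-0xFF as Windows-1252 (the Firefox behaviour as described).

    The few byte values Windows-1252 leaves undefined become U+FFFD.
    """
    return data.decode("cp1252", errors="replace")


def decode_replace_with_question_mark(data: bytes) -> str:
    """Treat every byte outside 0x00-0x7F as 0x3F ('?'), as proposed in the quote."""
    return bytes(b if b < 0x80 else 0x3F for b in data).decode("ascii")


def decode_strict(data: bytes) -> str:
    """Reject non-ASCII bytes outright, analogous to XML's fatal error."""
    return data.decode("ascii")  # raises UnicodeDecodeError on bytes >= 0x80


# Assumed sample: 0xBC and 0xBE are '<' (0x3C) and '>' (0x3E) with the 8th bit set.
sample = b"\xbcscript\xbe"

print(decode_strip_high_bit(sample))
print(decode_as_windows_1252(sample))
print(decode_replace_with_question_mark(sample))
try:
    decode_strict(sample)
except UnicodeDecodeError as exc:
    print("fatal error:", exc.reason)
```

The same seven bytes become markup under the first rule and harmless text under the second and third, which is the discrepancy the quoted text warns about; the strict variant refuses the input instead of guessing.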
Received on Monday, 29 December 2008 12:17:48 UTC