Re: ZIP-based packages and URI references into them ODF proposal

Ian Hickson wrote:
> The way that IE and Firefox handle bytes with values greater than 0x7F 
> when a file is labelled as being encoded as ASCII differs -- IE ignores 
> the 8th bit, and only looks at the first seven bits, whereas Firefox 
> treats bytes in the range 0x80 to 0xFF as being encoded as Windows-1252. 
> This leads to security bugs, wherein the two browsers might treat the two 
> strings differently (in particular, what looks like <script></script> to 
> IE might look like something quite different to Firefox).
> 
> I believe the ASCII specification should have defined how to convert any 
> random byte stream into characters, including bytes that aren't in the 
> range 0-127. That it didn't means that every language that allows ASCII 
> has to define how to handle it, which is an abstraction violation, and 
> results in different specs having different rules. In many cases, the 
> layers above ASCII didn't define this, and we've ended up with very real 
> security problems, such as the example above.
> 
> Now in the case of ASCII doing this would be trivial -- e.g. just say that 
> all bytes that aren't in the range 0x00 - 0x7F must be treated as 0x3F, 
> and say that producers must not use bytes that aren't in the table. But 
> yes, it should be in the ASCII spec.

Your assumption seems to be that there's a single "good" way to define 
this error handling. I disagree with that.

For instance, in XML, sending non-ASCII characters when the declared 
encoding is US-ASCII is a fatal error, and I definitely want it to stay 
that way.
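
To make the difference concrete, here is a rough Python sketch of the four 
behaviours mentioned in this thread. The function names are made up for 
illustration, and none of this is taken from any browser's or parser's 
actual code; it only mimics the rules as described above.

```python
def strip_high_bit(data: bytes) -> str:
    """Ignore the 8th bit of every byte (the behaviour attributed to IE)."""
    return bytes(b & 0x7F for b in data).decode("ascii")


def as_windows_1252(data: bytes) -> str:
    """Read bytes 0x80-0xFF as Windows-1252 (the behaviour attributed to Firefox).

    A few bytes are unassigned in Python's cp1252 codec; errors="replace"
    maps those to U+FFFD instead of raising.
    """
    return data.decode("cp1252", errors="replace")


def replace_with_question_mark(data: bytes) -> str:
    """Treat every byte outside 0x00-0x7F as 0x3F ('?'), as Ian suggests."""
    return bytes(b if b < 0x80 else 0x3F for b in data).decode("ascii")


def strict(data: bytes) -> str:
    """Reject any byte outside 0x00-0x7F, in the spirit of XML's fatal error."""
    return data.decode("ascii")  # raises UnicodeDecodeError on such bytes


# 0xBC and 0xBE are '<' (0x3C) and '>' (0x3E) with the 8th bit set.
sample = b"\xbcscript\xbe"
```

With this sample, strip_high_bit() yields "<script>", as_windows_1252() 
yields "¼script¾", replace_with_question_mark() yields "?script?", and 
strict() raises an error. Each of these is a defensible choice for some 
consumer, which is why I don't think one rule belongs in the ASCII spec.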

BR, Julian

Received on Monday, 29 December 2008 12:17:51 UTC