On Wed, 27 May 2009 01:45:53 +0200, Travis Leithead <Travis.Leithead@microsoft.com> wrote:
> A. HTML5 would no longer be vulnerable to script injection from
> encodings such as UTF7 and EBCDIC which then tricks the auto-
> detection code to reinterpret the entire page and run the
> injected script.

Opera 10 does not support UTF-7, UTF-32, and EBCDIC for Web pages, regardless of rendering mode. So far we haven't run into issues. (I'm not sure EBCDIC was ever supported and UTF-32 support might have been removed earlier on.)

> B. HTML5 would be able to process markup more efficiently by
> reducing the scanning and computation required to merely
> determine the encoding of the file.

As Henri indicates this might be possible for all pages.

> C. Since sometimes the heuristics or default encoding uses
> information about the user's environment, we often see pages
> that display quite differently from one region to another.
> As much as possible, browsing from across the globe should
> give a consistent experience for a given page. (Basically, I
> want my children to one day stop seeing garbage when they
> browse Japanese web sites from the US.)

This is something I'd like to see solved as well, but I'd really like it solved in a way that also works for the pages already deployed.

> D. We'd greatly increase the consistency of implementation of
> markup handling by the various user agents. These openings
> for UA-specific heuristics and decisions, undermines the
> benefits of standards and standardization.

Yeah, ideally we document the exact algorithms used and have a fixed set of encodings user agents must support and also forbid any other encodings. Define exactly how a byte stream labeled with an encoding maps to Unicode, etc. Unfortunately I haven't found much time to look into this more.

-- 
Anne van Kesteren
http://annevankesteren.nl/

Received on Wednesday, 27 May 2009 09:33:30 GMT
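
To illustrate the "fixed set of encodings, forbid everything else" idea from the reply above, here is a minimal sketch in Python. The contents of ALLOWED, the exception name, and the function name are all hypothetical; the message does not say which encodings would belong in the fixed set, and this is not any browser's actual algorithm. It only shows a decoder that refuses labels outside an explicit list, so a UTF-7 label can never cause bytes to be reinterpreted as markup.

```python
# Hypothetical allowlist: maps lowercase labels to Python codec names.
# Which encodings belong here is an assumption made for illustration only.
ALLOWED = {
    "utf-8": "utf-8",
    "windows-1252": "cp1252",
    "shift_jis": "shift_jis",
    "euc-jp": "euc_jp",
}


class UnsupportedEncoding(ValueError):
    """Raised when a label is not in the fixed set of supported encodings."""


def decode_labeled(data: bytes, label: str) -> str:
    """Decode a byte stream using its declared label, or refuse outright."""
    key = label.strip().lower()
    codec = ALLOWED.get(key)
    if codec is None:
        # No guessing and no fallback to another codec: labels such as
        # UTF-7, UTF-32, or EBCDIC variants are simply rejected.
        raise UnsupportedEncoding(f"encoding not supported: {label!r}")
    # Malformed byte sequences become U+FFFD instead of raising.
    return data.decode(codec, errors="replace")
```

With this sketch, decode_labeled(b"+ADw-script+AD4-", "utf-7") raises UnsupportedEncoding rather than yielding "<script>", while the same bytes labeled as UTF-8 decode to the harmless literal text.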