- From: Anne van Kesteren <annevk@opera.com>
- Date: Wed, 27 May 2009 11:32:48 +0200
- To: "Travis Leithead" <Travis.Leithead@microsoft.com>, "public-html@w3.org" <public-html@w3.org>, "www-international@w3.org" <www-international@w3.org>, "Richard Ishida" <ishida@w3.org>, "Ian Hickson" <ian@hixie.ch>
- Cc: "Chris Wilson" <Chris.Wilson@microsoft.com>, "Harley Rosnow" <Harley.Rosnow@microsoft.com>
On Wed, 27 May 2009 01:45:53 +0200, Travis Leithead <Travis.Leithead@microsoft.com> wrote:
> A. HTML5 would no longer be vulnerable to script injection from
> encodings such as UTF7 and EBCDIC which then tricks the
> auto-detection code to reinterpret the entire page and run the
> injected script.

Opera 10 does not support UTF-7, UTF-32, and EBCDIC for Web pages, regardless of rendering mode. So far we haven't run into issues. (I'm not sure EBCDIC was ever supported and UTF-32 support might have been removed earlier on.)

> B. HTML5 would be able to process markup more efficiently by
> reducing the scanning and computation required to merely
> determine the encoding of the file.

As Henri indicates this might be possible for all pages.

> C. Since sometimes the heuristics or default encoding uses
> information about the user's environment, we often see pages
> that display quite differently from one region to another.
> As much as possible, browsing from across the globe should
> give a consistent experience for a given page. (Basically, I
> want my children to one day stop seeing garbage when they
> browse Japanese web sites from the US.)

This is something I'd like to see solved as well, but I'd really like it solved in a way that also works for the pages already deployed.

> D. We'd greatly increase the consistency of implementation of
> markup handling by the various user agents. These openings
> for UA-specific heuristics and decisions, undermines the
> benefits of standards and standardization.

Yeah, ideally we document the exact algorithms used and have a fixed set of encodings user agents must support and also forbid any other encodings. Define exactly how a byte stream labeled with an encoding maps to Unicode, etc. Unfortunately I haven't found much time to look into this more.

-- 
Anne van Kesteren
http://annevankesteren.nl/
Received on Wednesday, 27 May 2009 09:33:33 UTC