- From: Ian Hickson <ian@hixie.ch>
- Date: Mon, 20 Jul 2009 08:56:48 +0000 (UTC)
- To: "Dr. Olaf Hoffmann" <Dr.O.Hoffmann@gmx.de>
- Cc: public-html-comments@w3.org
On Mon, 6 Jul 2009, Dr. Olaf Hoffmann wrote:
>
> in the current draft are mentioned in 2.8
> http://www.w3.org/TR/2009/WD-html5-20090423/infrastructure.html#character-encodings-0
> some 'willful' misinterpretations of encoding information, for example
> to interpret a string like 'ISO-8859-1' as 'Windows-1252'.
>
> 1. Which string has an author to note, if he really wants to indicate, that
> the encoding is for example 'ISO-8859-1' and not 'Windows-1252'?

"ISO-8859-1". If the author has really used that encoding, then there is
no difference between them (1252 is a superset for text; the two differ
only at bytes 0x80-0x9F, which ISO-8859-1 assigns to C1 control characters
rather than to printable characters).

> 2. As far as I have seen, HTML5 has no version indication like previous
> versions of HTML had and other popular formats like SVG have.
> How can a browser identify, that a document is really intended as
> 'HTML5' with the implicated 'willful' misinterpretations of encoding
> information and no other HTML version?

It doesn't matter; all versions of HTML are in practice processed with
these mappings. It is indeed why HTML5 has these mappings -- because
browsers already did this. We wouldn't add these mappings if we didn't
need them to handle legacy content (content in previous versions of HTML).

> Assuming that a viewer is able to identify a document somehow being a
> HTML5 document after looking into the content and for example a server
> sent 'ISO-8859-1' before, does this mean, that the viewer switches to
> or reparses the document with 'Windows-1252' again?

I don't understand the question.

> Obviously it would be better to avoid such misinterpretation by using an
> encoding like UTF-8 not confused by the current HTML5 draft, however due
> to the history of older projects or server configurations it might be
> still convenient for many authors to continue to use 'ISO-8859-1'
> instead of other encodings, even if they switch for example from HTML4
> to HTML5 for some documents.

Hopefully my answers above will reassure you that this is not in fact a
problem that authors will face.
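To make the superset point concrete, here is a minimal Python sketch (not taken from the HTML5 draft) that compares how every single byte decodes under the two labels. The helper name `differing_bytes` is made up for this illustration, and it relies on Python's own codec tables, which leave five bytes in the 0x80-0x9F range undefined for windows-1252; those are decoded with `errors="replace"` here so the loop does not raise.

```python
def differing_bytes():
    """Return the byte values that decode differently under the two encodings."""
    diffs = []
    for b in range(256):
        raw = bytes([b])
        as_latin1 = raw.decode("iso-8859-1")
        # Python's cp1252 table has no mapping for 0x81, 0x8D, 0x8F, 0x90, 0x9D;
        # "replace" turns those into U+FFFD instead of raising an error.
        as_cp1252 = raw.decode("windows-1252", errors="replace")
        if as_latin1 != as_cp1252:
            diffs.append(b)
    return diffs


# Text that genuinely is ISO-8859-1 decodes identically either way.
sample = b"caf\xe9"
same = sample.decode("iso-8859-1") == sample.decode("windows-1252")

# Bytes in 0x80-0x9F are where the labels disagree: windows-1252 yields
# curly quotes here, while ISO-8859-1 yields C1 control characters.
quoted = b"\x93quoted\x94".decode("windows-1252")
```

Under these assumptions the only differing bytes are 0x80 through 0x9F, so a document that contains no bytes in that range reads the same whichever of the two labels the browser applies.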
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 20 July 2009 08:57:25 UTC