- From: Maciej Stachowiak <mjs@apple.com>
- Date: Mon, 01 Jun 2009 18:23:42 -0700
- To: Boris Zbarsky <bzbarsky@mit.edu>
- Cc: Leif Halvard Silli <lhs@malform.no>, "public-html@w3.org" <public-html@w3.org>, "www-international@w3.org" <www-international@w3.org>
- Message-id: <870C5609-BD64-4610-9219-229D59A6FA3F@apple.com>
On Jun 1, 2009, at 6:07 PM, Boris Zbarsky wrote:

> Leif Halvard Silli wrote:
>
>> There is one aspect that you are - again - forgetting, and that is
>> authoring tools and web servers.
>
> I don't think Maciej forgot anything like that. He's talking about
> the proposal that was made: that HTML consumers (not producers)
> default to UTF-8 whenever they see "<!DOCTYPE html>". He is clearly
> talking about the case "unless the author explicitly declares an
> encoding", where "author" is anything that's producing HTML.
> "declares an encoding" could take the form of an HTTP header or a
> <meta> tag in the HTML.
>
>> If complying authoring tools had to default to UTF-8 whenever
>> someone select to create a HTML 5 document (much the same way that
>> XML default to UTF-8/-16), then that would be a bonus and
>> simplification and _motivation_ for using HTML 5.
>
> Presumably by "default" you mean encode it as UTF-8 and then include
> the appropriate <meta> tag? That sounds like a pretty good idea to
> me.
>
>> The next level should be that web servers defaults to sending a
>> charset header which said "UTF-8" whenever they saw the HTML 5
>> doctype.
>
> Very few web servers look inside the document content when deciding
> on headers. I don't believe the two most common ones (Apache and
> IIS) do so by default....
>
>> Thus we could leave the Web browser behaviour as drafted, but
>> require utf-8 as default from serves and authoring tools.
>
> I doubt you'll hear any browser developers complaining about this!
> I certainly have no objections to it. If authoring tools do in fact
> behave this way, then maybe at some point (decades from now, I
> suspect) we'll get to a world where we can start dropping support
> for encodings that are no longer in use because the documents have
> been transcoded to UTF-8 in the meantime.... Would be nice.

Agreed. I have no problem with authoring tools or servers producing
UTF-8 by default, as long as they explicitly flag it. In fact, HTML
tooling defaulting to UTF-8 would be great! But as I understand it,
the proposal on the table was to change the behavior of HTML
consumers, and that I would object to.

Regards,
Maciej
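
The producer/consumer distinction in this thread can be illustrated with a rough sketch. This is not code from any authoring tool or browser: `render_html5` and `declared_encoding` are hypothetical names, and the `<meta>` scan is a much-simplified stand-in for what a real consumer does. The producer side encodes as UTF-8 and flags it explicitly, both in a `<meta>` tag and in a `Content-Type` value; the consumer side only reports an encoding that was actually declared and infers nothing from the doctype.

```python
import html
import re
from typing import Optional, Tuple


def render_html5(title: str, body_markup: str) -> Tuple[bytes, str]:
    """Hypothetical producer: emit UTF-8 bytes and flag the encoding explicitly.

    Returns the encoded document and a matching Content-Type header value.
    """
    doc = (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '<meta charset="utf-8">\n'  # explicit in-document declaration
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"{body_markup}\n"
        "</body>\n</html>\n"
    )
    return doc.encode("utf-8"), "text/html; charset=utf-8"


def declared_encoding(content_type: Optional[str], payload: bytes) -> Optional[str]:
    """Hypothetical consumer-side check: return only an explicitly declared encoding.

    The HTTP header is consulted first, then a simplified <meta> scan of the
    first 1024 bytes. The doctype is deliberately ignored, so an undeclared
    document yields None instead of an assumed UTF-8.
    """
    if content_type:
        m = re.search(r'charset\s*=\s*"?([\w.:-]+)', content_type, re.I)
        if m:
            return m.group(1).lower()
    m = re.search(rb'<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)', payload[:1024], re.I)
    if m:
        return m.group(1).decode("ascii").lower()
    return None


payload, content_type = render_html5("Zażółć", "<p>gęślą jaźń</p>")
print(declared_encoding(content_type, payload))
print(declared_encoding(None, b"<!DOCTYPE html><p>hi</p>"))
```

In this sketch, what a consumer does when `declared_encoding` returns `None` is left open on purpose; that fallback is the consumer behavior the proposal would have changed.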
Received on Tuesday, 2 June 2009 01:24:23 UTC