- From: Larry Masinter <masinter@adobe.com>
- Date: Sun, 31 May 2009 15:45:09 -0700
- To: Maciej Stachowiak <mjs@apple.com>
- CC: "M.T. Carrasco Benitez" <mtcarrascob@yahoo.com>, Travis Leithead <Travis.Leithead@microsoft.com>, Erik van der Poel <erikv@google.com>, "public-html@w3.org" <public-html@w3.org>, "www-international@w3.org" <www-international@w3.org>, Richard Ishida <ishida@w3.org>, Ian Hickson <ian@hixie.ch>, Chris Wilson <Chris.Wilson@microsoft.com>, Harley Rosnow <Harley.Rosnow@microsoft.com>
Changing the default charset from *something well known* to *something else* would be a bad idea -- that would be "default charset switching". But changing the charset from "unknown, please guess" to "UTF-8" doesn't seem like "default charset switching"; it's "default charset setting".

Setting a default charset may not be a good reason for a version indicator, but it's a supporting reason. If there were other reasons for having a version indicator (e.g., to support authoring requirements), the version indicator could also indicate a default charset of UTF-8.

Larry
--
http://larry.masinter.net

-----Original Message-----
From: Maciej Stachowiak [mailto:mjs@apple.com]
Sent: Sunday, May 31, 2009 3:35 PM
To: Larry Masinter
Cc: M.T. Carrasco Benitez; Travis Leithead; Erik van der Poel; public-html@w3.org; www-international@w3.org; Richard Ishida; Ian Hickson; Chris Wilson; Harley Rosnow
Subject: Re: Auto-detect and encodings in HTML5

On May 31, 2009, at 8:05 AM, Larry Masinter wrote:

> I believe the stance of most of the participants in the
> HTML working group is that no "version indicator" for
> HTML5 is necessary, and there is no specific
> "HTML5 doctype", against which newer, or stricter,
> behavior can be keyed.
>
> If charset defaulting is a reason for having a specific
> HTML5 version indicator, in order to trigger a stricter
> interpretation, say, of the default charset, that would
> be interesting.

I think it would be pretty poor if some indicator of the document version (e.g. the doctype or, as suggested by someone else, a version parameter in the Content-Type header) changed the default charset. There are two reasons I say this:

1) It goes against our desire to allow for gradual adoption. If changing your doctype declaration could have the side effect of changing your charset from Windows-1252 ("Windows Latin-1") to UTF-8, that would pose a serious risk of breaking upgraded documents.

2) Doctype and Content-type parameter are both opt-in mechanisms.
But there are already explicit ways to opt in to UTF-8: the charset parameter on Content-type, or a <meta> tag in the document. Explicit opt-in seems better to me than implicit, since it's more likely the author will be making a change intentionally.

It would be convenient if UTF-8 could be the default character set, but we can't safely apply that to legacy content, so we can't do it. Having it be the default under an opt-in doesn't really make it the default; it just adds a way to ask for UTF-8, though a subtle and implicit one. And the benefit does not seem great enough to add an additional implicit opt-in. WinLatin1 is not a broken encoding, and opting in to UTF-8 is already quite simple.

Regards,
Maciej
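
For illustration only, the explicit opt-in order described above could be sketched roughly as follows. This is a simplified sketch, not the HTML5 encoding-detection algorithm: the function names (`charset_from_content_type`, `pick_charset`), the regular expression, the 1024-byte scan window, and the omission of BOM sniffing and content auto-detection are all assumptions made for the example. The one point it tries to show is that the doctype plays no part in choosing the charset.

```python
import re
from email.message import Message

# Legacy fallback named in the message ("Windows Latin-1").
LEGACY_DEFAULT = "windows-1252"

# Hypothetical, deliberately loose pattern: matches both <meta charset="...">
# and <meta http-equiv="Content-Type" content="...; charset=...">.
_META_RE = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def charset_from_content_type(header):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    if not header:
        return None
    msg = Message()
    msg["Content-Type"] = header
    return msg.get_content_charset()


def pick_charset(content_type, body):
    """Pick a charset using only the explicit opt-ins, then the legacy default."""
    # 1. Explicit opt-in: charset parameter on the Content-Type header.
    explicit = charset_from_content_type(content_type)
    if explicit:
        return explicit
    # 2. Explicit opt-in: <meta> declaration near the start of the document.
    #    (The 1024-byte window is an arbitrary choice for this sketch.)
    match = _META_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. No opt-in: the doctype is never consulted, so the default is unchanged.
    return LEGACY_DEFAULT
```

Under these assumptions, a document that only gains `<!DOCTYPE html>` keeps the legacy default, while one that adds the charset parameter or a `<meta>` declaration gets UTF-8.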
Received on Sunday, 31 May 2009 22:46:05 UTC