- From: <bugzilla@jessica.w3.org>
- Date: Mon, 29 Nov 2010 01:00:08 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11423

--- Comment #3 from Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com> 2010-11-29 01:00:08 UTC ---
(In reply to comment #2):
> EUC-KR and KS_C_5601-1987 are mapped onto windows-949. I think a "must"
> directive is definitely an encouragement, even if you don't.

Oh, I thought by "encouraging" you referred to things a spec can realistically influence (like future authoring) as opposed to UA behaviour required to process an existing web corpus. HTML5 can't retrospectively change the corpus.

Anyhow, as I read the spec, a conforming UA is free to fail to process documents labeled as EUC-KR and KS_C_5601-1987 on the basis that HTML5 maps them to Windows-949 for backwards compatibility with the web corpus, and it happens not to support Windows-949.

> > > It's not like registering a character set with IANA is a particularly difficult or drawn-out process
> >
> > And yet Microsoft's attempt to do so (back in 2005) seems to have failed:
> >
> > http://mail.apps.ietf.org/ietf/charsets/msg01510.html
>
> Probably because, as the responses indicate, the specifications for those
> character sets were insufficient and contradictory. It doesn't matter what
> exactly the reason is; it's not registered. HP, IBM, and Adobe have managed to
> do it, so I'm sure that it's not impossible or unreasonably difficult.

Big Blue managed to do it, so it's easy? Your standard of proof may be lower than mine here. ;)

> I believe "if there is one" means "if there is a name or alias labeled as
> 'preferred MIME name'", not "if there is an entry in the IANA Character Sets
> registry".

Hmm. I think your reading is correct. :(

> Even if we were to use your suggested interpretation, there are
> other names for this character set, such as "CP949". How are we to know what
> the preferred name is if it's not IANA-registered?
>
> > "User agents must at a minimum support the UTF-8 and Windows-1252 encodings,
> > but may support more."
>
> Right, but if they support EUC-KR or KS_C_5601-1987, they are effectively
> required to. (Actually, the spec seems to prohibit the useful implementation
> of EUC-KR, since it's mandated that user agents use something else instead.)

The spec effectively:

- prohibits implementing EUC-KR or KS_C_5601-1987;
- allows implementing Windows-949;
- requires mapping of EUC-KR or KS_C_5601-1987 to Windows-949, but does not require UAs to actually process such documents.

> > > I must therefore object to suggesting or encouraging the use of windows-949
> > > until it has been registered appropriately with IANA.
> >
> > Maybe try registering it? Perhaps you'll have better luck than Microsoft.
>
> I'm really not interested in registering what amount to platform-specific
> character sets.

Assuming you're interested in user agents being able to process the existing web corpus using only IANA-registered character sets, you perhaps should have some level of interest in doing so. ;)

> Finally, there are numerous character sets in existence that handle Korean just fine,
> including UTF-8, and I don't see the need to add more.

Which is why the spec recommends authors use UTF-8. :)

http://msdn.microsoft.com/en-gb/goglobal/cc305154.aspx (which the spec references) defines an authoritative mapping of windows-949 to Unicode.

If the spec simply defined the preferred name of Windows-949 as (case-insensitive) "Windows-949", could we close this bug?

--
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Monday, 29 November 2010 01:00:10 UTC
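
The three-point reading of the spec in the message above (labels EUC-KR and KS_C_5601-1987 are redirected to Windows-949, while a UA need only support UTF-8 and Windows-1252) can be sketched roughly as follows. This is an illustration only, not text from HTML5: the names LABEL_OVERRIDES, PYTHON_CODECS, effective_encoding and decode_document are hypothetical, and Python's "cp949" codec is assumed to be an adequate stand-in for Windows-949.

```python
# Labels that, per the message, HTML5 maps onto Windows-949.
LABEL_OVERRIDES = {
    "euc-kr": "windows-949",
    "ks_c_5601-1987": "windows-949",
}

# Hypothetical table from encoding name to a Python codec name.
# Assumption: Python's "cp949" approximates Windows-949.
PYTHON_CODECS = {
    "windows-949": "cp949",
    "windows-1252": "cp1252",
    "utf-8": "utf-8",
}


def effective_encoding(label):
    """Return the encoding a UA would actually use for a declared label."""
    key = label.strip().lower()
    return LABEL_OVERRIDES.get(key, key)


def decode_document(data, label, supported):
    """Decode data, failing if the UA lacks the mapped-to encoding."""
    name = effective_encoding(label)
    if name not in supported:
        # The mapping is required; supporting the target encoding is not.
        raise LookupError("unsupported encoding: " + name)
    return data.decode(PYTHON_CODECS.get(name, name))
```

Under these assumptions, a UA that supports only the minimum set (UTF-8 and Windows-1252) raises LookupError for a document labeled EUC-KR, while a UA that also supports Windows-949 decodes it with the Windows-949 decoder and never with a plain EUC-KR one.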