- From: Koji Ishii <kojiishi@gluesoft.co.jp>
- Date: Mon, 27 Aug 2012 10:16:47 -0400
- To: Glenn Adams <glenn@skynav.com>
- CC: W3C Style <www-style@w3.org>, "public-i18n-cjk@w3.org" <public-i18n-cjk@w3.org>, "ML public-i18n-core (public-i18n-core@w3.org)" <public-i18n-core@w3.org>
+public-i18n-core@w3.org as the discussion is no longer only for CJK.

>>>>> The phrases "known to be X [language]" are completely undefined as far as
>>>>> the current text is concerned. If you want to have one note that covers all X,
>>>>> then by all means do so, but don't just leave it in such an undefined state.
>>>>
>>>> Did you follow the link? I think it's well-defined in Terminology section. The
>>>> section also has examples you requested.
>>>
>>> yes; my problem is the phrase "known to be Japanese or Chinese" does not map
>>> to "if the content language contains 'ja' or 'zh' or equivalent as its primary language
>>> subtag". same for the phrase "known to be Turkish" which also appears in another
>>> context in this document
>>
>> I agree that your suggested wording is easier to understand for HTML authors, but
>> it's not accurate because CSS does not define what the content document format is
>> and how content document determines the language. CSS Selectors Level 3[1]
>> informatively recommends content document to use BCP47, but it's still content
>> document that defines language syntax of the content document.
>>
>> The wording in our Terminology section[2] looks almost the same as the one in CSS
>> Selectors Level 3 to me; it defines our syntax, but does not define content
>> document syntax. It's hard for me to find good wording to improve this without
>> being incorrect.
>>
>> If you have suggested wording, I can run it by fantasai to put into the spec.
>
> I'm fine with the definition under the terminology section. I'm not fine with the
> "known to be X [language]" phrases. In the case of "known to be Japanese", one
> might expect a UA to interpret <p lang="en">この段落は日本語です</span> as
> Japanese, since you and I "know" it to be Japanese regardless of the @lang attribute.
>
> I would like "known to be X" to be revised to tie it to @lang (or equivalent), and not
> a textual/linguistic analysis of the text that determines the actual language of the
> content.

I'm afraid that if we say so, questions will arise, such as: is the HTTP
Content-Language header "@lang or its equivalent"? IE falls back to its
Tools/Options setting, whose initial value is set by the system language,
if no language is specified in HTML, in meta, or in HTTP. Is that included
in "@lang or its equivalent"?

I understand your motivation to make it easier to understand, and I agree
that is a good goal. But in my understanding, if we try that, we'll be less
accurate. If this were an easy thing, the i18n WG would not have needed to
write up a long best-practices note[1].

I'll ask the i18n WG for any better wording suggestions. If you have a good
suggestion, that's appreciated too. If nobody can come up with a better
suggestion, I think we should conclude that the current wording is the best
one. Does this sound reasonable?

[1] http://www.w3.org/TR/i18n-html-tech-lang/

Regards,
Koji
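
To make the two readings discussed above concrete, here is a minimal Python sketch of the "declared language" reading, in which "known to be X" is decided only from a declared language tag and never from analysis of the text itself. Everything in it is an assumption for illustration: the function names are hypothetical, the fallback order (element language, then meta, then HTTP, then a UA default) is just one possible order and is not defined by CSS, and each source is assumed to carry a single BCP 47-style tag.

```python
from typing import Optional


def primary_subtag(language_tag: str) -> str:
    """Return the lowercased primary language subtag, e.g. "zh" for "zh-Hant-TW"."""
    return language_tag.strip().split("-", 1)[0].lower()


def resolve_content_language(*sources: Optional[str]) -> Optional[str]:
    """Return the first non-empty declared language from the given sources.

    The caller chooses the priority order; which sources count as
    "@lang or its equivalent" is exactly the open question, so nothing
    here is normative.
    """
    for declared in sources:
        if declared and declared.strip():
            return declared.strip()
    return None


def known_to_be(content_language: Optional[str], *primaries: str) -> bool:
    """True if the declared language's primary subtag is one of `primaries`."""
    if content_language is None:
        return False
    return primary_subtag(content_language) in {p.lower() for p in primaries}


# The <p lang="en"> example: the element declares "en", so under this
# reading the paragraph is not "known to be Japanese", whatever the text is.
declared = resolve_content_language("en", None, "ja")
print(known_to_be(declared, "ja", "zh"))
```

Under this sketch, adding or removing a source from the fallback chain changes the answer for documents that declare no language, which is the ambiguity the wording would have to settle.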
Received on Monday, 27 August 2012 14:17:44 UTC