- From: Bert Bos <bert@w3.org>
- Date: Fri, 17 Oct 2003 11:40:35 +0200
- To: <www-style@w3.org>, "'W3c I18n Group'" <w3c-i18n-ig@w3.org>
Richard Ishida writes:
> Bert Bos writes:
> > Tex Texin writes:
> > > For the purposes of matching, I wonder if it makes sense to
> > > reference the RFCs at all. Isn't it really string matching based
> > > on strings formatted with hyphen separators? Does any software
> > > verify that the language tag contains appropriately registered
> > > codes or uses ISO codes? Should it be an error, or perhaps the
> > > rule ignored, if a CSS document specifies :lang(k9) since k9 is
> > > not an official language code or a properly formatted private
> > > code.
> >
> > I like that suggestion: it removes a dependency.
> >
> > The definition of the "|=" operator is already generic. It only
> > requires a UA to split a string value at every "-" and doesn't
> > require the string to be a valid language. The ':lang()' refers to
> > that definition and could be made generic as well, e.g.:
> >
> > Current text in 5.11.4:
> >
> >     The pseudo-class ':lang(C)' matches if the element is in language
> >     C. Here C is a language code as specified in HTML 4.0 [HTML40] and
> >     RFC 1766 [RFC1766]. It is matched the same way as for the '|='
> >     operator.
> >
> > Proposed:
> >
> >     The pseudo-class ':lang(C)' matches if the element is in language
> >     C. CSS doesn't define what are valid language names and the string
> >     C doesn't have to be a valid language name in the source document.
> >     It is matched the same way as for the '|=' operator.
>
> I disagree with this proposed para. I think you are throwing out the
> baby with the bath water.
>
> I see the value of referring to RFC3066 is to ensure maximum
> standardisation/interoperability in the way language codes are used.
> For example, 3066 requires the use of 2-letter codes rather than
> 3-letter codes wherever they exist. This is important advice for
> interoperability. 3066 also says that you should use ISO codes rather
> than some arbitrary label where it exists. Etc.
>
> I think the original text was defining how one should label languages in
> CSS, not just how the matching should work. And I think it is important
> to retain the former, though the text could certainly be reworded so as
> to separate the two ideas, remove the HTML reference and refer to
> RFC3066.

If I understand Richard correctly, he is suggesting that the CSS
':lang()' selector is treated semantically rather than syntactically.
In other words, ':lang(en)' means "English," not "a string starting
with 'en'".

That's interesting, but I think it will be too complex. Consider this
XML-based language that allows text either in French (0) or English
(1):

    <MYLITTLELANGUAGE>
      <WORD LANG="0">arbre</WORD>
      <WORD LANG="1">tree</WORD>
    </MYLITTLELANGUAGE>

Then this style rule would turn the word "tree" green:

    WORD:lang(en) { color: green }

Wouldn't it be better to simply *recommend* that developers use codes
as per RFC 3066, even if they only need two languages?

How about the text I proposed earlier, but with an additional note
(i.e., not normative):

    The pseudo-class ':lang(C)' matches if the element is in language
    C. CSS doesn't define what are valid language names and the string
    C doesn't have to be a valid language name in the source document.
    It is matched the same way as for the '|=' operator.

    Note: It is recommended, however, that documents and protocols
    indicate language using codes from RFC 3066 [RFC3066] or its
    successor, and by means of "xml:lang" attributes in the case of
    XML-based documents [XML]. See "FAQ: Two-letter or three-letter
    language codes."[1]

    [1] http://www.w3.org/International/questions/qa-lang-2or3.html

(replaces the 2nd para in
http://www.w3.org/TR/CSS21/selector.html#lang)

Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos/                              W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France
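
For illustration only, the purely syntactic reading discussed in this
message (match the selector argument the same way as '|=', with no
check against RFC 3066) might be sketched roughly as follows. The
function name lang_matches is hypothetical, and the case-insensitive
comparison is an assumption of this sketch, not something the text
above states.

```python
def lang_matches(element_lang: str, selector_arg: str) -> bool:
    """Hypothetical helper: '|='-style match of a language value.

    True when the element's language value is exactly the selector
    argument, or begins with the argument immediately followed by "-".
    No validation against RFC 3066 or any registry is performed.
    """
    # Assumption of this sketch: compare case-insensitively.
    value = element_lang.lower()
    prefix = selector_arg.lower()
    return value == prefix or value.startswith(prefix + "-")


# Conventional codes match on the hyphen-separated prefix.
print(lang_matches("en-GB", "en"))   # True
print(lang_matches("en", "en-GB"))   # False
# "k9" is not a registered code, but the match is still plain string work.
print(lang_matches("k9", "k9"))      # True
# In the MYLITTLELANGUAGE example, LANG="1" does not start with "en",
# so a syntactic match cannot know that "1" means English.
print(lang_matches("1", "en"))       # False
```

Under this reading, WORD:lang(en) would only apply to the word "tree"
if the UA had some semantic knowledge that "1" denotes English, which
is the complexity the message argues against.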
Received on Friday, 17 October 2003 05:40:37 UTC