- From: Bert Bos <bert@w3.org>
- Date: Thu, 16 Oct 2003 15:09:41 +0200
- To: Tex Texin <tex@i18nguy.com>
- Cc: www-style@w3.org, W3c I18n Group <w3c-i18n-ig@w3.org>
Tex Texin writes:
>
> Section 5.11.4 on :lang still references RFC 1766.
>
> Although HTML technically refers to 1766, XML has been upgraded to RFC 3066 for
> its support of 3 letter tags and also prescribes the empty tag to allow the
> removal of language info. And most browsers I believe do support RFC 3066 for
> HTML anyway.
>
> CSS should therefore address RFC 3066 and the empty tag as well.
>
> Does ':lang()' match elements that have language set to the empty tag?
>
> It might be thought to match all languages, since in the absence of simple
> selectors, * is presumed, so it is conceivable that absence of a tag might be
> equivalent to all. It might also be deemed to be an error to have no tag inside
> the parens.
> So the spec should address the issue.

Good point. I think it is not useful to make ':lang()' match *all*
languages, since you can already do that by omitting the ':lang()'
altogether. Making it an error is a possibility. But making it match
elements with no language seems the most useful, especially since it
parallels RFC 3066.

>
> For the purposes of matching, I wonder if it makes sense to reference the RFCs
> at all. Isn't it really string matching based on strings formatted with hyphen
> separators? Does any software verify that the language tag contains
> appropriately registered codes or uses ISO codes? Should it be an error, or
> perhaps the rule ignored, if a CSS document specifies :lang(k9) since k9 is
> not an official language code or a properly formatted private code?

I like that suggestion: it removes a dependency.

The definition of the "|=" operator is already generic. It only
requires a UA to split a string value at every "-" and doesn't require
the string to be a valid language. The ':lang()' refers to that
definition and could be made generic as well, e.g.:

Current text in 5.11.4:

    The pseudo-class ':lang(C)' matches if the element is in language
    C. Here C is a language code as specified in HTML 4.0 [HTML40] and
    RFC 1766 [RFC1766]. It is matched the same way as for the '|='
    operator.

Proposed:

    The pseudo-class ':lang(C)' matches if the element is in language
    C. CSS doesn't define what are valid language names and the string
    C doesn't have to be a valid language name in the source document.
    It is matched the same way as for the '|=' operator.

And in 5.8.1, in the informative reference to RFC 1766, "1766" is
replaced by "3066."

Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos/                              W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France
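
To illustrate the generic matching discussed in the message (split the
value at every "-" and compare, with no check that the string is a valid
language), here is a minimal Python sketch. The function name
dash_match, the case-insensitive comparison, and the handling of an empty
argument are assumptions made for illustration only; they are not taken
from the CSS specification text.

```python
def dash_match(value: str, pattern: str) -> bool:
    """Hypothetical helper: '|='-style matching on hyphen-separated strings.

    Splits both strings at every "-" and checks that the parts of
    `pattern` are a leading run of the parts of `value`. No check is made
    that either string is a registered or well-formed language tag.
    """
    # Assumption: compare case-insensitively, as language tags usually are.
    value_parts = value.lower().split("-")
    pattern_parts = pattern.lower().split("-")
    return value_parts[: len(pattern_parts)] == pattern_parts


if __name__ == "__main__":
    for value, pattern in [("en-US", "en"), ("en", "en-US"),
                           ("english", "en"), ("k9-x", "k9"),
                           ("", ""), ("en", "")]:
        print(repr(value), repr(pattern), dash_match(value, pattern))
```

In this sketch an empty argument matches only an empty language value,
which corresponds to the "match elements with no language" option, and
"k9" is accepted even though it is not a registered code.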
Received on Thursday, 16 October 2003 09:09:53 UTC