- From: Bert Bos <bert@w3.org>
- Date: Thu, 16 Oct 2003 15:09:41 +0200
- To: Tex Texin <tex@i18nguy.com>
- Cc: www-style@w3.org, W3c I18n Group <w3c-i18n-ig@w3.org>
Tex Texin writes:
>
> Section 5.11.4 on :lang still references RFC 1766.
>
> Although HTML technically refers to 1766, XML has been upgraded to RFC 3066 for
> its support of 3 letter tags and also prescribes the empty tag to allow the
> removal of language info. And most browsers, I believe, do support RFC 3066 for
> html anyway.
>
> CSS should therefore address RFC 3066 and the empty tag as well.
>
> Does ':lang()' match elements that have language set to the empty tag?
>
> It might be thought to match all languages, since in the absence of simple
> selectors, * is presumed, so it is conceivable that absence of a tag might be
> equivalent to all. It might also be deemed to be an error to have no tag inside
> the parens.
> So the spec should address the issue.
Good point.
I think it is not useful to make ':lang()' match *all* languages,
since you can already do that by omitting the ':lang()' altogether.
Making it an error is a possibility.
But making it match elements with no language seems the most useful,
especially since it parallels RFC 3066.
>
> For the purposes of matching, I wonder if it makes sense to reference the RFCs
> at all. Isn't it really string matching based on strings formatted with hyphen
> separators? Does any software verify that the language tag contains
> appropriately registered codes or uses ISO codes? Should it be an error, or
> perhaps the rule ignored, if a CSS document specifies :lang(k9) since k9 is
> not an official language code or a properly formatted private code?
I like that suggestion: it removes a dependency.
The definition of the "|=" operator is already generic. It only
requires a UA to split a string value at every "-" and doesn't require
the string to be a valid language. The ':lang()' refers to that
definition and could be made generic as well, e.g.:
Current text in 5.11.4:

    The pseudo-class ':lang(C)' matches if the element is in language
    C. Here C is a language code as specified in HTML 4.0 [HTML40] and
    RFC 1766 [RFC1766]. It is matched the same way as for the '|='
    operator.

Proposed:

    The pseudo-class ':lang(C)' matches if the element is in language
    C. CSS doesn't define what are valid language names and the string
    C doesn't have to be a valid language name in the source document.
    It is matched the same way as for the '|=' operator.
And in 5.8.1, in the informative reference to RFC 1766, "1766" is
replaced by "3066".
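To illustrate the generic matching described above, here is a minimal
sketch in Python. It assumes the "|=" rule means: split both strings at
every "-" and check that the words of C are a leading run of the words of
the element's language value. The function name dash_match is hypothetical,
and case handling is deliberately left out, since the text above does not
address it.

```python
def dash_match(value: str, c: str) -> bool:
    """Sketch of '|='-style matching: C matches if its hyphen-separated
    words are a prefix of the value's hyphen-separated words.

    No check is made that either string is a registered language code.
    """
    value_words = value.split("-")
    c_words = c.split("-")
    # Compare only as many leading words as C contains.
    return value_words[:len(c_words)] == c_words


if __name__ == "__main__":
    for value, c in [("en-US", "en"), ("english", "en"), ("k9", "k9"),
                     ("", ""), ("en", "")]:
        print(repr(value), repr(c), dash_match(value, c))
```

Under these assumptions, ':lang(k9)' is matched like any other string, and
an empty C matches only an element whose language is the empty tag, which
is the behaviour suggested above as the most useful. Whether the spec
adopts that reading is still open.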
Bert
--
Bert Bos ( W 3 C ) http://www.w3.org/
http://www.w3.org/people/bos/ W3C/ERCIM
bert@w3.org 2004 Rt des Lucioles / BP 93
+33 (0)4 92 38 76 92 06902 Sophia Antipolis Cedex, France
Received on Thursday, 16 October 2003 09:09:53 UTC