
Re: CSS2.1 :lang

From: Bert Bos <bert@w3.org>
Date: Thu, 16 Oct 2003 15:09:41 +0200
Message-ID: <16270.39061.194849.106208@lanalana.inria.fr>
To: Tex Texin <tex@i18nguy.com>
Cc: www-style@w3.org, W3c I18n Group <w3c-i18n-ig@w3.org>

Tex Texin writes:
> Section 5.11.4 on :lang still references RFC 1766.
> Although HTML technically refers to 1766, XML has been upgraded to RFC 3066 for
> its support of 3-letter tags and also prescribes the empty tag to allow the
> removal of language info. And most browsers, I believe, do support RFC 3066 for
> HTML anyway.
> CSS should therefore address RFC 3066 and the empty tag as well.
> Does ':lang()' match elements that have language set to the empty tag?
> It might be thought to match all languages, since in the absence of simple
> selectors, * is presumed, so it is conceivable that absence of a tag might be
> equivalent to all. It might also be deemed to be an error to have no tag inside
> the parens.
> So the spec should address the issue.

Good point.

I think it is not useful to make ':lang()' match *all* languages,
since you can already do that by omitting the ':lang()' altogether.

Making it an error is a possibility.

But making it match elements with no language seems the most useful,
especially since it parallels RFC 3066.

> For the purposes of matching, I wonder if it makes sense to reference the RFCs
> at all. Isn't it really string matching based on strings formatted with hyphen
> separators? Does any software verify that the language tag contains
> appropriately registered codes or uses ISO codes? Should it be an error, or
> perhaps the rule ignored, if a CSS document specifies :lang(k9), since k9 is
> not an official language code or a properly formatted private code?

I like that suggestion: it removes a dependency.

The definition of the "|=" operator is already generic. It only
requires a UA to split a string value at every "-" and doesn't require
the string to be a valid language. The ':lang()' refers to that
definition and could be made generic as well, e.g.:

Current text in 5.11.4:

    The pseudo-class ':lang(C)' matches if the element is in language
    C. Here C is a language code as specified in HTML 4.0 [HTML40] and
    RFC 1766 [RFC1766]. It is matched the same way as for the '|='
    operator.

Proposed text:

    The pseudo-class ':lang(C)' matches if the element is in language
    C. CSS doesn't define what are valid language names and the string
    C doesn't have to be a valid language name in the source document.
    It is matched the same way as for the '|=' operator.

And in 5.8.1, in the informative reference to RFC 1766, "1766" is
replaced by "3066."
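
To make the generic matching concrete, here is a minimal sketch in Python
of the rule described above: split both strings at every "-" and compare
the parts, with no check that they are valid language codes. The function
name lang_matches is hypothetical, an element without language
information is modelled as the empty string, and case handling is left
out; all of that is an assumption of this sketch, not text from the spec.

```python
def lang_matches(element_lang: str, c: str) -> bool:
    """Return True if element_lang equals c or begins with c followed by "-".

    Hypothetical helper: an illustration of generic '|='-style matching,
    not a definitive implementation of ':lang()'.
    """
    # Split both strings at every "-". Nothing here checks that the parts
    # are registered or well-formed language codes.
    element_parts = element_lang.split("-")
    wanted_parts = c.split("-")
    # C matches when its parts are a leading run of the element's parts.
    return element_parts[: len(wanted_parts)] == wanted_parts


print(lang_matches("en-US", "en"))    # True
print(lang_matches("english", "en"))  # False
print(lang_matches("k9-x", "k9"))     # True, although k9 is not a real code
print(lang_matches("", ""))           # True
print(lang_matches("en", ""))         # False
```

Under these assumptions an empty C matches only an element whose language
is the empty string, which corresponds to the "match elements with no
language" option. The other two options (match everything, or treat it as
an error) would need an explicit special case.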

  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos/                              W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France
Received on Thursday, 16 October 2003 09:09:53 UTC
