- From: Martin Duerst <duerst@w3.org>
- Date: Fri, 17 Oct 2003 14:27:57 -0400
- To: Bert Bos <bert@w3.org>, <www-style@w3.org>, "'W3c I18n Group'" <w3c-i18n-ig@w3.org>
At 11:40 03/10/17 +0200, Bert Bos wrote:

>If I understand Richard correctly, he is suggesting that the CSS
>':lang()' selector is treated semantically rather than syntactically.
>In other words, ':lang(en)' means "English," not "a string starting
>with 'en'". That's interesting, but I think it will be too complex.
>Consider this XML-based language, that allows text either in French
>(0) or English (1):
>
>  <MYLITTLELANGUAGE>
>    <WORD LANG="0">arbre</WORD>
>    <WORD LANG="1">tree</WORD>
>  </MYLITTLELANGUAGE>
>
>Then this style rule would turn the word "tree" green:
>
>  WORD:lang(en) { color: green }

I agree that this would be too complex. But you seem to say that in
the above example,

  WORD:lang(1) { color: green }

would turn the word 'tree' green. The current wording could indeed be
interpreted in that way, but it would be a rather bad idea.

Regarding the following text:

  "For example, in HTML [HTML40], the language is determined by a
  combination of the "lang" attribute, the META element, and possibly
  by information from the protocol (such as HTTP headers). XML uses an
  attribute called xml:lang, and there may be other document
  language-specific methods for determining the language."

Rather than leaving the association with HTML and XML as examples in
an introductory/explanatory paragraph, the spec should say very clearly
that :lang works on the lang attribute in HTML and on the xml:lang
attribute in XML. It may also say that other document formats may use
:lang, but that they have to define how it applies to their format.
This should be worded so that it does not give the impression that
this includes XML-based languages (where we have xml:lang and don't
need something else on top of it).

With regard to HTML, the current text mentions META, but I think this
should be removed. The META http-equiv mechanism was designed to
generate HTTP headers, but has never been used that way. The only
place I know of where it is used these days is for 'charset'.
The example above should of course be rewritten to read:

  <MYLITTLELANGUAGE>
    <WORD xml:lang='fr'>arbre</WORD>
    <WORD xml:lang='en'>tree</WORD>
  </MYLITTLELANGUAGE>

That makes the example pointless for the point we are discussing, but
actually useful for styling.

>Wouldn't it be better to simply *recommend* that developers use codes
>as per RFC 3066, even if they only need two languages?

I think the best thing is to say that both HTML (although the spec
hasn't been updated yet) and XML use RFC 3066. I do not think that the
CSS spec should give recommendations to developers of new document
languages.

>How about the text I proposed earlier, but with an additional note
>(i.e., not normative):
>
>  The pseudo-class ':lang(C)' matches if the element is in language
>  C. CSS doesn't define what are valid language names and the string
>  C doesn't have to be a valid language name in the source document.
>  It is matched the same way as for the '|=' operator.

Actively suggesting "doesn't have to be a valid language name" will
lead people to bad ideas that we don't want them to have. Better:
"CSS doesn't define what are valid language names; this is defined by
the host language." (I hope 'host language' is the right term here.)

>  Note: It is recommended, however, that documents and protocols
>  indicate language using codes from RFC 3066 [RFC3066] or its
>  successor, and by means of "xml:lang" attributes in the case of
>  XML-based documents [XML].

This sounds good, but is dangerous. XML defines that xml:lang is used
for XML documents. CSS should just say that when styling XML
documents, :lang applies to xml:lang, which uses RFC 3066. There are
occasionally cases where other attributes are used to indicate
language in an XML format, but CSS implementations simply cannot know
about them.

Regards, Martin.
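
As an illustration of the reading argued for above (:lang applied to xml:lang, with '|='-style matching), here is a minimal sketch in Python. It is not taken from any CSS implementation. The names lang_matches and select_by_lang are hypothetical, the third WORD element (xml:lang='en-GB') is added only to show prefix matching, and folding case before comparing is an assumption based on RFC 3066 tags being case-insensitive. Inheritance of xml:lang from ancestors follows the XML specification.

```python
import xml.etree.ElementTree as ET

# Name under which ElementTree exposes the xml:lang attribute
# (the "xml" prefix is predefined by the XML namespaces spec).
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lang_matches(element_lang, selector_arg):
    """'|='-style match: the value equals the argument, or starts with
    the argument followed by '-'. Case folding is an assumption here."""
    if not element_lang or not selector_arg:
        return False
    value = element_lang.lower()
    wanted = selector_arg.lower()
    return value == wanted or value.startswith(wanted + "-")


def select_by_lang(element, selector_arg, inherited=None):
    """Yield elements whose effective xml:lang matches selector_arg.
    An element without xml:lang inherits the value of its parent."""
    lang = element.get(XML_LANG, inherited)
    if lang_matches(lang, selector_arg):
        yield element
    for child in element:
        yield from select_by_lang(child, selector_arg, lang)


# The rewritten example, plus one hypothetical 'en-GB' entry.
DOC = """<MYLITTLELANGUAGE>
  <WORD xml:lang='fr'>arbre</WORD>
  <WORD xml:lang='en'>tree</WORD>
  <WORD xml:lang='en-GB'>colour</WORD>
</MYLITTLELANGUAGE>"""

root = ET.fromstring(DOC)
english_words = [w.text for w in select_by_lang(root, "en")]
french_words = [w.text for w in select_by_lang(root, "fr")]
```

Under these assumptions, the argument "en" selects both the 'en' and the 'en-GB' words, while an argument such as "0" or "1" selects nothing, because only xml:lang is consulted and a private LANG attribute is ignored.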
Received on Friday, 17 October 2003 14:39:10 UTC