- From: John Cowan <cowan@ccil.org>
- Date: Tue, 13 Mar 2007 16:45:12 -0400
- To: Richard Ishida <ishida@w3.org>
- Cc: www-international@w3.org
Richard Ishida scripsit:

> 1. A few years ago we introduced into the XML spec the idea
> that xml:lang="" conveys that 'there is no language information
> available'. (See 2.12 Language Identification[2])
>
> 2. An alternative is to use the value 'und', for 'undetermined'.
>
> 3. In the IANA Subtag Registry[3] there is another tag, 'zxx', that
> means 'No linguistic content'. Perhaps this is a better choice. It
> has my vote at the moment.

Rightly so. The other two choices indicate slightly different flavors
of ignorance about the content; if you *know* the content is
nonlinguistic, you should use "zxx".

> I'm not clear whether the HTML DTD supports an empty string value for
> lang. If so, the presumably the validator needs to be fixed. If not,
> then this is not a viable option, since you'd really want both lang
> and xml:lang to have the same values.

Neither the HTML 4 nor the XHTML 1.0 DTDs permit an empty value for
the lang attribute; XHTML 1.0 does not permit an empty value for the
xml:lang attribute either. IMHO XHTML 1.0 is obsolete in its treatment
of xml:lang. Whether you want the validator to override the DTD in
this respect is a question.

> Would the description 'undetermined' fit this case, given that it
> is not a language at all? Again, it doesn't seem right to me, since
> 'undetermined' seems to suggest that it is a language of some sort,
> but we're not sure which.

No, it means just that: undetermined; it might be a language or it
might be something else.

The "und" tag should be used only if silence is not an option, when a
format or protocol *insists* that a language tag be provided and the
language is not known. This is not the case in XML/HTML, where one can
simply omit the xml:lang and lang attributes.

However, occasionally it's necessary within a stretch of XML/HTML that
is language tagged, to have a portion for which the main language tag
is wrong but the correct alternative is unknown. 'xml:lang=""' was
introduced for this purpose.
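
To make the inheritance point concrete, here is a minimal sketch in
Python, not part of any spec. It assumes the usual XML rule that
xml:lang applies to an element's content unless overridden, and it
treats an empty value as "no information". The sample document, its
element names, and its ids are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: an English paragraph containing one span whose
# language is unknown and one span known to have no linguistic content.
DOC = '''<p xml:lang="en">The inscription reads
  <q id="unknown" xml:lang="">kresh vatu</q> and the part number is
  <code id="nonlinguistic" xml:lang="zxx">4471-A/9</code>.
  <em id="inherited">That is all.</em></p>'''


def effective_langs(elem, inherited=None, out=None):
    """Map each element id to its effective language (None = no information)."""
    if out is None:
        out = {}
    # An explicit xml:lang overrides whatever the ancestors declared.
    lang = elem.get(XML_LANG, inherited)
    # xml:lang="" cancels the inherited tag without asserting a new one.
    if lang == "":
        lang = None
    if "id" in elem.attrib:
        out[elem.get("id")] = lang
    for child in elem:
        effective_langs(child, lang, out)
    return out


langs = effective_langs(ET.fromstring(DOC))
```

Here `langs` maps "unknown" to None, "nonlinguistic" to "zxx", and
"inherited" to "en": the empty value resets the enclosing "en" to "no
information" rather than claiming anything about the span.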
Note that this form is specific to XML; RFC 4646 itself doesn't allow
zero-length language tags.

--
John Cowan  http://ccil.org/~cowan  cowan@ccil.org
In might the Feanorians / that swore the unforgotten oath
brought war into Arvernien / with burning and with broken troth.
and Elwing from her fastness dim / then cast her in the waters wide,
but like a mew was swiftly borne, / uplifted o'er the roaring tide.
        --the Earendillinwe

Received on Tuesday, 13 March 2007 20:45:16 UTC