- From: <bugzilla@jessica.w3.org>
- Date: Tue, 20 Sep 2011 17:48:36 +0000
- To: public-qt-comments@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=14227

- Summary: Full Text language option should address synonymy
- Product: XPath / XQuery / XSLT
- Version: Working drafts
- Platform: PC
- OS/Version: All
- Status: NEW
- Severity: normal
- Priority: P2
- Component: Full Text 3.0
- AssignedTo: holstege@mathling.com
- ReportedBy: cmsmcq@blackmesatech.com
- QAContact: public-qt-comments@w3.org

In the joint call of 20 September I was asked to raise a bug against Full Text's description of the language option.

Specifically, the text of the section on the language option needs to address the question of what to do when there are both two- and three-letter codes for a language (i.e. which should be used?).

The text of any description of the feature names used for language support, as sketched in Mary Holstege's mail at http://lists.w3.org/Archives/Member/w3c-xsl-query/2011Sep/0224.html may also need to address this question -- at the very least it should be consistent with the language option.

The value of the language option is required to be castable to xs:language, which means that its semantics eventually are based on RFC 3066 (in XSD 1.0) or its successor BCP 47 (in XSD 1.1). BCP 47 already addresses the question of preferring the two- or three-letter codes; it describes rules for a Preferred-Value field in the IANA Language Subtag Registry.

So in some sense, if we assume that the recommendations of BCP 47 are binding on the formulation of values for the language option and features, we may infer that FT already addresses the topic and there is not really any bug here. Empirically, however, today's call provides some evidence for the claim that the FT spec does not make its position on the matter adequately clear.
So perhaps it would be a good idea if the description of the language option, and the description of the class of feature names based on the language option, were to mention explicitly that where the relevant RFCs define more than one code for a language or language-locale combination, the provisions of BCP 47 regarding preferred values SHOULD be followed.

It would be nice if we could then say "For example, prefer 'deu' to 'de'", or "For example, prefer 'de' to 'deu'" -- that would require that someone actually wade through the details of BCP 47 and come out the other side with an answer to that question.

It might also be helpful to remind readers (with an example, or in a note) that the values of the language option might include codes like 'en-US', 'en-CA', and 'en-GB' for a hypothetical implementation with three different tokenizers for U.S. English, Canadian English, and British English.

Note: I think 'en-GB' is the right way to say 'British English' but if it's not, please substitute the correct way to say it.
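To make the suggestion concrete, here is a minimal Python sketch of how an implementation might normalize a language option value before choosing a tokenizer. It rests on assumptions that should be checked against the registry: my reading of RFC 5646 (the current BCP 47) is that when a language has both a two-letter ISO 639-1 code and a three-letter code, only the two-letter code is registered, which would make 'de' the form to prefer over 'deu'. The table `THREE_TO_TWO`, the functions `canonical_language` and `select_tokenizer`, the tokenizer names, and the truncation fallback are all hypothetical; nothing here is prescribed by the Full Text draft.

```python
# Hypothetical, hand-written excerpt: ISO 639-2 three-letter codes mapped to
# the two-letter subtags assumed to be the registered (preferred) form.
# A real implementation would consult the IANA Language Subtag Registry.
THREE_TO_TWO = {"deu": "de", "ger": "de", "eng": "en", "fra": "fr", "fre": "fr"}

# Hypothetical tokenizers for the three English variants mentioned above.
TOKENIZERS = {
    "en-US": "us_english_tokenizer",
    "en-CA": "canadian_english_tokenizer",
    "en-GB": "british_english_tokenizer",
}


def canonical_language(tag):
    """Normalize a language tag: preferred primary subtag, conventional case.

    Only language and region subtags are handled; scripts, variants and
    extensions are merely lowercased in this sketch.
    """
    subtags = tag.split("-")
    primary = subtags[0].lower()
    primary = THREE_TO_TWO.get(primary, primary)
    rest = [s.upper() if len(s) == 2 else s.lower() for s in subtags[1:]]
    return "-".join([primary] + rest)


def select_tokenizer(tag):
    """Pick a tokenizer, dropping trailing subtags until something matches."""
    key = canonical_language(tag)
    while key:
        if key in TOKENIZERS:
            return TOKENIZERS[key]
        key = key.rpartition("-")[0]  # "en-GB" -> "en" -> ""
    return None  # no tokenizer configured for this language


print(canonical_language("deu"))   # de
print(select_tokenizer("eng-gb"))  # british_english_tokenizer
```

Under these assumptions, 'deu', 'ger' and 'DE' all resolve to the same option value, so a query written with either code form would reach the same tokenizer.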
Received on Tuesday, 20 September 2011 17:48:42 UTC