- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Tue, 11 Mar 2003 08:36:26 +0000
- To: Graham Klyne <GK@ninebynine.org>
- CC: Tex Texin <tex@XenCraft.com>, Jeremy Carroll <jjc@hplb.hpl.hp.com>, www-rdf-comments@w3.org, W3c I18n Group <w3c-i18n-ig@w3.org>
RDF Core coordination:
Brian, please assign issue numbers for
- language tag case
- language ranges
Please add Tex to the williams-02 IRI issue.

Tex, I have specific questions for you in the text below, i.e.
(A) Would adding test case(s) suffice for the language tag case issue, or do
you request a note in the text?
(B) Is it worth adding a postponed issue for the language range comment, or
do you request further WG consideration now, or are you happy to withdraw
the comment?
(C) Do I18N-WG withdraw the advice that URIs in RDF should be in NFC, in
favour of advice that we should defer to namespaces 1.1?

Picking up Graham's initial response:

Graham Klyne wrote:
> Tex,
>
> Thank you for your comments.
>
> My co-editor may wish to pick up on some of your points, but meanwhile
> I'll respond, as he is travelling...

Thanks Graham ...

>
>> 1) The requirement for lang identifiers to be lowercase seems needless
>> (small cpu savings) and dangerous.
>
>
> I think there may be a misunderstanding here. There is no intent to
> require that language identifiers be lower case in RDF documents. The
> lowercasing is applied in the process of creating an RDF graph, which is
> an abstract syntax upon which the RDF formal semantics is based.
>
> [[
> A literal in an RDF graph contains three components called:
>
> The lexical form being a Unicode [UNICODE] string in Normal Form C [NFC].
> The language identifier as defined by [RFC-3066], normalized to lowercase.
> The datatype URI being an RDF URI reference.
> ]]
>
> The key phrase here is *in an RDF graph*. The normalization to lower
> case is applied precisely to achieve the case-insensitive comparison you
> request.
>
> Please let me know if you think this remains an issue.

In my view, it does.
Tex misread the text; we should ensure that that is a misreading and not a
reading.

Two things we can do are:
1) add new test cases to reflect that en-US, en-us and en-Us all mean the same
thing.
This would be easy, and unlikely to meet opposition. The test would show that
in RDF/XML documents there is no expectation of case normalization on
language tags; but that however you write the tag, it is the same tag.

2) [with more potential opposition]
Add a note to concepts like:
[[
Note: The case normalization of language tags is part of the description of
the abstract syntax, and implicitly the abstract behaviour of RDF
applications. It is not intended to constrain an RDF implementation to
actually normalize the case. Crucially, the result of comparing two language
tags should not be sensitive to the case of the original input.
]]

(A) Tex, would (1) adequately address this issue, or would you strongly prefer
text along the lines of (2)?

Historical note: earlier drafts did not normalize language tags but had an
explicit case-insensitive comparison (referencing RFC 3066). However, this
created difficulties in the semantics doc, and the current text provided a
fix. The alternative would be to have too many instances of 'case-insensitive
comparison of language tags' in the semantics doc.

>
>> 2) With respect to the rules for comparing literals:
>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>
>
> You ask for, e.g., (lang="en", str) to be equivalent to (lang="en-gb",
> str).
>
> I would oppose this change because this behaviour is explicitly
> discouraged by RFC 3066:
> [[
> 2.4 Meaning of the language tag
>
> The language tag always defines a language as spoken (or written,
> signed or otherwise signaled) by human beings for communication of
> information to other human beings. Computer languages such as
> programming languages are explicitly excluded. There is no
> guaranteed relationship between languages whose tags begin with the
> same series of subtags; specifically, they are NOT guaranteed to be
> mutually intelligible, although it will sometimes be the case that
> they are.
> ]]
> -- http://www.ietf.org/rfc/rfc3066.txt
>
> If you still feel that you would like the WG to consider your request,
> please let us know and I will ask for it to be raised as a last-call issue.
>

Tex seems to assume that we provide some mechanism for supporting such
comparisons, which are discussed in RFC 3066: specifically, the language range
construct of section 2.5 of RFC 3066.
We don't. RDF Model & Syntax didn't. It was not in our charter to explore such
mechanisms.

If the I18N-WG felt it important we could add a postponed issue to the RDF
issue list on this one. This would put down a marker for the future.

(B) Tex, would you want a postponed issue? Do you want us to consider language
ranges more widely? Or are you happy with the response: no, we don't do that?

>> 3) "RDF URI References" are defined and are essentially IRI. It would be
>> better if the spec could simply cite the upcoming IRI spec, ...
>
>
> This issue has already been raised to the WG as "williams-01"
> [http://www.w3.org/2001/sw/RDFCore/20030123-issues/#williams-02]
>

A concern is that the current text partially reflects a meeting between RDF
Core and I18N WGs at the Cannes plenary 2002. At that meeting the I18N WG were
supportive of the constraint on IRIs that they should be in NFC.

A plausible resolution to the williams issue is to propose that we rename "RDF
URI references" as "IRI" and defer to XML Namespaces 1.1.
This will have the substantive change of permitting identifiers which are not
in normal form C; these will be expressible in the abstract syntax, and in XML
1.0, but not in XML 1.1 (which is fully normalized).
I also note that XML Namespaces 1.1 uses the term "IRI" to include IRI
references.

A further note to this discussion is that there is deep opposition within the
WG to citing a draft from a normative document; and so Tex's apparent
preference for citing the IRI draft will not gather consensus.

(C) I request that I18N WG clearly indicate their preference here.
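
To make the NFC constraint in (C) concrete, here is a minimal sketch of what
"an identifier in Normal Form C" means in practice. It is illustrative only
and not taken from any RDF specification: the function name `is_nfc` and the
example.org identifiers are assumptions made for this example.

```python
import unicodedata


def is_nfc(identifier: str) -> bool:
    """Return True if the string is unchanged by NFC normalization.

    Hypothetical helper for illustration; not part of any RDF API.
    """
    return unicodedata.normalize("NFC", identifier) == identifier


# Two spellings of the same visible identifier (example.org is a placeholder).
composed = "http://example.org/caf\u00e9"     # precomposed e-acute
decomposed = "http://example.org/cafe\u0301"  # 'e' followed by combining acute

print(is_nfc(composed))    # True
print(is_nfc(decomposed))  # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Under the current constraint only the first spelling would be admissible;
under the proposed change both could appear in the abstract syntax as distinct
identifiers.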
> If you feel that consideration of this issue does not address your
> concern, please let us know that we can raise it as a separate issue.
>
> ...
>
> In summary, I have not yet requested that any of these be raised as
> formal last-call issues for the reasons given. If you find the reasons
> less-than-convincing, please reply and I shall take appropriate steps to
> have them considered.

My summary: in the absence of a reply I will propose to the WG:
- adding test cases on the language tag case issue (but not any note)
- dropping the language range issue
- formally requesting a response from I18N-WG on the IRI NFC sub-issue.

Jeremy
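
As a rough illustration of the language tag test cases proposed in the summary
above, the intended behaviour can be sketched as follows. This is an
assumption-laden sketch, not the actual test suite: the function names are
hypothetical, and lowercasing stands in for the normalization described in the
abstract syntax.

```python
def normalize_language_tag(tag: str) -> str:
    """Lowercase a language tag, as the abstract syntax describes.

    RFC 3066 tags are ASCII, so str.lower() is sufficient here.
    """
    return tag.lower()


def same_language_tag(a: str, b: str) -> bool:
    """Compare two language tags without regard to the case of the input."""
    return normalize_language_tag(a) == normalize_language_tag(b)


# However the tag is written in a document, it is the same tag.
print(same_language_tag("en-US", "en-us"))  # True
print(same_language_tag("en-US", "en-Us"))  # True

# No language range matching: a shared prefix does not make tags equal.
print(same_language_tag("en", "en-gb"))     # False
```

An implementation need not store the lowercased form; only the result of the
comparison has to be insensitive to case.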
Received on Tuesday, 11 March 2003 03:37:03 UTC