- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Thu, 13 Sep 2001 14:39:37 +0100
- To: <w3c-rdfcore-wg@w3.org>
Here is a heads-up on where Bill and I have got to. ACTION: 2001-09-07#5: Jeremy Caroll Collaborate with Bill dehOra to produce analysis of the literal problem, options, pros/cons for WG consideration. We are expecting to continue this action into next week. The issues we are addressing are: 1: The Representation of Literals e.g. a pair of a Unicode String and an RFC 3066 language tag. 2: Equality a complete equality rule for literals, correcting para 217-219 and para 220, which currently leave parts of equality as undefined. 3: Preferred Mappings under rdf:parseType="Literal" (of xml document fragments into the representation of [1]). - Currently para 203 and para 220 are explicitly vague about this mapping. - We wish to specify one or more preferred mappings. - We may wish to specify more than one conformance level + motivation there are two significantly different use cases + quoting simply xhtml + quoting arbitrary XML fragments 4: Cases in which applications may expect problems. - Older RDF parsers will not be using the preferred mappings, we should indicate to document writers which XML fragments are likely to be problematic. A less ambitious approach would be to not specify a preferred mapping and simply specify a minimal requirement about some markup that should work (e.g. embedded xhtml without namespaces and without references). This would be consistent with M&S and put off the work M&S defers until RDF 2.0. Questions for WG for tomorrow: + Does the WG agree that a Literal is a <Unicode String,RFC 3066> pair? + Does the WG agree that Literal equality should be defined? + Does the WG agree that the new specs should descibe a specific Unicode string to be delivered by rdf:parseType="Literal"? Our current working text for the literal representation is: ========== An RDF Literal is* a Unicode string, optionally** paired with a language tag (as defined in RFC3066). When comparing two RDF Literals, their Unicode strings must be equal for the RDF Literals to compare as equal. If both Literals have language tags, these tags must be equal for the Literals to be considered equal. If two Literals are found equal but only one has a language tag, the Literals should not*** be considered equal. The equality of Unicode strings is specified by W3C I18N WG; see [fixme:url]. Language tag equality is defined by RFC3066 and is case insensitive. =========== * is represented as, or is a? Do we pass by reference or value :) ** equivalently we could delete 'optionally' and allow the language tag to be null, or default to "und" the ISO-639-2 undetermined language. Note, the following from RFC 3066, suggests 'Omitting'. 5. You SHOULD NOT use the UND (Undetermined) code unless the protocol in use forces you to give a value for the language tag, even if the language is unknown. Omitting the tag is preferred. ***the purpose of 'should not' is to allow applications some flexibility on dealing with language tags. That is, when a literal is equal to another but only one has a lag tag, they can be considered equivalent, which might be sufficient for some applications to make a match. The truth table corresponding to that notion of equality is: Truth table for equality (s1,t1) == string1, tag1; f* means should not be true; assume s1!=s t1!=t according to the specs in question. (s,_) (s,t) (s1,_) (s1,t1) -------------------------------------- (s,_) t f* f f (s,t) f* t f f (s1,_) f f t f* (s1,t1) f f f* t (s,t1) (s1,t) --------------- (s,t1) f* f f f t f (s1,t) f f f* f f t I am about to summarise some discussion between Bill and I on the rdf:parseType="Literal" question. Feedback welcome Jeremy
Received on Thursday, 13 September 2001 09:40:50 UTC