- From: Jeremy Carroll <jjc@hpl.hp.com>
- Date: Fri, 18 Oct 2002 22:02:23 +0200
- To: w3c-rdfcore-wg@w3.org
Summary: do nothing; worry what Tim will say next.

I am trying to reflect back to the WG the "advice" I heard at today's telecon.

The values for typed literals become apparent in a layered model theory.

[[ aside - a sketch of the model theory we outlined ...

The abstract graph has typed literals that are tidied on the basis of syntactic (string*3) identity.

An RDF(S) interpretation maps each typed literal to some value.

An RDF(S) interpretation conforms to some datatype d if every typed literal with datatype d is mapped to its value under that datatype.

Three natural levels of datatyping are:
+ XSD - the only interpretations considered are those that conform to all XSD built-in types.
+ none - interpretations are considered without taking datatype conformance into account
+ all - interpretations are only considered if they conform with all the datatypes that occur in the graph

However, it was noted that these datatyped interpretations are monotonic with respect to the set of datatypes conformed with, i.e. if a entails b with respect to a set D of datatypes and D' is a superset of D then a entails b with respect to D'
]]

Given that sort of approach to the semantics, it is helpful for the abstract syntax to identify how a datatype is applied to a typed literal, but it should be clear that such an application is not a syntactic requirement.

Surprisingly, that is precisely the text I had in my back pocket! See:
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Oct/0218.html

This took the "majority" position - a typed literal is a triple, it defines how a datatype URI might be related to a datatype, and how a typed literal might be related to a value; but then defines equality (explicitly) ignoring those definitions.

[[[
Within an RDF graph, a typed literal is a triple:
+ An RDF URI reference (the datatype URI).
+ A Unicode [UNICODE] string (the lexical form).
+ A language identifier

The datatype URI refers to a datatype.
For XML Schema built-in datatypes, URIs such as <http://www.w3.org/2001/XMLSchema#int> are used. There may be other, implementation dependent, mechanisms by which URIs refer to datatypes.

The typed value associated with the typed literal is found by applying the datatype mapping associated with the datatype URI to the lexical form. This mapping fails if the lexical form is not in the lexical space of the datatype associated with the datatype URI. However, the abstract syntax does not presuppose such datatype specific processing.

Two typed literals are equal if and only if all of the following hold:
+ The two datatype URIs compare equal, character by character.
+ The two lexical forms compare equal, character by character.
+ The two language identifiers compare equal (case insensitive comparison).
]]]

I am inclined to leave it alone.

Two issues are:
+ the dangling langtag is somewhat ugly (but as Pat pointed out that isn't much of an argument against Patrick's practical use cases)
+ should we try and have a more unified approach to literals?

I note in particular TBL's comments on XML Literal ...
http://www.w3.org/2002/07/29-rdfcadm-tbl.html

[[[
I have to say I have a problem with RDF being tied to always have to have an XML literal as a base type. This breaks layering - and level breaking features should I believe be left for another layer. You should not require any RDF machine to have to include an XML infoset system. The choice of XML syntax was supposed to be an enginering but arbitrary choice.
]]]

Given the deployed code using parseType="Literal" and the I18N use cases such as ruby and bidi, it's a non-starter to try and remove this functionality.

But if we had two new types, rdf:ClassicLiteral and rdf:ClassicXMLLiteral, then we could move all the complexity of XML Literals into a datatype definition. This would address TBL's issue here in that RDF, as an abstraction, would be free from the XML base.
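
For concreteness, the equality rule in the typed literal text quoted above could be sketched roughly as follows. This is only an illustration in Python, not anything from the draft: the names TypedLiteral, typed_literals_equal and typed_value are hypothetical, and Python's int() is only a rough stand-in for the xsd:int lexical-to-value mapping (it accepts some strings that xsd:int would not, and has no range limit).

```python
from dataclasses import dataclass

XSD_INT = "http://www.w3.org/2001/XMLSchema#int"

# Hypothetical table of datatype mappings (lexical form -> value).
# int() is only an approximation of the real xsd:int mapping.
DATATYPE_MAPPINGS = {XSD_INT: int}


@dataclass(frozen=True)
class TypedLiteral:
    """A typed literal as a triple of strings (hypothetical representation)."""
    datatype_uri: str
    lexical_form: str
    language: str = ""  # assumption: empty string stands for "no language identifier"


def typed_literals_equal(a: TypedLiteral, b: TypedLiteral) -> bool:
    """Purely syntactic equality: no datatype-specific processing is involved."""
    return (
        a.datatype_uri == b.datatype_uri                # character by character
        and a.lexical_form == b.lexical_form            # character by character
        and a.language.lower() == b.language.lower()    # case insensitive
    )


def typed_value(lit: TypedLiteral):
    """Apply the datatype mapping to the lexical form.

    Raises KeyError for an unknown datatype URI and ValueError when the
    lexical form is not in the lexical space (the mapping "fails").
    """
    return DATATYPE_MAPPINGS[lit.datatype_uri](lit.lexical_form)


one = TypedLiteral(XSD_INT, "1", "en")
one_upper = TypedLiteral(XSD_INT, "1", "EN")
padded = TypedLiteral(XSD_INT, "01", "en")

print(typed_literals_equal(one, one_upper))    # language identifiers differ only in case
print(typed_literals_equal(one, padded))       # lexical forms differ
print(typed_value(one) == typed_value(padded)) # yet the values coincide
```

The point of keeping the two functions separate is the one made above: equality in the abstract syntax ignores the value mapping, so "1" and "01" are distinct typed literals even though a datatype-aware interpretation maps them to the same value.
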

Disadvantages are:
+ defining a datatype outside XSD, not a team play
+ both these datatypes map from the (lexical form, langtag) pair to a value, rather than the XSD convention of mapping from the lexical form (alone) to a value; this would involve knock-on effects in our docs.
+ having to add new terms to the namespace, agree the terms, agree where to put the definition etc.
+ a peculiar equivalence where rdf:parseType="Literal" and rdf:datatype="&rdf;ClassicXMLLiteral" are sort of synonymous

Advantages are:
+ a unified framework for literals
+ possibly keeping TBL onside
+ the treatment of the langtag is more coherent with the decision to keep it in the abstract syntax
+ might allow further enhanced (non-XSD) datatypes that do good things with the lang tag.

I guess if there was some pull from the WG in this direction, I would be inclined to add a note to the doc:

[[
Note: the WG is still considering whether to unify the treatment of literals. This would involve regarding all literals as typed literals, and would use two new datatypes (rdf:ClassicLiteral, rdf:ClassicXMLLiteral) to correspond to the old String Literal and XML Literal respectively.
]]

Jeremy
Received on Friday, 18 October 2002 16:04:05 UTC