- From: Boris Motik <boris.motik@comlab.ox.ac.uk>
- Date: Sat, 16 May 2009 18:01:58 +0200
- To: <public-rdf-text@w3.org>
Hello,

Andy has asked if I could summarize the proposal for resolving the issues surrounding rdf:text, so here it is.

rdf:text (as opposed to rdfs:Literal) complies with the definition of a datatype in the sense of both XML Schema and RDF. That is, it has a clearly defined

- lexical space,
- value space, and
- lexical-to-value mapping.

It seems that we can resolve most (all?) of the issues from the LC comment by the SPARQL WG if we change rdf:text to just a normal datatype whose value space "coincidentally" overlaps with the value space of xs:string, and whose typed literals are equivalent to plain RDF literals under *D-entailment*. In this way,

1) we do not affect SPARQL implementations that rely on simple entailment, and
2) we do not affect SPARQL implementations that rely on D-entailment but that do not know of rdf:text (i.e., that do not have rdf:text in their datatype map).

Several observations are important here:

* It is true that the lexical forms of rdf:text and xs:string (partially) overlap, and that the same lexical form is typically assigned different values. Consider, for example, the following literals and their associated data values:

  (1) "Hello@"^^xs:string ==> the string "Hello@"
  (2) "Hello@"^^rdf:text  ==> the string "Hello"

  Thus, despite the fact that the lexical forms are the same, the literals are mapped to different data values. Note, however, that this situation already arises with existing datatypes. For example, "a"^^xs:hexBinary and "a"^^xs:base64Binary are mapped to different data values, and the same holds for "1"^^xs:float and "1"^^xs:integer.

* It is true that different lexical forms may be assigned the same data value. Consider, for example, the following literals and their associated data values:

  (3) "Hello"^^xs:string ==> the string "Hello"
  (4) "Hello@"^^rdf:text ==> the string "Hello"

  This is important for applications that want to use SPARQL with D-entailment.
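
To make examples (1)-(4) concrete, here is a minimal Python sketch of the two lexical-to-value mappings. It is an illustration only, not part of the proposal: the names `value_of` and `LangString` are hypothetical, and splitting an rdf:text lexical form at its last "@" is an assumption based on the rdf:text draft.

```python
from typing import NamedTuple, Union

XS_STRING = "xs:string"
RDF_TEXT = "rdf:text"


class LangString(NamedTuple):
    """A (string, language tag) pair; hypothetical name used only in this sketch."""
    text: str
    lang: str


Value = Union[str, LangString]


def value_of(lexical: str, datatype: str) -> Value:
    """Map a lexical form to a data value for the two datatypes discussed here."""
    if datatype == XS_STRING:
        # xs:string: the value is the lexical form itself.
        return lexical
    if datatype == RDF_TEXT:
        # Assumption: the last "@" separates the string from the language tag.
        text, sep, lang = lexical.rpartition("@")
        if not sep:
            raise ValueError("an rdf:text lexical form must contain '@'")
        # An empty tag yields a plain string, i.e. a value shared with xs:string.
        return LangString(text, lang) if lang else text
    raise ValueError(f"datatype not covered by this sketch: {datatype}")


# (1) and (2): same lexical form, different values.
print(value_of("Hello@", XS_STRING), "|", value_of("Hello@", RDF_TEXT))
# (3) and (4): different lexical forms, same value.
print(value_of("Hello", XS_STRING) == value_of("Hello@", RDF_TEXT))
```

The sketch ignores language-tag case normalization and lexical-space validation, which a real implementation would have to address.
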
  The problems of rdf:text, however, are not unique; they exist in other datatypes as well. For example, the following literals have distinct lexical forms, but are assigned the same data value:

  (5) "1.0"^^xs:decimal ==> the integer 1
  (6) "1"^^xs:integer   ==> the integer 1

  If this causes problems for the definition of SPARQL's built-in functions, such problems are not caused by rdf:text. Rather, they are caused by the fact that the behavior of these functions might be unclear when used in a D-entailment regime. Such problems should be dealt with by the SPARQL WG (or any subset of it, such as SPARQL/OWL, that is interested in defining a proper D-entailment regime for SPARQL). As an rdf:text editor, however, I do not believe that rdf:text causes any additional problems; that is, if any problems are due to rdf:text, they are due to other XML Schema datatypes as well.

* The built-in functions STR, DATATYPE, and LANG operate on the lexical forms only, so there is no problem: they should be evaluated as specified in the present specification. Hence, we have the following behavior, which is a consequence of the fact that rdf:text is just another datatype.

  STR("Hello@"^^xs:string) = STR("Hello@"^^rdf:text) = "Hello@"
  STR("Hello@en") = STR("Hello@en"^^rdf:text) = STR("Hello@en"^^xs:string) = "Hello@en"

  DATATYPE("Hello@en"^^xs:string) = xs:string
  DATATYPE("Hello@en"^^rdf:text) = rdf:text
  DATATYPE("Hello@en") = xs:string
  DATATYPE("Hello"@en) = error

  LANG("Hello"@en) = "en"
  LANG("Hello@en") = LANG("Hello@en"^^rdf:text) = LANG("Hello@en"^^xs:string) = ""

* In D-entailment respecting rdf:text, any triple containing a literal "Hello"@en would also entail a triple containing "Hello@en"^^rdf:text, and vice versa. Similarly, any triple containing a literal "Hello@en" would also entail a triple containing "Hello@en@"^^rdf:text and a triple containing "Hello@en"^^xs:string.
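
  The equivalences just described can be sketched in Python as follows. This is a hedged illustration rather than a definition of any entailment regime: `Literal` and `equivalent_forms` are hypothetical names, only plain literals, xs:string, and rdf:text are handled, and the rdf:text lexical form is assumed to split at its last "@".

```python
from typing import NamedTuple, Optional, Set

XS_STRING = "xs:string"
RDF_TEXT = "rdf:text"


class Literal(NamedTuple):
    """An RDF literal as written: lexical form plus optional tag or datatype."""
    lexical: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


def equivalent_forms(lit: Literal) -> Set[Literal]:
    """Return all literal forms denoting the same value as `lit` when
    xs:string and rdf:text are both in the datatype map (sketch only)."""
    if lit.lang is not None and lit.datatype is not None:
        raise ValueError("a literal has a language tag or a datatype, not both")
    if lit.datatype == RDF_TEXT:
        # Assumption: the last "@" separates the string from the language tag.
        text, sep, lang = lit.lexical.rpartition("@")
        if not sep:
            raise ValueError("an rdf:text lexical form must contain '@'")
    elif lit.datatype in (None, XS_STRING):
        text, lang = lit.lexical, lit.lang or ""
    else:
        raise ValueError(f"datatype not covered by this sketch: {lit.datatype}")

    if lang:
        # Language-tagged value: plain literal with tag, or rdf:text typed literal.
        return {Literal(text, lang=lang),
                Literal(f"{text}@{lang}", datatype=RDF_TEXT)}
    # Untagged value: plain literal, xs:string literal, or rdf:text with empty tag.
    return {Literal(text),
            Literal(text, datatype=XS_STRING),
            Literal(text + "@", datatype=RDF_TEXT)}


for form in sorted(equivalent_forms(Literal("Hello@en")), key=str):
    print(form)
```

  Under these assumptions, the plain literal "Hello@en" yields itself, "Hello@en"^^xs:string, and "Hello@en@"^^rdf:text, matching the entailments listed above.
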
  That is completely analogous to the case of, say, xs:integer and xs:decimal: any triple containing "1"^^xs:integer should entail a triple containing the literal "1.0"^^xs:decimal, and so on. Again, rdf:text introduces no additional problems; furthermore, if there are any actual problems, these should be resolved by defining a proper D-entailment regime for SPARQL.

  Without trying to prejudge the definition of a D-entailment regime for SPARQL, a possible D-entailment regime could work such that the scoping graph for a BGP includes all forms of the relevant literals. For example, the scoping graph for a BGP

    :s :p "Hello".

  would be

    :s :p "Hello@"^^rdf:text.
    :s :p "Hello".

  Please note, however, that this is again independent of rdf:text per se, and that similar problems arise with other XML Schema datatypes, as outlined earlier in this e-mail.

As a consequence, I believe that the LC comment of the SPARQL WG should be addressed by simply removing any mention of literal replacement during graph exchange. This makes it clear that rdf:text is just another regular datatype that is in no way different from the other XML Schema or user-defined datatypes.

Regards,

Boris
Received on Saturday, 16 May 2009 16:03:29 UTC