Re: ISSUE-12: xs:string VS plain literals: proposed resolution from Andy Seaborne on 2011-05-07 (public-rdf-wg@w3.org from May 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Sat, 07 May 2011 14:19:07 +0100
To: public-rdf-wg@w3.org
Message-ID: <4DC546CB.8000905@epimorphics.com>

On 07/05/11 05:39, Pat Hayes wrote:
>
> On May 6, 2011, at 9:18 PM, Eric Prud'hommeaux wrote:
>
>> * Pat Hayes<phayes@ihmc.us>  [2011-05-06 18:26-0500]
>>>
>>> On May 6, 2011, at 12:11 PM, Andy Seaborne wrote:
>>>
>>>> It was Sandro who introduced SPARQL into the thread.  I don't
>>>> agree that it's a "grave mistake" in SPARQL.  Treatment should
>>>> be uniform whether using SPARQL or some other way of accessing
>>>> the data (SPARQL engines are often written over a base API
>>>> anyway).
>>>>
>>>> The proposed text is: """ Recommend that data publishers use
>>>> plain literals instead of xs:string typed literals and tell
>>>> systems to silently convert xs:string literals to plain
>>>> literals without language tag """
>>>>
>>>> This is an RDF-as-data view; this is not D-entailment.
>>>
>>> But my understanding is that the main (only?) reason for this
>>> suggestion is to make RDF data more accessible to SPARQL
>>> querying, because at present a query has to be couched in both
>>> forms in order to find both kinds of literal. If there is any
>>> other reason for this suggestion (which runs directly counter to
>>> all the thinking and discussion and advice that has so far been
>>> published on this topic since 2004) then I would like to see it
>>> spelled out in detail.

I agree it needs more detail - that's what I am exploring.

>>> And we should actively request input from
>>> OWL 2 and RIF representatives before making this recommendation.
>>>
>>> If this is the primary reason for this suggestion, then my point
>>> is that this effect - of having one query find both kinds of
>>> literal as answers - can be achieved by SPARQL using
>>> {xsd:string}-entailment rather than simple entailment. And,
>>> further, that if this is the only reason for this suggestion,
>>> that this is business for the SPARQL WG to consider rather than
>>> us. I do not believe that it is appropriate for us to recommend that
>>> people write their RDF graphs in a certain way, unless we have
>>> very strong reasons for this and can articulate them clearly (and
>>> then also explain why we did not alter RDF to make this
>>> suggestion mandatory, if the reasons are so strong.)
>>
>> SPARQL happens to use graph equivalence to establish a viable set
>> of variable bindings for a graph pattern, but I don't expect that
>> any of us think it will be the last tool to use graph equivalence.
>> What's important is that RDF provide the core property of
>> equivalence so that SPARQL, OWL 7, Revenge of RIF 2, etc. all work
>> with the same model (otherwise implementing something which e.g.
>> executes SPARQL queries over closures of OWL and RIF inference will
>> be unpleasantly fuzzy). SPARQL happens to play the canary as
>> it's the easiest way for us to test precise graph equivalence.
>
> OK, but (to make the same point in a slightly larger context), RDF
> already does provide this notion of equivalence. It is called
> {xsd:string}-entailment, and it is fully and thoroughly documented in
> the existing RDF specs. Two graphs are equivalent in the required
> sense when they {xsd:string}-entail each other.
>
> (BTW, SPARQL isn't going to work properly over mere closures in any
> case, even for OWL 1, let alone OWL 7 :-)
>
>>>> It is not necessarily a change to SPARQL query, which has to
>>>> work with old and new data.
>>>>
>>>> :x :p "foo" .
>>>> :x :p "foo"^^xsd:string .
>>>>
>>>> One triple or two? The proposal says (ideally) one.
>>>
>>> Actually, the proposal as written does not say this. This is
>>> definitely two literals. The proposal would rewrite this graph to
>>> one with a single literal, but it would not be the same graph.
>>
>> I think we're best off retroactively saying that every graph that
>> looks like this has only one triple.
>
> We can say this all we want, but saying it does not make it true.
> Right now, it is false. Those are two triples. If you want this to be
> one triple, you need to explain how to rewrite RDF Concepts to make
> it come out that way. Good luck.
>
> Pat
>
>> Telling the world that "abc"^^xsd:string is a deprecated form of
>> "abc" (and systems are encouraged to normalize) is probably the
>> best balance between simplification and disruption.
>>
>>
>>
>>> Pat

To clarify my example:

That's not a graph - it's a serialization.  My question is what graph 
does it produce when parsed given the proposed text.

If it were:

:x :p "foo" .
:x :p "foo" .

then it produces a graph with one triple.

:x :p "foo" .
:x :p "foo"^^xsd:string .

Is the effect of "and tell systems to silently convert xs:string 
literals to plain literals without language tag" supposed to cause one 
or two triples?

As someone who couldn't make the F2F (so all I see is the text), I'm 
trying to understand what exactly was meant here.  Before we discuss 
{xsd:string}-entailment, we need to know the starting graph.
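
To make the two possible readings concrete, here is a minimal sketch in plain Python (no RDF library). It treats a graph as a set of triples and a literal as a (lexical form, datatype, language tag) tuple. The names Literal, normalize and parse are hypothetical, and the convert flag is only my guess at what "silently convert" might mean at parse time; it is not taken from any spec or from the F2F minutes.

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class Literal(NamedTuple):
    """Illustrative literal: lexical form, optional datatype, optional language tag."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


def normalize(term):
    """Assumed reading of the proposed text: an xs:string typed literal
    becomes a plain literal without language tag; everything else is untouched."""
    if isinstance(term, Literal) and term.datatype == XSD_STRING:
        return Literal(term.lexical)
    return term


def parse(statements, convert=False):
    """Build a graph (a set of triples) from a sequence of serialized statements.
    Duplicate statements collapse because a graph is a set."""
    graph = set()
    for s, p, o in statements:
        graph.add((s, p, normalize(o) if convert else o))
    return graph


# The serialization from the example: one plain literal, one xs:string literal.
doc = [
    (":x", ":p", Literal("foo")),
    (":x", ":p", Literal("foo", XSD_STRING)),
]

print(len(parse(doc)))                # no conversion: the literals differ
print(len(parse(doc, convert=True)))  # conversion on parse: they coincide
```

Under the current specs the first reading applies and the graph has two triples; if conversion happens on the way in, the starting graph has one. Which of these the proposed text intends is the question.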

	Andy
Received on Saturday, 7 May 2011 13:19:39 UTC