Re: Every CONSTRUCT is DISTINCT?

Lee Feigenbaum wrote:
> 
> Bijan Parsia wrote:
> [snip]
>>>> Be that as it may, I as an implementor and a user would find it 
>>>> helpful if there were a note pointing out this aspect. I confess 
>>>> that I would never in this lifetime have come up with that reading. 
>>>> So, if it would be possible to add a bit of text somewhere that 
>>>> clarified this point, I think that'd be swell.
>>>
>>> What would it say?
>>
>> "Please note that due to serialization freedom, the serialized results 
>> may contain, syntactically, duplicate triples. There is no way in 
>> SPARQL to force the endpoint to return a syntactically duplicate free 
>> CONSTRUCTed graph."
> 
> Thanks for the suggested text. From my point of view, in the end, this 
> is the editors' decision. As we're wrapping up loose ends, we'll 
> consider it at tomorrow's teleconference. Please feel free to attend if 
> you'd like to speak in favor of including some sort of note. (Either 
> way, we'll cover the issue and make a decision.)
> 
> Personally, I'd quite prefer that the query language draft not begin 
> talking about endpoints right now; it seems way out of scope to me.

Right - it's the protocol spec, if anywhere.  That deals with the on-the-wire 
format.

2.1.3 says:

"""
an RDF graph [RDF-Concepts] serialized, for example, in the RDF/XML syntax 
[RDF-Syntax], or an equivalent RDF graph serialization, for SPARQL Query for 
RDF query forms DESCRIBE and CONSTRUCT).
"""
so it refs RDF Concepts right after "graph", and there it says:

http://www.w3.org/TR/rdf-concepts/#section-data-model
"""
The underlying structure of any expression in RDF is a collection of triples, 
each consisting of a subject, a predicate and an object. A set of such triples 
is called an RDF graph (defined more formally in section 6).
"""

If anywhere - a note at that point would be possible.  To me (not a protocol 
editor), though, it's just an implementation technique a system has chosen 
because the RDF serializations permit it - cf. HTTP permits compression.
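
To illustrate the set reading, here is a minimal Python sketch, not taken from 
any spec or implementation. It models CONSTRUCT as the set-union of a template 
instantiated once per solution. The names TEMPLATE, instantiate and construct, 
the example.org data, and the strings-as-terms representation are all made up 
for illustration. It also assumes a template with no blank nodes (those are 
fresh per solution, so the shortcut would not hold for them).

```python
# Hypothetical, simplified model: terms are plain strings and
# variables are strings starting with "?".
TEMPLATE = [("?person", "<http://xmlns.com/foaf/0.1/name>", "?name")]


def instantiate(template, solution):
    """Substitute one solution's bindings into the template.

    Triples that still contain an unbound variable are dropped.
    """
    triples = []
    for triple in template:
        bound = tuple(solution.get(term, term) for term in triple)
        if not any(term.startswith("?") for term in bound):
            triples.append(bound)
    return triples


def construct(template, solutions):
    """Set-union of the instantiated template over all solutions."""
    graph = set()
    for solution in solutions:
        graph.update(instantiate(template, solution))
    return graph


solutions = [
    {"?person": "<http://example.org/alice>", "?name": '"Alice"'},
    {"?person": "<http://example.org/alice>", "?name": '"Alice"'},  # duplicate
    {"?person": "<http://example.org/bob>", "?name": '"Bob"'},
]

# Graph built straight from the solutions, duplicates included.
graph = construct(TEMPLATE, solutions)

# Graph built after de-duplicating the solutions first.
distinct_solutions = [
    dict(items) for items in {tuple(sorted(s.items())) for s in solutions}
]
graph_from_distinct = construct(TEMPLATE, distinct_solutions)
```

Under those assumptions the two graphs are the same set, so de-duplicating 
the solutions first changes nothing about the graph itself; it can only 
change how many times a triple is written out by a serializer that emits 
one statement per solution.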

> 
>>> As far as I can see, any confusion about whether to expect duplicates 
>>> or not is really a product of the serialization rather than of the 
>>> query language.
>>
>> I don't see why we can't informatively mention this from the query 
>> language spec. The consequence is that, as implementor, I don't have 
>> to distinct my results before constructing anything. That seems 
>> perfectly relevant in the query document.
> 
> I guess what I don't understand is where you, as an implementor, think 
> the query language spec says that you _do_ have to distinct the results. 
> I guess you're saying that the set-union implies that.
> 
>>> Even the protocol doesn't mandate any particular serialization of an 
>>> RDF graph. If there existed a serialization that prohibited listing 
>>> the same triple twice (are there?), then I'd imagine that it would 
>>> work fine with the protocol as-is.
>>
>> So we can serialize to Turtle? Isn't this a pretty big 
>> interoperability hole?

Turtle does not require a serialization to be free of duplicates.  Nor does 
N-Triples.
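
From the consumer side this is harmless. A rough Python sketch, assuming a 
line-based, N-Triples-like document where each non-comment line is one 
statement (read_statements is a made-up helper that only compares lines as 
text; it is not a real N-Triples parser):

```python
def read_statements(text):
    """Collect the statements of a line-based document into a set.

    Simplification: two statements count as equal only if their lines are
    textually identical after stripping surrounding whitespace.
    """
    statements = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        statements.add(line)
    return statements


# Illustrative document in which the same triple is written twice.
doc = """\
<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
<http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .
<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
"""

statements = read_statements(doc)
```

The document has three lines but the resulting set has two members, which is 
the sense in which the duplication is purely syntactic.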

> 
> I guess it depends what you mean by interoperability hole? In any case, 
> this dates from: http://www.w3.org/2006/01/10-dawg-minutes#item02 (and 
> then 
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/att-0113/12-dawg-minutes.html#item02 
> )
> 
>>> I'm not saying I object to a bit of (informative) text giving a 
>>> heads-up somewhere... I'm just not sure where it would go and what it 
>>> would say.
>>
>> I would put it right after the passage I quoted. I would put some 
>> wordsmithed version of what I wrote above.
> 
> As I said, thanks. Andy and Eric, what do you think?

As above - serialization is a matter for the protocol.  The query language does 
not cover the XML results format either.

	Andy

> 
> Lee
> 
>> Cheers,
>> Bijan.
>>
>>
> 


Received on Monday, 15 October 2007 20:02:21 UTC