Re: Every CONSTRUCT is DISTINCT?

Bijan Parsia wrote:
> 
> On 15 Oct 2007, at 15:49, Lee Feigenbaum wrote:
> 
>> Bijan Parsia wrote:
>>> [I preface this with an "I don't want to delay anything even a little 
>>> bit!" comment. I don't!]
>>> <http://chatlogs.planetrdf.com/swig/2007-10-15#T14-35-05>
>>> It seems to me that CONSTRUCT implicitly DISTINCTs. Is this true? It 
>>> seems to me that there is room for a CONSTRUCT that had duplicate 
>>> triples in it (for the usual reasons of streamability).
>>> If there was a discussion about this point, and it's easy for someone 
>>> to dig out, I would appreciate a pointer.
>>
>> CONSTRUCT returns a graph. Whether the representation/serialization of 
>> the graph contains duplicate triples is irrelevant to the spec's 
>> concern, as far as I know. That is, my implementation can return:
>> :s :p :o .
>>
>> or it can return:
>> :s :p :o .
>> :s :p :o .
>> :s :p :o .
>> :s :p :o .
>>
>> ...and it's returning the same graph, and therefore it's returning the 
>> same results. (Both are representations of the same set of triples.)
>>
>> I'm sure someone else will correct me if I'm wrong.
> 
> We discussed this on IRC and this is a clever bit of spec reading. It 
> does then highlight the need for a CONSTRUCT DISTINCT.

Hmm, I don't see why... The spec defines CONSTRUCT and SELECT in terms 
of their mathematical (for lack of a better word) results: in CONSTRUCT's 
case that's a set of triples, and in SELECT's case it's a solution 
sequence. A set has no duplicates by definition, so there is nothing for 
a DISTINCT to remove. The only times the query language spec refers to 
serialization are in an informative example of RDF/XML results and in 
references to the SPARQL Query Results XML Format.
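
For what it's worth, here is a rough Python sketch of that distinction. 
It is not taken from the spec: the parse_triples helper and its toy 
one-triple-per-line parsing are assumptions made purely for illustration, 
and a real parser would have to handle literals, prefixes and so on.

```python
def parse_triples(text):
    """Collect the triples of a toy one-triple-per-line document into a set."""
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: simple ":s :p :o ." lines, no literals containing spaces.
        s, p, o = line.rstrip(".").split()
        triples.add((s, p, o))
    return frozenset(triples)


once = ":s :p :o ."
four_times = "\n".join([":s :p :o ."] * 4)

# Two different documents, one graph: the set absorbs the repeats.
same_graph = parse_triples(once) == parse_triples(four_times)

# A solution sequence is modelled here as a list, where repeats do count.
solutions_once = [{"s": ":s"}]
solutions_twice = [{"s": ":s"}, {"s": ":s"}]
same_solutions = solutions_once == solutions_twice
```

Under these assumptions same_graph comes out true and same_solutions 
comes out false, which is why DISTINCT has something to do for SELECT 
but not for CONSTRUCT.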

> Be that as it may, I as an implementor and a user would find it helpful 
> if there were a note pointing out this aspect. I confess that I would 
> never in this lifetime have come up with that reading. So, if it would 
> be possible to add a bit of text somewhere that clarified this point, I 
> think that'd be swell.

What would it say? As far as I can see, any confusion about whether to 
expect duplicates or not is really a product of the serialization rather 
than of the query language. Even the protocol doesn't mandate any 
particular serialization of an RDF graph. If there were a 
serialization that prohibited listing the same triple twice (is there 
one?), then I'd imagine that it would work fine with the protocol as-is.
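
To make the serialization side concrete, an implementation that wanted 
duplicate-free output could filter at write time. The following is only a 
sketch under my own assumptions: write_unique is a hypothetical helper, 
not part of any spec or library, and triples are plain 3-tuples of 
strings. Note the cost: it has to remember every triple it has written, 
which is exactly the streamability trade-off raised earlier in the thread.

```python
import io


def write_unique(triples, out):
    """Write each triple at most once, in first-seen order.

    Returns the number of triples actually written.
    """
    seen = set()  # grows with the number of distinct triples
    written = 0
    for triple in triples:
        if triple in seen:
            continue  # already emitted, skip the repeat
        seen.add(triple)
        out.write(" ".join(triple) + " .\n")
        written += 1
    return written


# Four copies of one triple plus one different triple.
stream = [(":s", ":p", ":o")] * 4 + [(":s", ":p", ":o2")]
buf = io.StringIO()
count = write_unique(stream, buf)
```

Either way, the consumer ends up with the same graph; the filter only 
changes how many bytes go over the wire.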

I'm not saying I object to a bit of (informative) text giving a heads-up 
somewhere... I'm just not sure where it would go and what it would say.

Lee

> Cheers,
> Bijan.
> 
> 

Received on Monday, 15 October 2007 18:47:08 UTC