Re: Every CONSTRUCT is DISTINCT?

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Mon, 15 Oct 2007 20:43:31 +0100
Message-Id: <21C6924D-52B0-4A0C-A1E8-804720EBCD8B@cs.man.ac.uk>
Cc: Lee Feigenbaum <lee@thefigtrees.net>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
To: "Seaborne, Andy" <andy.seaborne@hp.com>

On 15 Oct 2007, at 20:29, Seaborne, Andy wrote:

[snip]
> I was confused by that exchange on IRC:

Ok.

> http://www.w3.org/TR/rdf-concepts/#section-data-model says:
>
> "A set of such triples is called an RDF graph"
>
> The result of CONSTRUCT is an RDF graph.
>
> The serializations of RDF allow multiple occurrences of a triple -  
> it's convenient sometimes; it can even be very hard for say, GRDDL,  
> to transform to a set of triples.

Granted.

> This isn't spec weaselling.

Well, I didn't mean that as pejorative.

I think the reason I was confused (first thinking that dups were  
*definitely* in, and then thinking they were *definitely* out) is  
that it's pretty easy to think of a CONSTRUCT as being the result of  
XSLTing a SELECT result (or otherwise as concating a template filled  
from a result set). From this POV, CONSTRUCT DISTINCT makes sense, as  
does never doing that, but it's not obvious why some implementations  
do and some don't.
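
To make the "template filled from a result set" view concrete, here is a minimal sketch in plain Python. It is not taken from any SPARQL engine; the names `instantiate`, `Triple`, and `Solution`, and the sample data, are hypothetical. It only shows how filling a CONSTRUCT-style template once per solution can emit the same triple more than once, and how treating the output as a set (an RDF graph) removes the repeats.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical stand-ins: a triple is three strings, a solution maps
# variable names (e.g. "?x") to the terms they are bound to.
Triple = Tuple[str, str, str]
Solution = Dict[str, str]


def instantiate(template: List[Triple],
                solutions: Iterable[Solution]) -> Iterator[Triple]:
    """Fill the template once per solution, streaming triples as produced."""
    for solution in solutions:
        for s, p, o in template:
            # Replace each variable by its binding; constants pass through.
            yield (solution.get(s, s), solution.get(p, p), solution.get(o, o))


# Two solutions that differ only in ?y, which the template does not use.
solutions = [
    {"?x": ":alice", "?y": ":bob"},
    {"?x": ":alice", "?y": ":carol"},
]
template = [("?x", "rdf:type", ":Person")]

streamed = list(instantiate(template, solutions))  # what a streaming writer emits
graph = set(streamed)                              # what an RDF graph contains

print(len(streamed), len(graph))
```

Under these assumptions the streamed output carries the same triple twice, while the set built from it holds it once. Whether an implementation pays for the de-duplication before writing is the performance trade-off discussed in this thread.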

>   Duplicates happen in RDF serializations anyway.

Of course. But usually they are a consequence of some process  
incidental to the serialization, per se. In this case, my belief is  
that it's a consequence of not distincting the results.

> The system in question is merely making use of that feature (which  
> is nothing to do with SPARQL)

It has something to do with SPARQL, at least in that it's pretty easy  
to see why people might expect certain behavior from a SPARQL engine  
and it wasn't obvious to me why alternative, serialization-specific,  
behavior wasn't specified.

> for specific performance goals.

Sure.

>   If some system streams with duplicates and users don't like that,  
> discuss it with the system developers.  They have their reasons for  
> their implementation; there's a discussion to be had between user  
> and developer.

I've already granted that this is fully specified. It's not the  
specification I prefer, but that's fine at this date. But I was, in  
fact, confused by this point when I went to the spec. Some  
informative text along the lines I've proposed at the point I proposed  
would have helped me even if it's not, in your and Lee's view,  
strictly related.

Whether to help developers like me in this particular way is entirely  
up to you in your editorial role. I'm more than satisfied that this  
point has been heard, and, of course, my own confusion has been  
cleared up, so I'm happy.

> If you ask the SPARQL query "{ :s :p :o }" on the CONSTRUCT  
> results , there is zero or one matches. Two or more would be wrong.  
> If your RDF system reveals duplicates, you need to file a bug  
> report with the developers.

I suspect the original user was concerned by transmission issues, but  
I really don't know.
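
The "zero or one matches" point quoted above can be illustrated from the consumer's side. The following is a rough sketch, not a real N-Triples parser: `parse_simple_ntriples` is a hypothetical helper that assumes one triple per line, IRI terms only, and no literals or blank nodes. It shows that a serialization containing a repeated triple costs extra bytes on the wire but still loads into a graph in which the triple occurs once.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# A serialization in which the first triple is written twice.
document = """\
<http://example.org/s> <http://example.org/p> <http://example.org/o> .
<http://example.org/s> <http://example.org/p> <http://example.org/o> .
<http://example.org/s> <http://example.org/p> <http://example.org/other> .
"""


def parse_simple_ntriples(text: str) -> Set[Triple]:
    """Load a simplified, IRI-only N-Triples text into a set of triples."""
    graph: Set[Triple] = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Drop the terminating " ." and split into subject, predicate, object.
        s, p, o = line.rstrip(" .").split(None, 2)
        graph.add((s, p, o))  # adding to a set makes repeats harmless
    return graph


graph = parse_simple_ntriples(document)
pattern = ("<http://example.org/s>",
           "<http://example.org/p>",
           "<http://example.org/o>")
matches = [t for t in graph if t == pattern]

print(len(document.splitlines()), len(graph), len(matches))
```

The duplicate line is visible only in the transmitted text, which fits the guess that the original concern was about transmission rather than query results.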

Cheers,
Bijan.
Received on Monday, 15 October 2007 19:42:12 GMT
