- From: Bijan Parsia <bparsia@cs.man.ac.uk>
- Date: Mon, 15 Oct 2007 20:04:59 +0100
- To: Lee Feigenbaum <lee@thefigtrees.net>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
On 15 Oct 2007, at 19:46, Lee Feigenbaum wrote:

> Bijan Parsia wrote:
>> On 15 Oct 2007, at 15:49, Lee Feigenbaum wrote:
[snip]
>> We discussed this on IRC and this is a clever bit of spec reading.
>> It does then highlight the need for a CONSTRUCT DISTINCT.
>
> Hmm, I don't see why... The spec. defines CONSTRUCT and SELECT in
> terms of the mathematical (for lack of a better word) results - in
> CONSTRUCT's case it's a set of triples and in SELECT's result it's
> a solution sequence. The only time the query language spec. refers
> to serialization is in an informative example of RDF/XML results
> and in references to the SPARQL Query Results XML Format.

That doesn't mean that it couldn't. Frankly, I'm nowhere near as
blasé as you about treating this as merely a serialization issue. I
concede it can be treated that way. However, it's not like
implementations produce a graph in some internal representation, then
say, "oh what the heck, let's insert some dups". I believe they are
streaming out the answers and it's exactly analogous to streaming out
XML results. The dups stem from dups in the results, not from
artifacts of the serialization.

>> Be that as it may, I as an implementor and a user would find it
>> helpful if there were a note pointing out this aspect. I confess
>> that I would never in this lifetime have come up with that
>> reading. So, if it would be possible to add a bit of text
>> somewhere that clarified this point, I think that'd be swell.
>
> What would it say?

"Please note that due to serialization freedom, the serialized
results may contain, syntactically, duplicate triples. There is no
way in SPARQL to force the endpoint to return a syntactically
duplicate-free CONSTRUCTed graph."

> As far as I can see, any confusion about whether to expect
> duplicates or not is really a product of the serialization rather
> than of the query language.

I don't see why we can't informatively mention this from the query
language spec. The consequence is that, as an implementor, I don't
have to distinct my results before constructing anything. That seems
perfectly relevant in the query document.

> Even the protocol doesn't mandate any particular serialization of
> an RDF graph. If there existed a serialization that prohibited
> listing the same triple twice (are there?), then I'd imagine that
> it would work fine with the protocol as-is.

So we can serialize to Turtle? Isn't this a pretty big
interoperability hole?

> I'm not saying I object to a bit of (informative) text giving a
> heads-up somewhere... I'm just not sure where it would go and what
> it would say.

I would put it right after the passage I quoted. I would put some
wordsmithed version of what I wrote above.

Cheers,
Bijan.
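
P.S. To make the streaming point concrete, here is a minimal sketch. It is
not taken from any actual implementation; the names `instantiate`,
`stream_construct`, and the sample data are hypothetical, and it assumes
every template variable is bound in every solution. It shows a
non-DISTINCT solution sequence streamed through a CONSTRUCT template,
which writes the same triple twice, while the graph read as a set of
triples has no duplicates.

```python
from typing import Dict, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]
Solution = Dict[str, str]


def instantiate(template: Triple, solution: Solution) -> Triple:
    """Substitute ?variables in the template with their bound values."""
    s, p, o = (solution[t[1:]] if t.startswith("?") else t for t in template)
    return (s, p, o)


def stream_construct(template: Triple,
                     solutions: Iterable[Solution]) -> Iterator[Triple]:
    """Emit one triple per solution, in order, with no de-duplication."""
    for solution in solutions:
        yield instantiate(template, solution)


# Hypothetical solution sequence containing a duplicate solution,
# as a SELECT without DISTINCT might produce.
solutions = [
    {"x": "ex:alice", "name": "Alice"},
    {"x": "ex:alice", "name": "Alice"},
    {"x": "ex:bob", "name": "Bob"},
]
template = ("?x", "foaf:name", "?name")

# What a streaming serializer would write out: three triples, two identical.
streamed = list(stream_construct(template, solutions))

# The graph as the spec defines it: a set of triples, so two members.
graph = set(streamed)

print(len(streamed), len(graph))
```

A consumer that parses the serialized output back into a set recovers the
same graph either way; the difference is only visible in the syntax that
goes over the wire.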
Received on Monday, 15 October 2007 19:03:47 UTC