- From: Bijan Parsia <bparsia@cs.man.ac.uk>
- Date: Sat, 19 Aug 2006 17:50:01 +0100
- To: Bijan Parsia <bparsia@cs.man.ac.uk>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
I have just found and read (or reread):

http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0008.html

where Fred has anticipated much of this debate, made arguments in favor of both interpretations, and suggested having distinct keywords. I can certainly live with that, as I've stated before, but I wanted to point something out:

"""Note that adding the part number to the SELECT list will not necessarily save the query, since the combination of part number, quantity and price is still not a guaranteed unique key for line items. The user is relying on distinct blank nodes to represent distinct line items.

Of course, from the point of view of "RDF Semantics" that would be a redundant graph, for example, one that asserts "There exists a line item whose part is XYZ, quantity is 1 and price is 10.99" and asserts again "There exists a line item whose part is XYZ, quantity is 1 and price is 10.99". Thus one could say that this is a misuse of RDF.

This may be technically true, but I wonder if insisting on this point will really serve the users. If you read the RDF Primer, the application design above makes sense. You have a line item; you don't want to bother creating an IRI for each line item; so you make a blank node for each line item. "RDF Semantics", on the other hand, is a dense document with talk about hypothetical universes that are interpretations of a graph. This is not the kind of material that will make its way into seminars, courses, how-to books, etc."""

I believe the RDF Primer did a disservice by encouraging this misunderstanding. I think we should encourage people to create IRIs in these circumstances. Even if we allow for these distinctions in answer sets, we cannot enforce that for RDF graphs in general. Given the prevalence of "use RDF for representing data" and the existence of "CONSTRUCT", it would be reasonable for a user to think that CONSTRUCT and SELECT will bear certain relations to each other.
But if a tool, somewhere, decides to lean the graph (which is semantically safe from an RDF point of view), it will violate the user's modeling expectations. They are making unfounded assumptions, of course, but that's cold comfort. This is why I think what Fred called constructive semantics is potentially seriously misleading, even if it is the generally better choice (for RDF and for SPARQL; note that Fred's point is that people *modeled* things a certain way with certain expectations; also note that this isn't the lean graph case, so it points to a third family of meanings for DISTINCT).

My general position is that we are doing an RDF query language, so we need to at least make the semantics of RDF available and transparent to the user (I acknowledge that Pat has an argument that if you scope the bnodes to the entire SPARQL expression, his DISTINCT is consistent with the semantics of RDF; I still think it's less *transparent*, but I shall address that in another post). So I support including the existential reading.

I'm becoming more and more convinced that making that the only reading would be, in the long run, beneficial to users, along the lines of the strictness of XML parsers with regard to well-formedness. However, I'm still undecided on that point, so I am still amenable to having multiple readings, esp. wrt DISTINCT.

Cheers,
Bijan.
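
P.S. To make the line-item case concrete, here is a minimal sketch in plain Python (no RDF library). Everything in it is an assumption for illustration: the blank node labels, the property names, and the two helper functions are hypothetical, and the merge step only handles the special case in Fred's example (blank-node subjects with identical ground descriptions). It is not a general leaning algorithm and not how any particular tool works.

```python
# Hypothetical toy graph: two line items modelled as distinct blank nodes
# with identical part, quantity and price (all names are illustrative).
TRIPLES = {
    ("_:item1", "part", "XYZ"),
    ("_:item1", "quantity", 1),
    ("_:item1", "price", 10.99),
    ("_:item2", "part", "XYZ"),
    ("_:item2", "quantity", 1),
    ("_:item2", "price", 10.99),
}


def select_rows(triples):
    """Return one (part, quantity, price) row per subject, duplicates kept."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {})[p] = o
    return sorted(
        (d["part"], d["quantity"], d["price"]) for d in by_subject.values()
    )


def merge_redundant_bnodes(triples):
    """Collapse blank nodes whose property/value sets are identical.

    Covers only the simple redundancy in the example above; a real
    leaning procedure has to consider arbitrary blank-node mappings.
    """
    descriptions = {}
    for s, p, o in triples:
        descriptions.setdefault(s, set()).add((p, o))
    representative = {}
    result = set()
    for s in sorted(descriptions):
        desc = frozenset(descriptions[s])
        if s.startswith("_:"):
            # Reuse the first blank node seen with this exact description.
            s = representative.setdefault(desc, s)
        for p, o in desc:
            result.add((s, p, o))
    return result


rows = select_rows(TRIPLES)
print(len(rows))       # 2 rows: the user sees two line items
print(len(set(rows)))  # 1 row once duplicate rows are dropped

merged = merge_redundant_bnodes(TRIPLES)
print(len(select_rows(merged)))  # 1 row: the second line item is gone
```

The merged graph says the same thing as the original under RDF Semantics, yet a user who counted line items by counting rows now gets a different answer. That is the modeling expectation I am worried about.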
Received on Saturday, 19 August 2006 16:50:12 UTC