- From: Graham Klyne <GK@ninebynine.org>
- Date: Wed, 13 Oct 2004 16:41:47 +0100
- To: public-rdf-dawg-comments@w3.org
At 15:08 13/10/04 +0100, Seaborne, Andy wrote:
>The RDF Data Access Working Group is happy to announce the first working
>draft of the query language part of its work:
>
>   SPARQL RDF query language
>   http://www.w3.org/TR/rdf-sparql-query/
>
>The Working Group is soliciting feedback on this early draft ...

On first glance, it's looking good to me. Here are some random thoughts:

...

1. Is the SELECT clause really useful? My implementations return all variable bindings from the query, and I simply ignore those I don't want.

...

2. In section 2.2: "Not every binding needs to exist in every row of the table." I think this is an important feature whose presence should be very clear. Currently, it seems a bit buried.

...

3. I think the terminology around "Definition: Triple Pattern Matching" is a bit muddled. Is a "binding" a substitution for a *single* variable, or a tuple of variables? (I think you mean the former.)

I think it's important to be very clear about this, and have clear terms corresponding to:
(a) a single name->value binding (a "cell")
(b) a tuple of name->value bindings, with no name repeated (a "row")
(c) a set of tuples of name->value bindings (with no tuple repeated under permutation?) (a "table")

These are distinctions I've found to be important to keep clear in my implementation work.

...

4. In section 2.2: "If the same variable name is used more than once in a pattern then, within each solution to the query, the variable has the same value." This, too, I think is important to keep clearly stated.

...

5. I note that variables are allowed in predicate position. If this doesn't present any problems, I'm all in favour of this, but I think the design decision could be highlighted more clearly.

...

6. Can the resulting variable bindings contain repeated binding-tuples; e.g. in response to a query like:

   SELECT ?a ?c WHERE ( ?a ?b ?c )

against the graph:

   :s1 :p1 :o1 .
   :s1 :p2 :o1 .
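To make the question in point 6 concrete, here is a minimal Python sketch of that query against that graph. It is only an illustration under stated assumptions: the helper names `match` and `select` are hypothetical, the matcher is a plain list scan rather than SPARQL, and it does not claim to reflect the draft's intended semantics.

```python
# Hypothetical mini-matcher for illustration; not SPARQL, not a real library API.
GRAPH = [
    (":s1", ":p1", ":o1"),
    (":s1", ":p2", ":o1"),
]


def match(pattern, graph):
    """Return one row (dict of name -> value) per triple that matches the pattern."""
    rows = []
    for triple in graph:
        row = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must take the same value within a row.
                if row.setdefault(term, value) != value:
                    break
            elif term != value:
                # A constant term must equal the triple's value exactly.
                break
        else:
            rows.append(row)
    return rows


def select(names, rows, distinct=False):
    """Project each row onto the named variables, optionally dropping repeats."""
    projected = [tuple(row[name] for name in names) for row in rows]
    if distinct:
        # dict.fromkeys keeps first-seen order while removing duplicates.
        projected = list(dict.fromkeys(projected))
    return projected


rows = match(("?a", "?b", "?c"), GRAPH)
plain = select(["?a", "?c"], rows)
unique = select(["?a", "?c"], rows, distinct=True)
print(plain)   # two identical binding-tuples: ?b differed but was projected away
print(unique)  # a single binding-tuple
```

In this sketch the two rows differ only in ?b, so projecting onto ?a and ?c leaves two identical binding-tuples unless duplicates are removed explicitly.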
Later, you mention that a query result is a set, so I guess that means no duplicates, but I haven't yet seen this stated more explicitly.

Later, you introduce SELECT DISTINCT, so I guess that means a simple query result can have duplicate binding-tuples. So it's not a set.

...

7. Section 4

I note you've chosen to allow optional elements of graph patterns, but not alternatives. In one of my implementations I provided alternative blocks, where the last alternative could be empty, hence also providing optional patterns. Alternatives are permitted to bind the same variable, thus providing ways to match different (graph-syntactical) expressions of the same information. I have sometimes found this to be useful, but it does somewhat mess up the clean semantics of the approach you have adopted.

Despite the semantic messiness, I do feel that having some capability to select one possible match over another, when dealing with possibly messy real-world data, could be useful enough to justify the consequent complication of query optimization when such a feature is used.

...

8. Section 8

The current position seems about right to me. Complicating the basic query mechanism to handle "accessing direct subclass relationship" seems undesirable and unnecessary: presenting a graph with (notional) explicit types (etc.) where implied by subclass relationships seems to me to be sufficient.

...

Section 9.

Constraining the source of a pattern seems to be only a (small) part of the provenance story. Is it not also desirable to query the source?

Oops! I now see that <source> can be a variable. OK, that's neat, and works cleanly at the natural unit of provenance, viz the statement.

Is it fair to assume that support for SOURCE may be optional? Ah yes, if unsupported, bind source variables to NULL.

If a statement occurs in more than one source with a source variable pattern, does that result in multiple variable-binding-tuples? (I think it should.) e.g.
the pattern:

   SOURCE ?ppd ( ?whom foaf:age ?age )

might return

   :source1 :Jenny foaf:age "10"
   :source2 :Jenny foaf:age "10"
   :source3 :Jenny foaf:age "11"
   etc.

...

Section 11

I think this might better be titled "result forms".

Is it intended that every SPARQL implementation must support every result form? I think that could add unnecessary implementation complexity. I think there should be one form supported by all implementations, and SELECT seems a reasonable choice. I don't really see a compelling case for requiring the others to be universally available. I think the ASK result form is also reasonable.

Thought: if a query pattern has no variables, is there a distinction for the SELECT * result when the query is matched or not matched? I think there should be:

   {}    query not matched.
   {<>}  query matched, empty variable binding tuple.

...

Section 11.3

I'm uneasy about the DESCRIBE feature. It seems to be going rather beyond the basic idea of RDF graph query, and doesn't seem to have well or clearly defined semantics.

I think the effort here might be better applied to query language extensions that permit some kind of recursively-defined pattern, so that various kinds of sub-graph neighbourhoods can be described according to an application's requirements. A simple use-case would be to describe the entire content of an rdf:collection from just its head element.

...

Section 12. Testing values.

Is there a way to combine tests with non-strict evaluation, so that something like:

   AND isBound ?x AND ?x < 20

can be reliably processed?

...

Section 12, "Are tests syntax for RDF predicates or separate concepts?"

This makes me uneasy. I feel that there may be tests that are not easily or naturally presented as RDF syntax. Probably with enough contortion it can be managed, but is it helpful? How does a test like "isBound ?x" play here?
Part of my viewpoint here is that there should be, as far as possible, a clear separation between structure within RDF literal values and structure that is expressed within the RDF graph. (For this reason, I'm not enthusiastic about using XML schema structured datatypes as RDF literals, when the structure over the component values could be quite naturally expressed using RDF statements. This leads me to think that the query language tests here should really be trying to capture those things that aren't comfortably captured as RDF properties.)

...

That's it, for now.

#g

------------
Graham Klyne
For email: http://www.ninebynine.org/#Contact
Received on Wednesday, 13 October 2004 16:03:01 UTC