- From: Dan Brickley <danbri@w3.org>
- Date: Thu, 23 May 2002 13:02:47 -0400 (EDT)
- To: Graham Klyne <GK@ninebynine.org>
- cc: "Seaborne, Andy" <Andy_Seaborne@hplb.hpl.hp.com>, <www-rdf-interest@w3.org>
On Thu, 23 May 2002, Graham Klyne wrote:

> At 11:53 AM 5/23/02 +0100, Seaborne, Andy wrote:
> > > a higher level than "find this pattern of triples"
> >
> >Agreed.  There are two problems that are closely related by sharing
> >technology but are different use models.  Query-variable bindings is a
> >matter of one layer of the application wanting to ask questions of the RDF
> >graph ("find the resource such that ...") and the extract subgraph that is a
> >matter of RDF->RDF transformation by restricting one graph.  These two seem
> >to get mixed up.
>
> Yes, I agree.
> (My query implementation doesn't return a subgraph at all, just the
> variable bindings.)

You can plug the latter into the original query to get the former. Going
the other way is more work: you basically have to redo the query against
the subgraph to get back to the bindings.

The other nice thing about focusing on the bindings is that it is a very
familiar (SQL-ish, hence 'Squish...') programming idiom. Send some
database a query string, get back a bunch of answers, with rows for
'hits' and columns for fields. I'm not sure how far the analogy can be
pushed, but Libby had her Inkling/Squish stuff implemented over the JDBC
APIs. I'm trying the same with Ruby DBI and as a SOAP service
(soap-encoding serialization of an array of hashtables; a quick hack).

This simple-minded approach to query isn't an ideal/perfect mechanism for
querying RDF data services, but it's a common, widely implemented subset.
Worthy of some writeup and interop testing, I reckon.

> > > I'd like to see more work on storage formats before we nail down a query
> > > language.
> >
> >This is where I disagree: I don't want to see a relationship between the
> >query language and the storage.  I think query should be specified in
> >relation to the RDF graph.  It would be different implementations for
> >different application domains that make decisions about storage and query
> >*implementation*.
> >There is no need to bind storage choices to QL choices.
>
> I agree with what you say here, but maybe I should clarify what I meant.  I
> didn't mean that the query language should be bound to a storage format.
>
> Rather, I was thinking about the efficiency of higher-level query
> constructs; my own implementation is modelled on the idea of matching
> tree-shaped query subgraphs against an arbitrary RDF graph.  My intuition
> here is that this should permit more efficient handling of the
> query.  Working with a Jena-like interface, the first thing I do to
> implement this is break it down into a collection of triples to be matched,
> so on that score I don't seem to have made any useful progress.  (To set
> against that, I was encouraged that the implementation seems to be
> constrained to conduct the graph query in much the same way that I would do
> if programming it by hand.)

That's how I did my first (accidental) query implementation. I first
wrote out longhand the Ruby code for calling the triple-match API, then
started mechanising it. Not particularly efficient.

While query _languages_ don't need to know much about the backend, there
are many things about the backend (and its specific contents) that we'll
want to expose to query engines. Most obvious case is a backend that is
itself capable of handling complex query languages; but also we'll want to
know about indices the database might have, stats of various kinds,
whether the database is 'smart' w.r.t. datatypes, substring searching,
various kinds of inference etc. Lots of possibilities.

But I think they can all be interestingly explored in the context of a
simple 'graph match, return the bindings' query protocol. And that the
query abstraction (basically a graph decorated with variable names) can be
distinguished from its various texty representations; eg SQL-like and
RDF/XML-based.
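
To make the 'graph match, return the bindings' idea concrete, here is a
minimal sketch in Python. It is not Squish, Inkling, or any Jena API: the
function names (is_var, match_pattern, query, subgraph), the "?name"
variable convention, and the ex: example data are all assumptions made up
for illustration. The query is just a graph decorated with variable names
(a list of triple patterns), the answer is a list of hashtables (rows for
hits, columns for variables), and subgraph() shows how plugging the
bindings back into the query recovers the matched subgraph.

```python
def is_var(term):
    """Hypothetical convention: a term starting with '?' is a query variable."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.

    Returns the extended binding (a new dict), or None if they conflict.
    """
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against an earlier binding.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return extended


def query(graph, patterns):
    """Naive nested-loop match: one triple pattern at a time, no indexing."""
    rows = [{}]
    for pattern in patterns:
        next_rows = []
        for row in rows:
            for triple in graph:
                extended = match_pattern(pattern, triple, row)
                if extended is not None:
                    next_rows.append(extended)
        rows = next_rows
    return rows


def subgraph(patterns, rows):
    """Plug each row of bindings back into the query to get the matched triples."""
    return {
        tuple(row.get(term, term) for term in pattern)
        for row in rows
        for pattern in patterns
    }


# Made-up example data; triples are plain (subject, predicate, object) tuples.
GRAPH = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:mbox", "mailto:alice@example.org"),
    ("ex:bob", "ex:name", "Bob"),
]
QUERY = [
    ("?who", "ex:name", "?name"),
    ("?who", "ex:mbox", "?mbox"),
]

if __name__ == "__main__":
    rows = query(GRAPH, QUERY)
    for row in rows:
        print(row)
    print(subgraph(QUERY, rows))
```

Like the longhand approach described above, this simply walks every triple
for every pattern, so it is not particularly efficient; a backend with
indices or its own query engine could answer the same abstract query very
differently while returning the same rows.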

There are many, many other things we might want from an RDF query
language, but given the state of tools, proposals etc., my inclination is
to go with the simplest, easiest-to-agree-on language as a basis for some
cross-implementation testing.

Dan

--
mailto:danbri@w3.org
http://www.w3.org/People/DanBri/
Received on Thursday, 23 May 2002 13:02:51 UTC