- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Sun, 1 Aug 2004 11:45:28 -0400
- To: Jim Hendler <hendler@cs.umd.edu>
- Cc: Jos De_Roo <jos.deroo@agfa.com>, "Seaborne, Andy" <andy.seaborne@hp.com>, public-rdf-dawg@w3.org
- Message-ID: <20040801154528.GC13232@w3.org>
I think I'm looking at the DAWG-QL definition in terms of what the user types when trying to solve a problem. You (Jim, not the reader in general) are looking at it in terms of server implementation. What QL definition will work for both? In the cases I've seen, I think it would be optimal if servers implemented a subset of the language. Details inline:

On Sat, Jul 31, 2004 at 10:24:00PM -0400, Jim Hendler wrote:
>
> At 0:55 -0400 7/31/04, Eric Prud'hommeaux wrote:
>
> [snip]
>
> >>
> >> In short, (i) has difficulties with distribution and (ii) has
> >> problems with centralization -- is either of these actually
> >> implemented/implementable? Am I misunderstanding the objective??
> >
> >(i) has an almost trivial solution when you allow the user to
> >select what part of the query goes where. This pretty accurately
> >reflects how people do research today, finding pages with one
> >sort of information and manually (mentally) merging that with
> >data with another sort of information. For instance, I believe
> >that the CDDB/IMDB example is a perfectly reasonable model of
> >the degree of expertise we can rely on from today's moderately
> >knowledgeable user.
> >
>
> But if the user had to know this, and to send different queries to
> different places then, even if I were to interpret the objective such
> that that was a solution, I don't see where this would be
> advantageous to sending a set of separate queries and then unifying
> the results -- in which case wouldn't I be better off having this
> under my control instead of making the query language more complex
> for no gain?

I see the gain for the user. There could be a gain in network efficiency if the server implementation also allowed unification. For instance, W3C has some RDF data (TR page, ACLs, Annotea, search results, at-a-glance) that could be merged to answer some useful queries.
The client could federate and unify locally or ask the W3C DAWG server to do it, which would save network burden and push the unification to a server where it could be optimized. Folks wanting to implement a simple server can answer queries that specify targets with "no, do it yourself" (cf. conformance levels [1]).

I'm not sure where the sweet spot is here. I'm quite sure this is a useful application from the client perspective, pretty sure it would save network traffic, and have a hunch that it's worth the extra definition and implementation.

> >(ii) is how most of us do our banal little queries every day.
> >Rarely do I see people making the same RDF query over multiple
> >repositories. Instead they identify a couple of sources, merge
> >them, and do a query across the resulting graph. Most data that
> >I've seen seems to be organized such that extra repositories
> >complement the data with related data rather than supplementing
> >with additional data of the same form.
> >
>
> this might be what people do when things are small, it certainly
> won't scale -- but more importantly, it seems to me that forcing the
> implementors of a query client to have to implement this is a problem
> -- supposing all I want to implement is a web site that queries
> various triple stores and displays some sort of page based on the
> merged query results -- the 4.5 objective would let me do this well.

In the sense that you could invent a new document or service endpoint that would imply a query across these resources. The client won't have a defined way to identify a set of pages (say, Bob and Jill's FOAF pages and a user database) and deduce the name of the service that queries a merge of at least those documents. Making that association would require data published and interpreted in another (higher level) protocol. A higher level protocol could be a useful way to solve this problem, but it does seem to fly in the face of how most people use RDF today.
I'm not convinced that all forms of our QL have to be scalable. I haven't seen that in other QLs and think it alienates a lot of potential users.

> The 4.5.1 would both be harder for me to use, and also require that I
> know how to manage some triple store for the merged graph -- again, I
> may be missing what you are after, but I sure see the objective as it
> was written in 4.5 being a whole lot more useful than the one in 4.5.1
>
> >I think that (ii) represents a big part of what we want people
> >to be able to do with the semantic web. (iii) (Aggregate Query)
> >can be easily accomplished with SQL today without grounding your
> >terms in a global namespace that allows documents to merge. I
> >think that the cool thing *is* merging graphs. Yes, that's
> >expensive, but I don't think that the new problems that we want
> >to address with the semantic web get solved any other way.
>
> But didn't objective 4.5 as previously written accomplish most of the
> needed capability, without requiring people who want to use the
> semantic web to have to become database administrators

4.5 doesn't meet any of the cases I've used to motivate union query or federated query. Executing the same query over multiple sources does not solve most of the queries I see people executing today. Some FOAF queries are easily solved that way (pictures of people with a first name "Bob"), but mostly, I see people merging graphs and doing queries that would not be matched in the graphs individually.

I'm speaking from what I've seen. You've seen different use cases. I would like the group to consider what cases they see most often and which style of query (aggregate, union, federated) would work for them.
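
To make the aggregate/union distinction concrete, here is a minimal sketch in Python. It is not taken from any DAWG draft: the two graphs, the `ex:` terms, and the `match` helper are all hypothetical, and graphs are just sets of (subject, predicate, object) tuples. It only illustrates a query that matches nothing in either source alone but does match once the graphs are merged.

```python
# Hypothetical data: two tiny graphs as sets of (subject, predicate, object).
foaf_page = {
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:bob", "foaf:mbox", "mailto:bob@example.org"),
}
user_db = {
    ("mailto:bob@example.org", "ex:account", "bob42"),
}


def match(graph, patterns, binding=None):
    """Yield variable bindings that satisfy every triple pattern in `graph`.

    Pattern terms starting with '?' are variables; anything else must
    equal the corresponding term of the triple. (Illustrative helper only.)
    """
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        new = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(graph, rest, new)


# "Which account belongs to the person named Bob?"
query = [
    ("?person", "foaf:name", "Bob"),
    ("?person", "foaf:mbox", "?mbox"),
    ("?mbox", "ex:account", "?account"),
]

# Aggregate style: run the same query on each source, concatenate results.
aggregate = [b for g in (foaf_page, user_db) for b in match(g, query)]

# Union style: merge the graphs first, then query the merged graph.
union = list(match(foaf_page | user_db, query))

print("aggregate:", aggregate)
print("union:", union)
```

With this data the aggregate run finds no answers, because no single source holds all three triples, while the union run binds ?account once. Real stores would also need to worry about blank nodes when merging, which this sketch ignores.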
[1] http://www.w3.org/mid/D24D16A6707B0A4B9EF084299CE99B39053F8D0C@mcl-its-exs02.mail.saic.com

--
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741 (does not work in Asia)

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Sunday, 1 August 2004 11:45:39 UTC