- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Wed, 24 Mar 2004 11:47:33 +0200
- To: "ext Eric Prud'hommeaux" <eric@w3.org>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
On Mar 23, 2004, at 13:00, ext Eric Prud'hommeaux wrote:

>
> On Mon, Mar 22, 2004 at 11:38:33AM +0200, Patrick Stickler wrote:
>> (a) expressing queries and query results in RDF
>
> Expressing queries in RDF:
>
> In LiteralDB+OWL, (a new name for PS-6 in deference to DanC's naming
> comments), you described a scenario where a server could use
> owl:sameAs, rdfs:subPropertyOf, owl:<class constraints> to manipulate
> the graph of an RDF query *because* it was expressed in RDF. To my
> eye, this treads on dangerous territory -- expressing the query as a
> simple graph results in it being an answer to the question,

It wouldn't be an answer to any question if it isn't syndicated
into a knowledge store against which the question is asked
(i.e. poor organization/management/engineering can result in
lots of "dangers").

If the query isn't within the focus of interest, then don't put
it into the knowledge base that *is* the focus of interest.

> ...
> which would give back the rather unsatisfactory bindings:
>   ?who:   <http:...query5#who>
>   ?first: <http:...query5#first>
>   ?last:  <http:...query5#last>
>
> By what mechanism could this happen? Maybe queries are stored in a
> queue. A triple store compulsively scoops up stuff in the queue and
> writes it down. A query engine pops off the queue things that look
> like queries. It finds this query, asks a lot of resources, ends up
> getting back an answer from the compulsive scooper.

Like I said, poor organization/management/engineering...

If a harvester is gobbling up stuff with little to no discrimination,
and/or if the same queue is being used by a query engine *and*
a harvester gathering knowledge that may fall within the focus
of executed queries, then that's just poor engineering, plain and
simple.

>
> The assertions in that RDF form of the query are not actually
> assertions. But they look like assertions so we'd have to keep them
> insulated from the RDF world all their life.

Sure.
And we can have a nice non-normative section of the spec that
covers all sorts of "don't do this" scenarios.

> (For those nostalgic
> about historical spam, "Poor little graph31825 can't leave his
> bubble. Please send postcards...") While it may be handy to use OWL
> and RDFS inferencing tools, to manipulate RDF-like forms of this
> query, I think the risk of graph "assertions" like this is very high.

I don't. As long as folks realize that graphs containing queries
should not (usually) be mixed with graphs containing general
assertions, all will be well. If some folks are careless, sloppy,
or ignorant and merge such graphs then strangeness could result
(though I'm not convinced that anything bad would actually happen,
just that query results could be of degraded utility).

A query graph is essentially a claim: there exist some target
resources which have certain characteristics, etc. Execution of
the query is figuring out how to make the claims true, and
providing all the evidence.

So if you merge a query graph with your main knowledge base, the
claims are still valid -- you're saying "some" resource exists that
has... -- yet since all you have are bnodes, you don't know exactly
which one it is, and any query executed against those query-based
claims would be *true*, just not very informative.

>
> An alternative would be to reify the query,

Ugh. Please no. And there's no need.

A query expressed in RDF is making certain claims. And those
claims are true no matter what other graphs they get syndicated
into. Whether you *should* syndicate those claims into other
graphs is the real issue here, not the fact that the query is
expressed in RDF.

Indiscriminate syndication will always lead to headaches. Be
careful what you eat!

>
> Expressing results in RDF:
>
> It is not necessary that RDF query results be expressed as statements
> for query federation.
Never said it was necessary, only that it was very useful, because
from start to finish an agent is able to work with RDF graphs
rather than multiple serializations.

> Let's look at a fairly fleshed out federation
> scenario.
>
> federatedAnnotFoaf:
> Client query: the name and email addrs of everyone who has created
> Annotea annotations:
>
>   ?annot dc:created     ?when
>   ?annot dc:creator     ?who
>   ?who   a:Email        ?email
>   ?who   foaf:givenName ?first
>   ?who   foaf:surname   ?last
>
> We send this query to http://www.w3.org/?DAWG and it breaks the query
> up into the pieces that it knows there is an agent to handle.

We've now dipped below the specifics of what the DAWG spec would
define, so everything up to XXX below is now out of scope...

> It sends
>
>   ask(?annot dc:created ?when
>       ?annot dc:creator ?who
>       ?who   a:Email    ?email)
>   collect (?email ?when)
>
> to the Annotea server. The server gives back a list of email addresses
> and the dates those accounts created annotations. (Annotea account
> names are email addresses.) Let's assume the first entry in this list
> is mailto:joe@example.com .
>
> The query federator knows that a:Email and foaf:mbox have ranges of
> the same data type (maybe 'cause one is a subPropertyOf the other) and
> knows (maybe some heuristic based on a service advertisement) that a
> foaf server is more likely to know foaf:mbox.
>
> For each of the email addresses that came back from the Annotea
> server, the unifier composes a new query that it sends to a foaf
> server:
>
>   ?who a:Email        <mailto:joe@example.com>
>   ?who foaf:givenName ?first
>   ?who foaf:surname   ?last
>
> The server gives back all the combinations of first and last name for
> joe@example.com (probably 1, modulo some problems spelling Joe
> Lambda's name).
> The federator of the query drops these results into
> the bindings table, eliminating or duplicating rows when the number of
> results is not 1:
>
>   date      email
>   20040311  mailto:joe@example.com
>   20040309  mailto:bob@example.com
>   ...       ...
>
> becomes
>
>   date      email                   first  last
>   20040311  mailto:joe@example.com  Joe    Lamda
>   20040311  mailto:joe@example.com  Joe    Lambda
>   20040309  mailto:bob@example.com  Bob    Robertson
>   ...       ...

XXX

At which point, the original federator receiving the original query
returns the final set of bindings -- which could just as well be
expressed in RDF using the Result Set Vocabulary, so that the
requesting agent need not parse yet another serialization.

Thus, for DAWG to specify that variable bindings (if such are
requested) be communicated in query results as RDF in no way
prevents or even complicates any part of the scenario you present
above.

>
> This can continue down through as many levels of federator/proxy as
> were involved in delivering the query. Every agent involved, including
> the client's, has the capacity to *extract* a graph given the query
> that it originally saw and a set of bindings. This can provide the RDF
> analog of relational closure.

And at each level, the same could be achieved if those bindings
were expressed in RDF rather than some other encoding.

Sorry, I fail to see any issues with expressing bindings in RDF
in the scenario you are presenting.

Specifically, how does returning the equivalent of

>   date      email                   first  last
>   20040311  mailto:joe@example.com  Joe    Lamda
>   20040311  mailto:joe@example.com  Joe    Lambda
>   20040309  mailto:bob@example.com  Bob    Robertson
>   ...       ...

expressed in RDF using something akin to the Result Set Vocabulary
cause you problems or in any way prevent you from doing
what you have described above?

Patrick

--
Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
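
To make the federation step and the "bindings as RDF" point concrete, here is a minimal Python sketch that is not taken from the thread. It assumes the two partial answers are already in memory, extends each Annotea row with every matching foaf name (duplicating or dropping rows as described), and then writes each row out as triples. The function names and the `ex:solution`, `ex:binding`, `ex:variable` and `ex:value` properties are hypothetical placeholders, not terms checked against the actual Result Set Vocabulary.

```python
from itertools import count

# Bindings from the Annotea step (values copied from the example table).
annotea_rows = [
    {"date": "20040311", "email": "mailto:joe@example.com"},
    {"date": "20040309", "email": "mailto:bob@example.com"},
]

# Bindings from the foaf step, keyed by email (hypothetical in-memory stand-in
# for the per-address queries sent to the foaf server).
foaf_names = {
    "mailto:joe@example.com": [("Joe", "Lamda"), ("Joe", "Lambda")],
    "mailto:bob@example.com": [("Bob", "Robertson")],
}


def federate(rows, names):
    """Extend each row with every (first, last) pair found for its email.

    Rows with no match are dropped; rows with several matches are duplicated.
    """
    merged = []
    for row in rows:
        for first, last in names.get(row["email"], []):
            merged.append({**row, "first": first, "last": last})
    return merged


def bindings_to_triples(rows):
    """Describe each row as triples: one blank node per row and per binding.

    The ex: property names are illustrative placeholders only.
    """
    ids = count(1)
    triples = []
    for row in rows:
        solution = f"_:s{next(ids)}"
        triples.append(("_:results", "ex:solution", solution))
        for variable, value in row.items():
            binding = f"_:b{next(ids)}"
            triples.append((solution, "ex:binding", binding))
            triples.append((binding, "ex:variable", variable))
            triples.append((binding, "ex:value", value))
    return triples


table = federate(annotea_rows, foaf_names)
triples = bindings_to_triples(table)
```

Under these assumptions, `table` holds the same three rows as the second bindings table above, and `triples` carries the same information as a flat list of statements that any level of federator could pass along or unpack again.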
Received on Wednesday, 24 March 2004 05:02:39 UTC