RE: Various result forms from Seaborne, Andy on 2004-05-06 (public-rdf-dawg@w3.org from April to June 2004)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 6 May 2004 11:21:27 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <E864E95CB35C1C46B72FEA0626A2E808028A34D8@0-mail-br1.hpl.hp.com>
-------- Original Message --------
> From: public-rdf-dawg-request@w3.org <>
> Date: 5 May 2004 19:31
> 
> 	To try to put requirements 3.2 (Variable Binding
> Results) and 3.5 (Subgraph
> 	Results) in some kind of context ...
> 
> 
> Comments added below from DQL perspective.
> 
> 
> 	There are a number of result forms that people have
> used or suggested:  I
> 	know of:
> 
> 	1/ Variable bindings
> 	   Common for the case of get data out of RDF
> 
> 
> Obtained by answer pattern in form of a vector of variables.
> Could be done by using XML with <qvar /> tags.
> 
> 
> 	2/ Result set in RDF
> 	   As 1/ - encoded in RDF
> 
> 
> Obtained by answer pattern in form of some lexicalization of
> RDF with variables inserted. Could be done with RDF/XML with extra tags
> for variables. 
> 
> 
> 	3/ RDF=>XHTML/XML
> 
> 	4/ Subgraph extraction:
> 	   Actually, two forms:
> 	4a/ The query pattern with variables substituted for each
> 
> 	    solutions and result merged.
> 
> 
> answer pattern = query pattern (default in DQL)
> 
> 
> 	 Reexecuting the query
> 	    gives the same results.
> 
> 
> Well, not necessarily. Even assuming the source graph is
> stable between queries, this would now be a yes/no query
> (without variables) , so cannot strictly produce the same result.

I don't follow.  The query has variables in it that would be bound against
the extracted subgraph.

I have seen request from users for this even in the local case - they are
using a query to define a regionof the graph in the absense of a betetr
region definition lanaguge.  The extract sub graph is passed onto some other
RDF processing.


> 
> 
> 	4b/ The triples from the graph that were matched.
> 	    Would include, for example, the subclass resource
> 	    when asked (?x rdf:type x:superclass).
> 
> 
> Strictly speaking, subclass inheritance reasoning is part of
> RDFS, not RDF. This raises an issue: are we talking here
> about RDF or RDF+RDFS ? We need to get quite clear on this,

We do - I stand corrected.  I was thinking RDF+RDFS.

> seems to me, since if we expect this to be 'open-ended' in
> what counts as suitable backup inference for an RDF query,
> then RDF querying is indistinguishable from OWL querying.
> 
> One of the issues we punted on when designing DQL was whether
> it made sense to return the 'justification' of an answer. In
> general, this can get *extremely* complicated: the only
> general form has to be something like a proof or derivation
> of the answer from the Kbase, which can get arbitrarily long
> and hairy, and for some inference engines (eg tableaux
> reasoners) may not even be well-defined. Anyway, this is
> clearly a research issue. I'd strongly suggest that we back
> off from this as a requirement, if we expect to finish the WG
> work this century.  It seems easy for RDF only because RDF
> itself is so simple, particularly semantically: but it starts
> getting hairy almost immediately. Even allowing typed
> literals makes things complicated, and when you get things
> like subPropertyOf in the mix, the conclusion chains get
> tricky to follow, eg see the examples in the text at
> http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#RDFSRules.
> 
> 
> 	4a == 4b for an RDF graph with no inference processing.
> 
> 
> But what does that mean? The graph is just what it is, the
> inference comes from the semantics you assume when you look
> at it through RDFS or OWL (or whatever) colored glasses.
> 
> 
> 	5/ RDF => RDF
> 	   Templating - a generalisation of 4/ where a template (RDF graph
with
> 	variables in it) is used to create new RDF at the
> server.  At the F2F this
> 	was voted against as a requirement.
> 
> 
> I wish I had been at the F2F, as this sounds like the DQL
> idea of an answer pattern. (Though we didnt require it to be
> RDF, and I see no reason why it should be in particular: it
> should just be a string with some variables in it, or maybe
> some XML with <qvar> tags in it.)

The requirements at the F2F were things we saw a necessary requirements.
That is not to say that templating can't be done.  I'm in favour - but I
would not be against a proposoal just because it didn't have templating.
SeRQL has it, cwm has it so there is other implementation experience.  I
would also say that some of our use cases could make use of it where a query
is just one step in an RDF-driven application.

> 
> Why do we need to specify the answer format? If we allow
> users to specify it (and maybe provide a handy default for
> lazy users or beginners) then we avoid quarrels and add to
> interoperability.

Providing a default is much the same in practice as specifying the format.
 
> 
> 
> 	Getting information out of RDF directly is 1,2,3.  Part
> of a larger
> 	processing system (distributed) is 4 & 5.
> 
> 
> 	1/ is about the problem of getting information (node
> and arc labels) out of
> 	RDF; 2/ is A way of recording 1/.  I have used 2/ to
> give access to query
> 	languages from (other language) toolkits that have no
> query capability.  I
> 	execute the query remotely (all it takes is to pass a
> string from
> 	application to server - the client toolkit does not
> need to understand the
> 	QL
> 
> 
> Yes, quite.
> 
> 
> 	) and use the result set format [1] as the transfer
> syntax.  That's
> 
> 	convenient because the client toolkit can parse and
> work with the returned
> 	RDF.  Having the client requirements simple can, for
> small devices, take
> 	many forms - this is one of them.
> 
> 
> 	3/ is important for the display of information directly
> from RDF sources.
> 	Using XQuery/XSLT/etc looks to be useful (practical,
> utilizes programmer
> 	skills, builds on existing work, what people expect, ...).
> 
> 
> Right. Seems to me that we could reasonably require that
> answer patterns are legal XML , and must use our specified
> markup for our variables. Then users can easily test for
> unbound variables in an answer, for example, but this allows
> answers to be formatted as RDF/XML or as something close to
> plain text strings, or anything in between. It would also
> allow answers to be things like XHTML as well as RSS.
> 
> 
> 
> 	4 & 5 are about getting some RDF out of another
> (larger, remote) RDF
> 	dataset.  The results would be further manipulated
> before going to the user,
> 	and that includes passing in on to other machines where
> the final
> 	destination of user/application is not the one making
> the query; instead the
> 	extracted subgraph is sent on to other places. This is RDF=>RDF, for
> 	example, passing around RSS entries.  The general
> requirement is that part
> 	of a large, remote target graph is extracted and
> deliver for further, local
> 	processing.
> 
> 	For 4/, examples include the "tell me about" queries
> and the use of the
> 	pattern of query to define the subgraph.  In fact, 4a
> gives an alternative
> 	way of approaching the example above if the client
> toolkit does have the QL.
> 	Re-execution is much, much cheaper, essentially as
> there are so few negative
> 	search branches to follow.
> 
> 
> But why would one ever need to re-execute, if you already
> have the answer to the query? (what am I missing here?)

It isn't a terrible important usage I agree.  But I have seen examples of
people using query to extract a region of the graph for passing else where
so the subgraph might used twice - once for bindings and to pass else where.
Hardly important.

	Andy

> 
> Pat
> 
> 
> 
> 
> 	        Andy
> 
> 	[1]
> http://www.w3.org/2003/03/rdfqr-tests/recording-query-results.html

This can't be generated by naïve templating - it has the size of the result
set and the names of the variables which are not in each query result.
Received on Thursday, 6 May 2004 06:21:46 UTC