RE: Various result forms from Howard Katz on 2004-05-06 (public-rdf-dawg@w3.org from April to June 2004)

From: Howard Katz <howardk@fatdog.com>
Date: Thu, 6 May 2004 10:05:04 -0700
To: "Seaborne, Andy" <andy.seaborne@hp.com>, "Pat Hayes" <phayes@ihmc.us>
Cc: "RDF Data Access Working Group" <public-rdf-dawg@w3.org>
Message-ID: <IKEOLCDFPBBPPAHGNKKOCEDCELAA.howardk@fatdog.com>
I want to add one more format to the list of possible result forms. I've
been demonstrating this in my XQuery-derived snippets to date, though not to
its full capability. (For example, I haven't shown how to return triples,
which isn't particularly difficult to do.)

     5/ XQuery-style freeform results, in which the result format and its
components are created ad hoc in the query itself. One can return, in any
combination and sequence, so-called atomics belonging to whatever subset of
XML Schema Part 2 datatypes we eventually select, as well as RDF nodes,
properties, literals, and triples. Suitable composition using strings,
integers and the like can provide the equivalent of self-describing,
variable-binding-like annotations on the result set.
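
To make 5/ a little more concrete, here is a rough sketch of what such a
freeform result might look like. It is written in Python rather than XQuery
purely for illustration, and every name in it (Node, Literal, Triple,
freeform_result) is hypothetical, not part of any proposal. The point is only
that a single result sequence can interleave strings and integers with RDF
nodes, literals and triples, with the strings acting as annotations.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Node:
    uri: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: Node
    predicate: Node
    obj: Union[Node, Literal]


NAME = Node("http://xmlns.com/foaf/0.1/name")


def freeform_result(solutions):
    """Return one flat sequence mixing atomics (str, int) with RDF items."""
    out = ["count", len(solutions)]      # atomic annotations
    for row in solutions:
        for var, value in row.items():
            out.append(var)              # string naming the binding
            out.append(value)            # the RDF node or literal itself
        # A triple composed in the "query" from the bound values.
        out.append(Triple(row["person"], NAME, row["name"]))
    return out


solutions = [{"person": Node("http://example.org/alice"), "name": Literal("Alice")}]
result = freeform_result(solutions)
```

Here the "person" and "name" strings play the role of variable-binding-like
annotations, and the shape of the result is decided entirely by the code that
builds it.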

Howard

> -----Original Message-----
> From: public-rdf-dawg-request@w3.org
> [mailto:public-rdf-dawg-request@w3.org]On Behalf Of Seaborne, Andy
> Sent: Thursday, May 06, 2004 3:21 AM
> To: Pat Hayes
> Cc: RDF Data Access Working Group
> Subject: RE: Various result forms
>
>
>
>
>
> -------- Original Message --------
> > From: public-rdf-dawg-request@w3.org <>
> > Date: 5 May 2004 19:31
> >
> > 	To try to put requirements 3.2 (Variable Binding Results) and
> > 	3.5 (Subgraph Results) in some kind of context ...
> >
> >
> > Comments added below from DQL perspective.
> >
> >
> > 	There are a number of result forms that people have used or
> > 	suggested.  I know of:
> >
> > 	1/ Variable bindings
> > 	   Common for the case of getting data out of RDF
> >
> >
> > Obtained by answer pattern in form of a vector of variables.
> > Could be done by using XML with <qvar /> tags.
> >
> >
> > 	2/ Result set in RDF
> > 	   As 1/ - encoded in RDF
> >
> >
> > Obtained by answer pattern in form of some lexicalization of
> > RDF with variables inserted. Could be done with RDF/XML with extra tags
> > for variables.
> >
> >
> > 	3/ RDF=>XHTML/XML
> >
> > 	4/ Subgraph extraction:
> > 	   Actually, two forms:
> > 	4a/ The query pattern with variables substituted for each
> > 	    solution and the results merged.
> >
> >
> > answer pattern = query pattern (default in DQL)
> >
> >
> > 	    Re-executing the query gives the same results.
> >
> >
> > Well, not necessarily. Even assuming the source graph is
> > stable between queries, this would now be a yes/no query
> > (without variables), so cannot strictly produce the same result.
>
> I don't follow.  The query has variables in it that would be bound against
> the extracted subgraph.
>
> I have seen requests from users for this even in the local case - they are
> using a query to define a region of the graph in the absence of a better
> region definition language.  The extracted subgraph is passed on to some
> other RDF processing.
>
>
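
The 4a form, and Andy's point that the query's variables can be bound again
against the extracted subgraph, can be sketched as follows. This is only an
illustration in Python under simplifying assumptions: triples are plain
tuples of strings, variables are strings starting with "?", there are no
blank nodes, and no inference is applied. The helper names (match,
extract_subgraph, as_set) are made up for the sketch.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, graph):
    """Return every binding of the triple patterns against the graph."""
    solutions = [{}]
    for pat in pattern:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                ok = True
                for p, t in zip(pat, triple):
                    if is_var(p):
                        # Bind the variable, or check an earlier binding.
                        if candidate.setdefault(p, t) != t:
                            ok = False
                            break
                    elif p != t:
                        ok = False
                        break
                if ok:
                    extended.append(candidate)
        solutions = extended
    return solutions


def extract_subgraph(pattern, graph):
    """4a: substitute each solution into the pattern and merge the triples."""
    sub = set()
    for binding in match(pattern, graph):
        for pat in pattern:
            sub.add(tuple(binding.get(term, term) for term in pat))
    return sub


def as_set(solutions):
    """Order-independent view of a list of bindings, for comparison."""
    return {frozenset(b.items()) for b in solutions}


graph = {
    ("ex:a", "rdf:type", "ex:Person"),
    ("ex:a", "ex:name", "Alice"),
    ("ex:b", "rdf:type", "ex:Person"),
    ("ex:c", "ex:name", "Carol"),
}
pattern = [("?x", "rdf:type", "ex:Person"), ("?x", "ex:name", "?n")]

subgraph = extract_subgraph(pattern, graph)
same = as_set(match(pattern, subgraph)) == as_set(match(pattern, graph))
```

In this small case the subgraph holds only the two triples about ex:a, and
running the same pattern over it yields the same bindings as running it over
the full graph, with far fewer non-matching triples to consider.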
> >
> >
> > 	4b/ The triples from the graph that were matched.
> > 	    Would include, for example, the subclass resource
> > 	    when asked (?x rdf:type x:superclass).
> >
> >
> > Strictly speaking, subclass inheritance reasoning is part of
> > RDFS, not RDF. This raises an issue: are we talking here
> > about RDF or RDF+RDFS ? We need to get quite clear on this,
>
> We do - I stand corrected.  I was thinking RDF+RDFS.
>
> > seems to me, since if we expect this to be 'open-ended' in
> > what counts as suitable backup inference for an RDF query,
> > then RDF querying is indistinguishable from OWL querying.
> >
> > One of the issues we punted on when designing DQL was whether
> > it made sense to return the 'justification' of an answer. In
> > general, this can get *extremely* complicated: the only
> > general form has to be something like a proof or derivation
> > of the answer from the Kbase, which can get arbitrarily long
> > and hairy, and for some inference engines (eg tableaux
> > reasoners) may not even be well-defined. Anyway, this is
> > clearly a research issue. I'd strongly suggest that we back
> > off from this as a requirement, if we expect to finish the WG
> > work this century.  It seems easy for RDF only because RDF
> > itself is so simple, particularly semantically: but it starts
> > getting hairy almost immediately. Even allowing typed
> > literals makes things complicated, and when you get things
> > like subPropertyOf in the mix, the conclusion chains get
> > tricky to follow, eg see the examples in the text at
> > http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#RDFSRules.
> >
> >
> > 	4a == 4b for an RDF graph with no inference processing.
> >
> >
> > But what does that mean? The graph is just what it is, the
> > inference comes from the semantics you assume when you look
> > at it through RDFS or OWL (or whatever) colored glasses.
> >
> >
> > 	5/ RDF => RDF
> > 	   Templating - a generalisation of 4/ where a template (RDF graph
> > 	   with variables in it) is used to create new RDF at the server.
> > 	   At the F2F this was voted against as a requirement.
> >
> >
> > I wish I had been at the F2F, as this sounds like the DQL
> > idea of an answer pattern. (Though we didn't require it to be
> > RDF, and I see no reason why it should be in particular: it
> > should just be a string with some variables in it, or maybe
> > some XML with <qvar> tags in it.)
>
> The requirements at the F2F were things we saw as necessary requirements.
> That is not to say that templating can't be done.  I'm in favour - but I
> would not be against a proposal just because it didn't have templating.
> SeRQL has it, cwm has it so there is other implementation experience.  I
> would also say that some of our use cases could make use of it where a
> query is just one step in an RDF-driven application.
>
> >
> > Why do we need to specify the answer format? If we allow
> > users to specify it (and maybe provide a handy default for
> > lazy users or beginners) then we avoid quarrels and add to
> > interoperability.
>
> Providing a default is much the same in practice as specifying the format.
>
> >
> >
> > 	Getting information out of RDF directly is 1,2,3.  Part of a larger
> > 	processing system (distributed) is 4 & 5.
> >
> >
> > 	1/ is about the problem of getting information (node and arc labels)
> > 	out of RDF; 2/ is a way of recording 1/.  I have used 2/ to give
> > 	access to query languages from (other language) toolkits that have no
> > 	query capability.  I execute the query remotely (all it takes is to
> > 	pass a string from application to server - the client toolkit does
> > 	not need to understand the QL
> >
> >
> > Yes, quite.
> >
> >
> > 	) and use the result set format [1] as the transfer syntax.  That's
> > 	convenient because the client toolkit can parse and work with the
> > 	returned RDF.  Having the client requirements simple can, for small
> > 	devices, take many forms - this is one of them.
> >
> >
> > 	3/ is important for the display of information directly from RDF
> > 	sources.  Using XQuery/XSLT/etc looks to be useful (practical,
> > 	utilizes programmer skills, builds on existing work, what people
> > 	expect, ...).
> >
> >
> > Right. Seems to me that we could reasonably require that
> > answer patterns are legal XML, and must use our specified
> > markup for our variables. Then users can easily test for
> > unbound variables in an answer, for example, but this allows
> > answers to be formatted as RDF/XML or as something close to
> > plain text strings, or anything in between. It would also
> > allow answers to be things like XHTML as well as RSS.
> >
> >
> >
> > 	4 & 5 are about getting some RDF out of another (larger, remote) RDF
> > 	dataset.  The results would be further manipulated before going to
> > 	the user, and that includes passing it on to other machines where the
> > 	final destination of user/application is not the one making the
> > 	query; instead the extracted subgraph is sent on to other places.
> > 	This is RDF=>RDF, for example, passing around RSS entries.  The
> > 	general requirement is that part of a large, remote target graph is
> > 	extracted and delivered for further, local processing.
> >
> > 	For 4/, examples include the "tell me about" queries and the use of
> > 	the pattern of query to define the subgraph.  In fact, 4a gives an
> > 	alternative way of approaching the example above if the client
> > 	toolkit does have the QL.  Re-execution is much, much cheaper,
> > 	essentially as there are so few negative search branches to follow.
> >
> >
> > But why would one ever need to re-execute, if you already
> > have the answer to the query? (what am I missing here?)
>
> It isn't a terribly important usage, I agree.  But I have seen examples of
> people using a query to extract a region of the graph for passing elsewhere,
> so the subgraph might be used twice - once for bindings and once to pass
> elsewhere.  Hardly important.
>
> 	Andy
>
> >
> > Pat
> >
> >
> >
> >
> > 	        Andy
> >
> > 	[1] http://www.w3.org/2003/03/rdfqr-tests/recording-query-results.html
>
> This can't be generated by naïve templating - it has the size of the
> result set and the names of the variables which are not in each query
> result.
>
Received on Thursday, 6 May 2004 13:04:13 UTC