RE: Reqirement 3.5: subgraph results

-------- Original Message --------
> From: Howard Katz <>
> Date: 7 May 2004 14:20
> > From: On Behalf Of Seaborne, Andy
> > Sent: Friday, May 07, 2004 1:58 AM
>      [ snip ... ]
> > > > If a query includes some extension function (after 3.3), say a
> > > > function that takes a radius and the URIs for two geo-spatial
> > > > co-ordinate nodes and returns TRUE if one is in the radius of the
> > > > other. The complete graph used to answer that query is not
> > > > neccesarily known to the query engine - especailly if the
> > > > function is implemented at a lower level. Asking extension
> > > > functions (for example) to give the triples that it used to
> > > > answer the question seems unneccesarily onerous.
> > 
> > If external functions are only filters, saying "accept" or "reject"
> > (or "undefined" probably), based on some bindings of the variables
> > then there does nto appear to be a problem.  External functions
> > should not be able to bind variables.
> Why not? XQuery (among other languages; SQL comes to mind) allows this,
> and 
> it seems to be a useful thing. The environment for example can pass in
> useful, named system constants. Here's a way of introducing the current
> date-time into the query. Here's a way of introducing the name of the
> current user. Since a variable can be bound to a sequence of arbitrary,
> heterogeneous items in XQuery, it can even be used in that environment
> to 
> pass in a system-bound sequence of nodes and atomics of particular
> significance to seed the data-model instance.

Passing in values from the environment can be done in lots of ways, such as
macro substitution into the query.  External functions could be a way to do
it but it is not the only one.  I was thinking of them as calculations in
the query, not for accessing execution environment factors.

There is also an assumption being made that the values are arbitrary.  When
combining query with a fine-grained API, I often use a design pattern when
query to find the graph nodes of interest then use the fine-grain operations
(.listProperties(), .getProperty() etc) to process the nodes found.  The
query returns native programming language objects, including bNodes in the
local case.  

A system where the values the query processes are not the RDF concepts from
the graph can't do that easily.  There is an impedance mismatch if the query
solution provides one style and the rest of the semantic web stack goes for
a different style based on logic.

Where XQuery is the soltuion is in producing XHTM/XML from RDF graphs -
Rob's getting the information out, in this case into an XML pipeline.  If
the acess is graph patterns, not triple patterns, we would be in good shape
for building blocks for various requirements that may be adopted like RDF
subgraphs or encoding the result set in some RDF vocabulary.  Hence my
question to you about variables in your path descriptions.

> The RDF equivalent would
> be to pass in a pre-bound graph of triples.

It might - or it might be passing in values for the query as in query
templates which susbtitute in values before compiling or executing the
query.  This is the use of ? in JDBC/SQL clients.

Passing in triples (premises) gets us into a wider space of rules.  My
preference is that the working group concentrate on a access mechanism, not
a programming language or higher-level semantic web systems.  Application
writers are asking for (expecting) a recommendation now.  Keeping the scope
small means we can do a recommendation is reasonable time (to the timescale
in the charter).  

Opening up to more general processing should be aligned with the rest of the
semantic web stack which isn't there yet.  

So, do a rec now (well, soon); enable a class of access-based RDF-publishing
applications.  This will reveal what needs to be done next.  Second-guessing
the needs of the more complicate scenarios when the effect of simple ones is
unknown is going to lead to long delays and maybe the wrong thing.

> I'm sure smarter minds than me can
> find numerous interesting uses for this type of capability.
> > The current solution of the graph pattern does define a
> > part of the graph so it could be used to define a solution graph.
> > 
> > If the external function returns a solution as a graph it gets
> > unecessarily complicated.
> In what way? Can you explain why you say this, Andy?

Because it goes beyond query and remote access.  The area of proof-traces,
trust provenance is yet to be worked out - there are proposals, a few
experiments, but no clear direction yet as far as I can determine.    Its
very painful not having a provenance solution but a solution to that goes
much further that RDF access.

> > > That's really my main question to the group: are we designing
> > > a language in
> > > which specific and explicit knowledge of graph structure has
> > > to be embedded
> > > in the query?
> > 
> > Going off the subject topic:
> > 
> > On the wider point here, I think there are two issues here: locating
> > parts of the graph of interest and extracting information from the
> > graph aroudn those parts. 
> > 
> > Explicit graph structure is used to locate exactly which nodes are of
> > interest.  Having located the nodes of interest, there is an
> > extraction process that gets data out of the graph - the results
> > (there is than the issue of presenting the results).
> > 
> > If the query can only specify the exact shape of the results as
> > in variables
> > bound in graph patterns then the server can't return all it knows (the
> > motorcycle parts UC), only what the query has scoped.
> > 
> > > It seems to me that we are or should be, but
> > > it's not clear to
> > > me that's been decided one way or another by the group. Has
> > > this question
> > > been answered to other people's satisfaction, or are we still
> > > in a bit of
> > > murky territory here?
> > > 
> > > I don't see a contradiction in allowing external functions to
> > > do things
> > > under the hood and out of sight btw: they provide a needed
> > > escape mechanism
> > > for letting clever implementers get around the limits of having to
> > > be triple-wise explicit in the query. In me 'umble at any rate ...
> > > 
> > > Howard
> > 
> > 	Andy

Received on Monday, 10 May 2004 06:38:00 UTC