Re: Scope from Seaborne, Andy on 2001-11-08 (www-rdf-rules@w3.org from November 2001)

From: Seaborne, Andy <Andy_Seaborne@hplb.hpl.hp.com>
Date: Thu, 8 Nov 2001 14:59:05 -0000
To: "'www-rdf-rules@w3.org'" <www-rdf-rules@w3.org>
Message-ID: <5E13A1874524D411A876006008CD059F24E57D@0-mail-1.hpl.hp.com>

RDQL [1] is an implementation of SquishQL for Jena [2] - the syntaxes of RDQL
and Libby's Inkling are close and converging.

The goals of RDQL are to make retrieving data from RDF models easier than
traversing an API, and in particular to provide a more declarative way of
extracting data from models.  Queries can be made from Java against any Jena
model, so the query system is independent of the storage implementation and
of the RDF syntax.  RDQL can be mixed with Jena API calls: a query returns
the underlying Jena objects that satisfy it, so the Resources, Properties or
Literals retrieved can be used for model update or other API calls.

Example: get the contents of a bag (one of many ways):

SELECT ?x, ?y
WHERE (<http://example.com/bag>, ?x, ?y)
AND ! ( ?x eq <rsyn:type> && ?y eq <rsyn:Bag>)
USING rsyn FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 


Query model:

The conceptual model is of a graph pattern (restricted to an explicit graph
with variables in it) and a set of filters, which are boolean expressions
that are applied to the data.  A result is a set of values from the RDF data
graph (including bNodes) that satisfy the graph pattern and all of the
filters.
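
A rough sketch of this model, in Python rather than Java and not the RDQL
engine itself: triples are plain tuples, variables are strings starting with
"?", and the names is_var, match_pattern and query are hypothetical, chosen
only for this illustration.  It runs the same shape of query as the bag
example: one triple pattern plus one filter.

```python
# Illustrative only: a tiny triple-pattern matcher, not the RDQL engine.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def is_var(term):
    """Variables are strings starting with '?' (an assumption of this sketch)."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; None on failure."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if p in result and result[p] != t:
                return None  # variable already bound to a different value
            result[p] = t
        elif p != t:
            return None  # constant term does not match
    return result


def query(graph, patterns, filters=()):
    """Yield each variable binding satisfying all patterns and all filters."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, b)) is not None
        ]
    for b in bindings:
        if all(f(b) for f in filters):
            yield b


BAG = "http://example.com/bag"
graph = [
    (BAG, RDF + "type", RDF + "Bag"),
    (BAG, RDF + "_1", "apple"),
    (BAG, RDF + "_2", "pear"),
]

# One graph pattern, and one filter that rejects the rdf:type statement.
patterns = [(BAG, "?x", "?y")]
filters = [lambda b: not (b["?x"] == RDF + "type" and b["?y"] == RDF + "Bag")]

members = sorted(b["?y"] for b in query(graph, patterns, filters))
```

Here the filters are ordinary boolean functions over a binding, applied only
after the graph pattern has matched.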


Application model:

The programming model is JDBC-like: an iterator is returned for the result
set.  The implementation is multithreaded and controls the amount of working
memory it uses, so queries against large models work - it has been used on
800,000 statement models.  This is about the only "optimization" currently
done - there are a number of standard dataflow and database style
optimizations that could be done but they are not RDF-related.
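
One common way to get an iterator with bounded working memory is a worker
thread feeding a fixed-size queue.  The sketch below is in Python, not Java,
and is only an assumption about the general technique, not the RDQL code;
iterate_results, fake_matches and max_buffered are hypothetical names.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the result stream


def iterate_results(produce, max_buffered=100):
    """Run `produce()` in a worker thread and yield its results one by one.

    The bounded queue caps how many results wait in memory: the worker
    blocks in put() until the consumer has taken something out.
    """
    buffer = queue.Queue(maxsize=max_buffered)

    def worker():
        try:
            for item in produce():
                buffer.put(item)
        finally:
            buffer.put(_DONE)

    # Daemon thread: if the consumer stops early, the blocked worker
    # does not keep the process alive.
    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buffer.get()
        if item is _DONE:
            return
        yield item


def fake_matches():
    """Stand-in for a query engine producing many result rows."""
    for i in range(1000):
        yield {"?x": i}


first_three = []
for row in iterate_results(fake_matches, max_buffered=10):
    first_three.append(row["?x"])
    if len(first_three) == 3:
        break
```

The consumer sees a plain iterator, while at most max_buffered results are
held at any time regardless of how large the model is.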

Type information is evaluated dynamically in filters because there is no
type information in RDF (yet).  
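
As an illustration of what evaluating types dynamically can mean, a filter
comparison might decide at evaluation time whether its operands are numbers.
This is a Python sketch under that assumption, not RDQL's actual rules;
as_number and less_than are hypothetical names.

```python
def as_number(value):
    """Return `value` as a float if it parses as a number, else None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def less_than(a, b):
    """Compare numerically when both operands parse as numbers,
    otherwise fall back to comparing them as strings."""
    na, nb = as_number(a), as_number(b)
    if na is not None and nb is not None:
        return na < nb
    return str(a) < str(b)
```

With this rule the literals "9" and "10" compare as numbers, while "apple"
and "pear" compare as strings.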

Queries can be constructed from within Java, rather than through the parser:
this isn't well supported yet although people have managed to do it.  The
internal execution engine is a direct implementation of the query model.


Plans:

Some of the things that are being considered are:

1 - extension mechanism to allow special functions to bind variables
    This would allow RDFS operators.
2 - extension mechanism to allow filter functions
    This would be the way to have string regular expression tests on
    variables.
3 - better support for queries constructed programmatically
4 - grammar tidying
5 - faster execution


Above all, what I would like to see is a common core query language so that
tool sets can choose to provide the same basic query and application
programmers don't have to learn a new language for each tool set.  This
would also be good so queries can be shipped over SOAP to different RDF
stores.

	Andy

[1] http://www.hpl.hp.com/semweb/rdql.html
[2] http://www.hpl.hp.com/semweb/

Received on Thursday, 8 November 2001 10:01:39 UTC