Re: SquishQL/RDQL Comments

> Do you pass the test cases for such a style of language?
> Libby, Alberto and myself have been getting test cases
> together.  See www-rdf-rules and the regular IRC chat.

Do you possibly have a more specific pointer? It's been agreed that
RDF query test cases should be housed centrally at
/2003/03/rdfqr-tests, no? That page still says that "[t]here are
currently no tests stored here".

For the squishql.py parser that I wrote yesterday, I've only used the
test files that Libby has provided for SquishQL. The files parsed
correctly, but I haven't yet set up a full test harness that performs
Web queries--i.e. I've only tested the parser, and not its hookup to
my query engine so far.

> For specific RDQL tests, the whole testing set up is in
> the Jena2 download under testing/RDQL/

Thanks. Using those tests, and the grammar (.jjt) file that you point
to below, I've managed to implement an RDQL Python parser, rdql.py.
It's 95% done (it doesn't yet return constraints to the query hookup,
but it does parse them), goes through all of your test files in
testing/RDQL/ without error, and is over twice as large as squishql.py.
I'll release it publicly if/when I'm happy with the hookup, though
if you like I'd be more than happy to send you what I've got so far.

> The grammar for RDQL is [...]rdql.jjt[...]

Thanks. I think that it would be useful to developers such as myself to
have a language independent grammar (i.e. BNF or ABNF) at some point,
but what you've pointed me to was good enough to be able to write a
parser, so that's good enough for now! There were some odd stylistic
decisions used (e.g. the assigning of names to operators, confusing
variable names like "SUCHTHAT" for the "and" keyword, etc.), but I'm
sure that in a language agnostic version these could be cleared
up--and the grammar simplified somewhat.

> Note this has already covered several of your points
> below in Jena2.

Yes, and thanks again for pointing me to it! My apologies if I'm
oversubscribing to the "if it hasn't got a URI, it doesn't exist"
principle a bit, here :-)

> Soon there will be a written description of RDQL - I am
> in the process of writing a note about it.

Excellent! That's just what's required, IMO.

> > * <> productions are not further explained (see above).
>
> See URL for the master grammar file.  Sorry - that web page
> was a summary and has slipped behind development of RDQL
> in Jena.  I'll fix that.

Thank you. I noted the difference between URI and quotedUri (and
URL--that's got a "fixme" by it) in the Jena grammar.

> Qnames: already fixed, allowing both old and
> new forms.

So I see from the grammar. Thanks.

> Note N3 has a restricted definition of qnames over
> XML (no '.' allowed in prefix or local part).

It also doesn't allow hyphen minuses (-).

> Bnodes: writing "_:a" isn't going to work - it is not the same
> bnode as the one you want to match in the graph

Of course. I didn't mean to imply that matching of labels should
occur, just matching of bNodes. Hmm. Perhaps it's best if I express
this as a test case. Consider the following RDF/NTriples file,
imagining absolutized URIs:-

   <#John> <#likes> <#Chocolate> .
   <#John> <#ownsSome> <#Chocolate> .

Do you agree that "WHERE (?pred <#John> _:objt)" should return ?pred =
<#likes> and ?pred = <#ownsSome> as the bindings? And also that the
same results should occur if both of the objects were _:Chocolate, or
even "Chocolate", but not if they were unequal? In other words, I
treat bNodes as variables that must be bound in the query, but whose
bindings are not returned.
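
To make that last sentence concrete, here is a minimal sketch in Python.
It is not taken from squishql.py or rdql.py: the function names, the
plain-string terms, and the tuple layout are all assumptions made for
illustration. Patterns use the predicate-first order of the WHERE clause
above, and data triples are (subject, predicate, object). Note that each
bNode binding here is scoped to a single solution, so the sketch covers
only the "bound but not returned" reading.

```python
def is_var(term):
    """Query variables are written ?name (illustrative convention)."""
    return term.startswith("?")

def is_bnode(term):
    """Query bNodes are written _:name (illustrative convention)."""
    return term.startswith("_:")

def unify(query_term, data_term, env):
    """Bind variables and bNodes alike; constants must match exactly."""
    if is_var(query_term) or is_bnode(query_term):
        if query_term in env:
            return env[query_term] == data_term
        env[query_term] = data_term
        return True
    return query_term == data_term

def match(patterns, triples):
    """Yield one dict per solution, with the bNode bindings left out."""
    def solve(remaining, env):
        if not remaining:
            # bNodes had to be bound to get here, but are not reported.
            yield {k: v for k, v in env.items() if is_var(k)}
            return
        pred, subj, obj = remaining[0]
        for s, p, o in triples:
            new_env = dict(env)  # discarded if this triple does not match
            pairs = ((pred, p), (subj, s), (obj, o))
            if all(unify(q, d, new_env) for q, d in pairs):
                yield from solve(remaining[1:], new_env)
    return solve(list(patterns), {})

DATA = [
    ("<#John>", "<#likes>", "<#Chocolate>"),
    ("<#John>", "<#ownsSome>", "<#Chocolate>"),
]

# The query from the message: both predicates come back, _:objt does not.
for binding in match([("?pred", "<#John>", "_:objt")], DATA):
    print(binding)
```

When the same bNode label appears in two patterns of one query, both
occurrences must bind to the same node: in this sketch, asking for
(<#likes> ?who _:x) together with (<#ownsSome> ?who _:x) only succeeds
when the two objects are equal.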

> Not sure what you mean here - the objects returned by
> RDQL/Jena are real API Java objects, not labels.  [...]
> What do you return? the labels for graph nodes/arcs?

I return instances of an "Article" class that are typed as bNodes, and
which are identified by their *local* name. Note that the blank nodes
are not scoped globally, only to their local
graph/formula/context/model/store.

> RDQL in Jena does also return the matching triples.

Ahh, good. I thought I was doing something odd :-)

> Joseki http://www.joseki.org/ returns the minimal complete
> matching subgraph for an RDQL query.

Interesting. I've been looking into subgraph matching algorithms and
so forth a bit...

> [Commas became] [o]ptional in RDQL once I fixed the grammar.

Got it. Thanks.

> More constraints (e.g. date testing) is one of the common feature
> requests I get.

Then you might want to consider using URIs for constraint predicates,
like CWM does, and implementing an easily extensible separate API for
constraints. Much of the difficulty in making an RDQL parser lies in
the fact that the constraints are not treated generically. I think
that a "<variable> <QName|URI> <arglist> (',' <arglist>)*" type of
syntax for constraints would be much more preferable than the current
setup. You could still keep the current keywords that you have as
aliases for URIs, but I think that the grammar needs some major
cleanup in that area.

To reiterate: a generic grammar and a separate engine for constraints
would be very helpful for anyone implementing or extending RDQL. And
it'll become more imperative as more constraints are added. Adding
constraints should not mean adding anything to the grammar; using
qnames for any future additions would allow the grammar to remain
stable.
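
As a rough sketch of what I mean by a separate constraint engine, the
following Python keeps constraints in a registry keyed by URI, with the
existing keywords kept as aliases. Everything here is hypothetical: the
example.org namespace, the constraint names, and the check() signature
are made up for illustration and are not taken from RDQL, Jena, or CWM.

```python
import re
from datetime import date

# Hypothetical namespace for constraint predicates.
NS = "http://example.org/constraints#"

REGISTRY = {}

def constraint(uri):
    """Decorator that registers a constraint function under a URI."""
    def register(func):
        REGISTRY[uri] = func
        return func
    return register

@constraint(NS + "lessThan")
def less_than(value, bound):
    return value < bound

@constraint(NS + "matches")
def matches(value, pattern):
    return re.search(pattern, value) is not None

# Existing keywords survive as aliases for URIs.
ALIASES = {"<": NS + "lessThan", "=~": NS + "matches"}

def check(binding, variable, name, *args):
    """Evaluate one '<variable> <QName|URI> <arglist>' style constraint."""
    uri = ALIASES.get(name, name)
    return REGISTRY[uri](binding[variable], *args)

# A later addition such as date testing touches only the registry;
# nothing about the grammar or check() has to change.
@constraint(NS + "before")
def before(value, limit):
    return date.fromisoformat(value) < date.fromisoformat(limit)

print(check({"?age": 30}, "?age", "<", 40))
print(check({"?d": "2003-06-02"}, "?d", NS + "before", "2003-12-31"))
```

The parser then only has to recognise a variable, a name, and an argument
list, and hands the rest to the registry.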

> Actually, you can do constraint testing in cwm using the
> builtin predicates.

Yes; my own query engine does this, in fact, and has done for over a
year now--reusing some of the constraint tests that I wrote for CWM.
Indeed, I implemented SquishQL to see how it compares to the
constraint tests that both CWM and my own engine (and various others
such as Euler) use.

I'd like to take the best features from both approaches and smush them
together in something new.

Cheers,

--
Sean B. Palmer, <http://purl.org/net/sbp/>
"phenomicity by the bucketful" - http://miscoranda.com/

Received on Monday, 2 June 2003 17:08:40 UTC