- From: Sean B. Palmer <sean@mysterylights.com>
- Date: Mon, 2 Jun 2003 22:08:35 +0100
- To: "Andy Seaborne" <Andy_Seaborne@hplb.hpl.hp.com>
- Cc: <www-rdf-rules@w3.org>, "Libby Miller" <Libby.Miller@bristol.ac.uk>
> Do you pass the test cases for such a style of language?
> Libby, Alberto and myself have been getting test cases
> together. See www-rdf-rules and the regular IRC chat.

Do you possibly have a more specific pointer? It's been agreed that RDF
query test cases should be housed centrally at /2003/03/rdfqr-tests, no?
That page still says that "[t]here are currently no tests stored here".

For the squishql.py parser that I wrote yesterday, I've only used the
test files that Libby has provided for SquishQL. The files parsed
correctly, but I haven't yet set up a full test harness that performs
Web queries--i.e. so far I've only tested the parser, and not its hookup
to my query engine.

> For specific RDQL tests, the whole testing set up is in
> the Jena2 download under testing/RDQL/

Thanks. Using those tests, and the grammar (.jjt) file that you point to
below, I've managed to implement an RDQL parser in Python, rdql.py. It's
95% done (it doesn't yet return constraints to the query hookup, but it
does parse them), goes through all of your test files in testing/RDQL/
without error, and is over twice as large as squishql.py. I'll release
it publicly if/when I'm happy with the hookup, though if you like I'd be
more than happy to send you what I've got so far.

> The grammar for RDQL is [...]rdql.jjt[...]

Thanks. I think that it would be useful to developers such as myself to
have a language-independent grammar (i.e. BNF or ABNF) at some point,
but what you've pointed me to was enough to write a parser from, so
that's good enough for now! There were some odd stylistic decisions
(e.g. the assigning of names to operators, confusing names like
"SUCHTHAT" for the "and" keyword, etc.), but I'm sure that in a
language-agnostic version these could be cleared up--and the grammar
simplified somewhat.

> Note this has already covered several of your points
> below in Jena2.

Yes, and thanks again for pointing me to it!

My apologies if I'm oversubscribing to the "if it hasn't got a URI, it
doesn't exist" principle a bit, here :-)

> Soon there will be a written description of RDQL - I am
> in the process of writing a note about it.

Excellent! That's just what's required, IMO.

> > * <> productions are not further explained (see above).
>
> See URL for the master grammar file. Sorry - that web page
> was a summary and has slipped behind development of RDQL
> in Jena. I'll fix that.

Thank you. I noted the difference between URI and quotedUri (and
URL--that's got a "fixme" by it) in the Jena grammar.

> Qnames: already fixed, allowing both old and
> new forms.

So I see from the grammar. Thanks.

> Note N3 has a restricted definition of qnames over
> XML (no '.' allowed in prefix or local part).

It also doesn't allow hyphen-minuses (-).

> Bnodes: writing "_:a" isn't going to work - it is not the same
> bnode as the one you want to match in the graph

Of course. I didn't mean to imply that matching of labels should occur,
just matching of bNodes. Hmm. Perhaps it's best if I express this as a
test case. Consider the following RDF/NTriples file, imagining
absolutized URIs:-

   <#John> <#likes> <#Chocolate> .
   <#John> <#ownsSome> <#Chocolate> .

Do you agree that "WHERE (?pred <#John> _:objt)" should return
?pred = <#likes> and ?pred = <#ownsSome> as the bindings? And also that
the same results should occur if both of the objects were _:Chocolate,
or even "Chocolate", but not if they were unequal? In other words, I
treat bNodes as variables that must be bound in the query, but whose
bindings are not returned.

> Not sure what you mean here - the objects returned by
> RDQL/Jena are real API Java objects, not labels. [...]
> What do you return? the labels for graph nodes/arcs?

I return instances of an "Article" class that are typed as bNodes, and
which are identified by their *local* name. Note that the blank nodes
are not scoped globally, only to their local
graph/formula/context/model/store.

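The bNode test case above can be sketched in a few lines of Python. This
is only an illustration under stated assumptions, not the code of
squishql.py or rdql.py: the name match_pattern, the tuple representation
of triples, and the use of plain strings for terms are all hypothetical.
It matches each triple independently, so it does not try to settle how
the "not if they were unequal" condition should be enforced across
results; it only shows bNodes being bound like variables and then left
out of the returned bindings.

```python
# Hypothetical sketch: query bNodes behave like variables that must bind,
# but only the ?variables are reported back to the caller.

def match_pattern(pattern, graph):
    """Match one SquishQL-style (pred, subj, objt) pattern.

    `graph` is an iterable of (subj, pred, objt) tuples of strings.
    Terms starting with "?" are variables; terms starting with "_:" are
    query bNodes. Both may bind to any term in the graph.
    """
    pred, subj, objt = pattern  # SquishQL writes the predicate first
    results = []
    for triple in graph:
        binding = {}
        for want, have in zip((subj, pred, objt), triple):
            if want.startswith("?") or want.startswith("_:"):
                # A name that appears twice must bind to the same term.
                if binding.setdefault(want, have) != have:
                    break
            elif want != have:
                break  # constant term does not match
        else:
            # Drop the bNode bindings; they were required, not returned.
            visible = {k: v for k, v in binding.items() if k.startswith("?")}
            if visible not in results:
                results.append(visible)
    return results


GRAPH = [
    ("<#John>", "<#likes>", "<#Chocolate>"),
    ("<#John>", "<#ownsSome>", "<#Chocolate>"),
]

for row in match_pattern(("?pred", "<#John>", "_:objt"), GRAPH):
    print(row)
```

Replacing both objects in GRAPH with "_:Chocolate" or with the literal
"Chocolate" gives the same two rows, since the sketch compares terms
only as opaque strings.
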
> RDQL in Jena does also return the matching triples.

Ahh, good. I thought I was doing something odd :-)

> Joseki http://www.joseki.org/ returns the minimal complete
> matching subgraph for an RDQL query.

Interesting. I've been looking into subgraph matching algorithms and so
forth a bit...

> [Commas became] [o]ptional in RDQL once I fixed the grammar.

Got it. Thanks.

> More constraints (e.g. date testing) is one of the common feature
> requests I get.

Then you might want to consider using URIs for constraint predicates,
like CWM does, and implementing an easily extensible separate API for
constraints. Much of the difficulty in making an RDQL parser lies in the
fact that the constraints are not treated generically. I think that a
"<variable> <QName|URI> <arglist> (',' <arglist>)*" type of syntax for
constraints would be much preferable to the current setup. You could
still keep the current keywords that you have as aliases for URIs, but I
think that the grammar needs some major cleanup in that area.

To reiterate: a generic grammar and a separate engine for constraints
would be very helpful for anyone implementing or extending RDQL. And
it'll become more imperative as more constraints are added. Adding
constraints should not mean adding anything to the grammar; using qnames
for any future additions would allow the grammar to remain stable.

> Actually, you can do constraint testing in cwm using the
> builtin predicates.

Yes; my own query engine does this, in fact, and has done for over a
year now--reusing some of the constraint tests that I wrote for CWM.
Indeed, I implemented SquishQL to see how it compares to the constraint
tests that both CWM and my own engine (and various others such as Euler)
use. I'd like to take the best features from both approaches and smush
them together in something new.

Cheers,

--
Sean B. Palmer, <http://purl.org/net/sbp/>
"phenomicity by the bucketful" - http://miscoranda.com/
Received on Monday, 2 June 2003 17:08:40 UTC