Re: RDF query (RDQL) work for Redland

On Monday, August 25, 2003, at 12:40  PM, Dave Beckett wrote:

>
> I've been playing with providing support for the Squish/RDQL style
> querying in Redland and now the W3C's lists are back, I'll report
> what I've got so far.

hello Dave

nice work! :-)

>
> I took the RDQL definition from the Jena RDQL[1] and used that
> grammar plus the examples from Jena and @semantics' tutorials to
> write lex & yacc versions in C for parsing it. The current state is
> that it passes most of the RDQL test suite in Jena bar a few oddities
> that need to be worked out (case sensitivity of tokens, difficulties
> in identifying pattern literals).

I think the latest Jena2 RDQL grammar, updated by Andy, is the one to
look at [1] (it adds regular expression support, optional commas, and
xml:lang and rdf:dataType support on literals) - but as a start the old
Jena 1.x grammar should be enough to test most currently running
software. Andy has been working on a more up-to-date RDQL spec which
should be out soon (Andy: anything to say about that?). It should be
basically what you can see in the Jena2 CVS plus some other fixes (I
think!). Hopefully that document will become the common RDQL reference
which implementors can look at and extend if necessary.

in relation to the RDF query tests work, Andy and I converted some of
the Jena2 RDQL tests to n-triples [2] (which should move to a specific
sourceforge repository sooner or later) - the queries/ dir contains
examples in the native RDQL syntax which can be used for your parser
regression tests (misc examples with constraints, regular expressions,
xml:lang and rdf:dataType)

> My current issues are on the TODO page:
>   http://www.redland.opensource.ac.uk/rasqal/TODO.html
> and include the problems I've found so far and mentioned above.
>
> It's this list of problems/incompatibilities that are probably
> of most interest to the www-rdf-rules group.
>
>  * base QNames are now allowed

do you mean in the <prefix:localname> form?

>
>  * Add the default prefixes (rdf, rdfs, owl, ... ?)

yes good one - most software already does that in the application -
it would be handy to have them defaulted directly by the parsing
software
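
a rough sketch of what defaulting the prefixes in the parser could look
like (Python only for brevity; the table contents, the function name
expand_qname and the override rule are my assumptions, not something
Rasqal or Jena does today):

```python
# Hypothetical default prefix table; which prefixes belong here is still open.
DEFAULT_PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def expand_qname(qname, declared=None):
    """Expand prefix:localname to a full URI.

    Prefixes declared in the query (e.g. in a USING clause) are assumed
    to override the built-in defaults.
    """
    prefixes = dict(DEFAULT_PREFIXES)
    prefixes.update(declared or {})
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError("undefined prefix in %r" % qname)
    return prefixes[prefix] + local
```

with this, expand_qname("rdf:type") works without any USING clause,
while a query-level declaration still wins over the default.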

>
>  * Extensions: multiple LIMIT and OFFSET

what do you mean? can you elaborate more on this?

>
>  * Optionals?

yes - useful all the time I think :)

some time ago I posted to this list some ideas about a possible syntax
for optionals [3] - what are your ideas about it? do you lean more
towards optional triple-patterns or optional/may-bind variables?

I also noticed that Damian Steer has recently been investigating the
possibility of having optionals in his extended-SquishQL syntax [4]

>
>  * Are keywords case sensitive? Jena RDQL has an example with SELECT
>    ?select WHERE ... but @semantics' RDQL tutorial has an example with
>    USING dcq for ... not FOR

in our implementation we have always treated them as case *insensitive*
because the RDQL definition does not seem to specify that (or at least
Jena seems case-insensitive - somebody from Jena correct me if I am
wrong) - while porting our pure perl RDQL::Parser [5] to C/XS code
[6][7] we actually used the '-i' lex flag to generate a case
insensitive lexer (which also implements some extensions to RDQL such
as contexts/4th components, a LIKE operator and some primitive form of
OR on URIs and literals in triple-patterns)
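
to illustrate the idea (this is not our lex source, just a minimal
Python sketch; the keyword set and the token names are assumptions made
up for the example): keywords are matched case-insensitively, while
anything prefixed with '?' stays a variable, so "?select" does not
clash with SELECT.

```python
import re

# Assumed subset of RDQL keywords, stored in upper case.
KEYWORDS = {"SELECT", "WHERE", "AND", "USING", "FOR"}

# A variable ('?' followed by a name) or a bare name.
TOKEN = re.compile(r"\?[A-Za-z_][A-Za-z0-9_]*|[A-Za-z_][A-Za-z0-9_]*")


def tokenize(query):
    """Return (kind, text) pairs; keywords are normalised to upper case."""
    tokens = []
    for match in TOKEN.finditer(query):
        text = match.group(0)
        if text.startswith("?"):
            tokens.append(("VAR", text[1:]))
        elif text.upper() in KEYWORDS:
            tokens.append(("KEYWORD", text.upper()))
        else:
            tokens.append(("NAME", text))
    return tokens
```

so both "USING dcq FOR" and "USING dcq for" give the same token stream.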

>
>  * Literal languages, datatypes - new "lit"@lang and
>    "lit"@lang^^datatype

see latest Jena2 CVS for that

>
>  * Pattern literals seem difficult to recognise without context

we also had some difficulties while designing our lexer, especially with
the hybrid use of n-triples-like syntax in the new RDQL lexer [8] to
flag xml:lang and rdf:dataType patterns (e.g. not using '<' and '>' to
delimit the URI of the datatype) - and with the regular expression
pattern syntax, which allows regular expressions to be delimited by
arbitrary characters as well as by the simple slash.
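
just to make the literal case concrete, here is a small Python sketch
that recognises the "lit"@lang^^datatype shape with an undelimited
datatype. The regular expression is my own guess at the shape, not the
Jena2 grammar, and parse_literal is a made-up name:

```python
import re

# Assumed shape: "text", optional @lang, optional ^^datatype (no '<' '>').
LITERAL = re.compile(
    r'"((?:[^"\\]|\\.)*)"'                 # quoted text, backslash escapes allowed
    r'(?:@([A-Za-z]+(?:-[A-Za-z0-9]+)*))?'  # optional language tag
    r'(?:\^\^(\S+))?'                       # optional datatype, runs to whitespace
)


def parse_literal(token):
    """Return (text, lang, datatype), or None if token is not a literal."""
    match = LITERAL.fullmatch(token)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)
```

note that the datatype can only be terminated by whitespace here, which
is exactly the kind of context dependence that makes these tokens
awkward for a lexer.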

>
>  * Qnames and URIs - in particular what is <a:b>
>    if the prefix a isn't defined till later

I think Sesame tried the N3 (other) way - it would be handy to have
them defined beforehand so that they can be substituted by default
while parsing.

>
>  * base URIs? Lots of <relativeURI> seen. @base?

good one

I have some more:

what about using an alternative character to '?' to identify
variables? we found that the character '?' conflicts with (is reserved
by) the SQL standard and is treated specially by JDBC/ODBC interfaces -
the '$' (dollar) sign might be a good alternative :-)
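
a tiny Python sketch of how a lexer could accept both markers, and how
a query could be rewritten to the '$' form before it goes near an SQL
layer (function names are made up, and for simplicity it does not skip
'?' characters inside quoted literals):

```python
import re

# Either '?' or '$' introduces a variable name.
VARIABLE = re.compile(r"[?$]([A-Za-z_][A-Za-z0-9_]*)")


def variable_names(query):
    """List variable names in order of appearance, whichever marker is used."""
    return VARIABLE.findall(query)


def to_dollar_form(query):
    """Rewrite ?name to $name so '?' stays free for SQL-style placeholders."""
    return re.sub(r"\?([A-Za-z_][A-Za-z0-9_]*)", r"$\1", query)
```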

another related issue is context/provenance - which could be expressed
as a 4th component of the triple-patterns or using braces like N3 - any
ideas?

what about a pure/hybrid n-triples++ (with bArcs and 4th component I  
mean) syntax for triple-patterns? :-)

cheers

Alberto

[1] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/jena/jena2/doc/RDQL/rdql_grammar.html
[2] http://swordfish.rdfweb.org/rdfquery/tests/tests/rdql-tests-2003-04-10/
[3] http://lists.w3.org/Archives/Public/www-rdf-rules/2003Apr/0030.html
[4] http://rdfweb.org/people/damian/esquish/
[5] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/rdfstore/rdfstore/lib/RDQL/Parser.pm
[6] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/rdfstore/rdfstore/rdql.l
[7] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/rdfstore/rdfstore/rdql.y
[8] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/jena/jena2/src/com/hp/hpl/jena/rdql/parser/rdql.jjt

Received on Tuesday, 26 August 2003 19:38:34 UTC