- From: Dave Beckett <dave.beckett@bristol.ac.uk>
- Date: Tue, 9 Sep 2003 15:40:48 +0100
- To: Alberto Reggiori <alberto@asemantics.com>
- Cc: "'www-rdf-rules@w3.org'" <www-rdf-rules@w3.org>, Andy Seaborne <Andy_Seaborne@hplb.hpl.hp.com>
On Wed, 27 Aug 2003 01:35:14 +0200 Alberto Reggiori <alberto@asemantics.com> wrote:

> On Monday, August 25, 2003, at 12:40 PM, Dave Beckett wrote:
>
> > I've been playing with providing support for the Squish/RDQL style
> > querying in Redland and now the W3C's lists are back, I'll report
> > what I've got so far.
>
> hello Dave
>
> nice work! :-)

Thanks

> > I took the RDQL definition from the Jena RDQL[1] and used that
> > grammar plus the examples from Jena and @semantics' tutorials to
> > write lex & yacc versions in C for parsing it. The current state is
> > that it passes most of the RDQL test suite in Jena bar a few oddities
> > that need to be worked out (case sensitivity of tokens, difficulties
> > in identifying pattern literals).
>
> I think the latest RDQL Jena2 grammar updated by Andy is the one to
> look at [1] (with regular expression support, optional commas and
> xml:lang and rdf:dataType support on literals) - but as a start old
> Jena 1.x grammar should be enough to test most of current running
> software ...

Yes, the Jena1 version was what I started with. I've added the
optional commas but not the other parts.

> ... - Andy has been working on an more up-to-date RDQL spec which
> should be out soon (Andy: anything to say about that?). It should be
> basically what you can see on the Jena2 CVS plus some other fixes (I
> think!). Hopefully that document will be the common RDQL reference
> which implementors can look at and extend it if necessary.

Now that Jena2 has shipped we can bug Andy again about that :)
The new RDQL area is at http://jena.sourceforge.net/RDQL/index.html
but it's not clear what has changed.

> in relation to the RDF query tests work Andy and I converted some of
> the Jena2 RDQL tests to n-triples [2] (which should move to a specific
> sourceforge repository sooner or later) - the queries/ dir contains
> the native RDQL syntax examples which can be used for your parser
> regression tests (misc examples with constraints, regular expressions,
> xml:lang and rdf:dataType)
>
> > My current issues are on the TODO page:
> > http://www.redland.opensource.ac.uk/rasqal/TODO.html
> > and include the problems I've found so far and mentioned above.
> >
> > It's this list of problems/incompatibilities that are probably
> > of most interest to the www-rdf-rules group.
> >
> > * base QNames are now allowed
>
> do you mean in the <prefix:localname> form?

I mean there are ambiguous forms:
  <mailto:dave>
which is a legal URI and a legal qname.
  <ex:a>
  ex:a
are both of these qname (prefix:localname) forms allowed?

> > * Add the default prefixes (rdf, rdfs, owl, ... ?)
>
> yes good one - most software already does that into the application -
> it would be handy to have them defaulted directly by the parsing
> software

This list, if it exists, has to be very well known and short.

> * Extensions: multiple LIMIT and OFFSET
>
> what do you mean? can you elaborate more on this?

I've seen that 3Store handles LIMIT (limiting number of results)
and returning results from a certain OFFSET. I assume they match
some (My)SQL terminology.

> > * Optionals?
>
> yes - useful all the time I think :)
>
> sometime ago I posted to this list some ideas about possible syntax for
> optionals [3] - what are your ideas about it? do you feel more about
> like optionals triple-patterns or optional/may-bind variables?

No opinion.

> I also noticed that Damian Steer has been recently investigating the
> possibility to have optionals for his extended-SquishQL syntax [4]
>
> > * Are keywords case sensitive? Jena RDQL has an example with SELECT
> > ?select WHERE ...
> > but @semantics' RDQL tutorial has an example with
> > USING dcq for ... not FOR
>
> in our implementation we always considered them as case *insensitive*
> due that the RDQL seems not specifying that (or at least Jena seems
> case-insensitive - somebody from Jena correct me if I am wrong) - while
> porting our pure perl RDQL::Parser [5] to C/XS code [6][7] we actually
> used the '-i' lex flag to generate a case insensitive lexer (which
> actually implements some extensions to RDQL such as contexts/4th
> components, LIKE operator and some primitive form of OR on URIs and
> literals in triple-patterns)

It was Andy's "SELECT ?select ..." example that broke things, since I
did have case-insensitive keywords. That tokenises as
  <selectKeyword> " " "?" <selectKeyword>
which fails to match the grammar, since what follows "?" must be a
legal variable name, not a <selectKeyword> token.

I would propose not allowing variable / identifier names to be RDQL
keywords, a restriction similar to the one in most general-purpose
programming languages.

> > * Literal languages, datatypes - new "lit"@lang and
> > "lit"@lang^^datatype
>
> see latest Jena2 CVS for that
>
> > * Pattern literals seem difficult to recognise without context
>
> we also had some difficulties while designing our lexer especially with
> the hybrid usage of n-triples like syntax in the new RDQL lexer [8] to
> flag xml:lang and rdf:dataType patterns (i.e. not using '<' and '>' to
> group the URI of the datatype for example) - ...

You mean
  "foo"^ex:a
rather than
  "foo"^<http://example.org/a>

There are three options after ^ - either @, ^ or it must be a qname.

> ... and the regular expression
> pattern syntax which allows to delimit regular-expressions with
> arbitrary characters together with simple slash.

It seems to me that the lexer has to accept a wide variety of things
when it is expecting a pattern literal as the next token and cannot
recognise a pattern literal without that context.
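
To make the "SELECT ?select" clash above concrete, here is a minimal
sketch in Python rather than lex. It assumes a lexer that emits "?" as
its own token and matches keywords case-insensitively. The keyword set
is only an illustrative subset, and the names naive_tokens and
check_variable_name are hypothetical, not taken from any existing
parser.

```python
import re

# Illustrative subset of RDQL keywords (an assumption, not the full list).
KEYWORDS = {"select", "where", "and", "using", "for"}

# "?" is a token of its own; anything word-like is a keyword or a name.
TOKEN = re.compile(r"\s*(?:(\?)|([A-Za-z_][A-Za-z0-9_]*))")


def naive_tokens(text):
    """Tokenise with case-insensitive keywords and a standalone '?'."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenise at offset {pos}")
        if match.group(1):
            tokens.append(("QMARK", "?"))
        elif match.group(2).lower() in KEYWORDS:
            # Case-insensitive matching turns "select" into a keyword too.
            tokens.append(("KEYWORD", match.group(2).lower()))
        else:
            tokens.append(("NAME", match.group(2)))
        pos = match.end()
    return tokens


def check_variable_name(name):
    """Proposed rule: a variable may not be spelled like a keyword."""
    if name.lower() in KEYWORDS:
        raise ValueError(f"variable name {name!r} clashes with a keyword")
    return name


print(naive_tokens("SELECT ?select"))
```

The token after "?" comes out as a keyword rather than a name, which is
the sequence a grammar expecting a variable name rejects. With the
proposed restriction the query is refused outright with a clear message
instead of failing deep inside the grammar.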
It would be good to reduce these problems somewhat.

> > * Qnames and URIs - in particular what is <a:b>
> > if the prefix a isn't defined till later
>
> I think Sesame tried the N3 (other) way - it would be handy to have
> them defined before to default substitute them while parsing.

Yes, or at least if it was allowed to interpret them in a way that
was equivalent to that.

> > * base URIs? Lots of <relativeURI> seen. @base?
>
> good one

There are more. I would suggest that, following how CSS does this,
there should be a way to set the content encoding of the document.
(What is the default? ASCII?) As I recall, it uses @encoding as the
first or early line of the document.

> I have some more:
>
> what about using some alternative character to '?' to identify
> variables? we found that the character '?' is conflicting/reserved by
> the SQL standard and treated specially by JDBC/ODBC interfaces - the
> '$' (dollar) sign might be a good alternative :-)

? works for me

> another related is context/provenance - which could be used as 4th
> component of the triple-patterns or using braces like N3 - any idea?

Not at present

> what about a pure/hybrid n-triples++ (with bArcs and 4th component I
> mean) syntax for triple-patterns? :-)

No thanks. bArcs means it isn't RDF, and a 4th triple component
guarantees that.
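
On the earlier point about a prefix that isn't defined till later, one
way to get a result equivalent to declaring prefixes up front is to
record qnames unexpanded while parsing and expand them only once every
declaration has been read. The following Python sketch shows the idea.
The helper names split_qname and resolve_all and the "ex" namespace are
hypothetical, and it does not settle whether <a:b> should be read as a
URI or as a qname.

```python
def split_qname(text):
    """Split 'prefix:local' into its two parts without expanding it."""
    prefix, sep, local = text.partition(":")
    if not sep:
        raise ValueError(f"{text!r} is not a qname")
    return prefix, local


def resolve_all(pending, prefixes):
    """Expand every recorded qname once all prefix declarations are known."""
    resolved = []
    for prefix, local in pending:
        if prefix not in prefixes:
            raise KeyError(f"undeclared prefix {prefix!r}")
        resolved.append(prefixes[prefix] + local)
    return resolved


# Qnames seen in the body of the query, before any prefix is declared.
pending = [split_qname("dc:title"), split_qname("ex:a")]

# Declarations that only arrive later, for example in a trailing USING clause.
prefixes = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://example.org/",  # hypothetical namespace
}

print(resolve_all(pending, prefixes))
```

The cost is that an undeclared prefix is only reported at the end of the
query rather than at the point of use.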

> cheers
>
> Alberto
>
> [1] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/jena/jena2/doc/RDQL/rdql_grammar.html
> [2] http://swordfish.rdfweb.org/rdfquery/tests/tests/rdql-tests-2003-04-10/
> [3] http://lists.w3.org/Archives/Public/www-rdf-rules/2003Apr/0030.html
> [4] http://rdfweb.org/people/damian/esquish/
> [5] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/rdfstore/rdfstore/lib/RDQL/Parser.pm
> [6] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/rdfstore/rdfstore/rdql.l
> [7] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/rdfstore/rdfstore/rdql.y
> [8] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/jena/jena2/src/com/hp/hpl/jena/rdql/parser/rdql.jjt

Received on Tuesday, 9 September 2003 11:10:47 UTC