
Determining whether '<' is a beginning of IRI or 'less than' operator

From: Jiri Dokulil <dokulil@gmail.com>
Date: Fri, 18 Aug 2006 17:33:19 +0200
Message-ID: <6a8224ab0608180833r31f8c81flb4d0c3037286aab3@mail.gmail.com>
To: public-rdf-dawg-comments@w3.org

I am not sure how a scanner for SPARQL should determine whether a '<'
character it has encountered is the beginning of an IRI or a comparison
operator.

Consider these queries:

SELECT * WHERE { ?a ?b ?c, ?d . FILTER(?a<?b && ?c>?d) }
SELECT * WHERE { ?a ?b ?c, ?d . FILTER(?a<?b&&?c>?d) }

Yacker validator results look troubling to me:

The first query validates, the other does not.
My guess is that the validator uses a flex-like scanner that prefers
the longest token. In the first case, "<?b && ?c>" can't be scanned as
an IRI because of the spaces, so the scanner falls back and the
'less than' rule is picked.
In the second case, "<?b&&?c>" is a valid IRI (according to the
grammar), so the scanner emits it as a single token. But
'variable IRI variable' is not a valid FILTER condition, and the parser
rejects the query.
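
To make the guess above concrete, here is a minimal sketch of a longest-match scanner in Python. It is not the Yacker implementation, which I have not seen. The token names and the tokenize() helper are made up for illustration, and the IRI_REF pattern is only an approximation of the grammar's rule (no whitespace, '<', '>', '"', '{', '}', '|', '^', '`' or '\' between the angle brackets).

```python
import re

# Assumed token patterns, listed in priority order for ties.
# IRI_REF approximates the grammar rule; it is not copied from any validator.
TOKEN_PATTERNS = [
    ("IRI_REF", re.compile(r'<[^<>"{}|^`\\\x00-\x20]*>')),
    ("VAR", re.compile(r'[?$][A-Za-z0-9_]+')),
    ("AND", re.compile(r'&&')),
    ("LT", re.compile(r'<')),
    ("GT", re.compile(r'>')),
]
WHITESPACE = re.compile(r'\s+')


def tokenize(text):
    """Return (name, lexeme) pairs, always choosing the longest match."""
    tokens = []
    pos = 0
    while pos < len(text):
        ws = WHITESPACE.match(text, pos)
        if ws:
            pos = ws.end()
            continue
        best = None
        for name, pattern in TOKEN_PATTERNS:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie in length.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens


for condition in ("?a<?b && ?c>?d", "?a<?b&&?c>?d"):
    print([name for name, _ in tokenize(condition)])
```

With the spaces, the IRI_REF pattern fails at '<' and the scanner falls back to LT. Without them, IRI_REF matches "<?b&&?c>" as one token, which is longer than the single-character LT match, so the parser never sees a comparison.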

The problem is more obvious for scanners with one-character
look-ahead, because they are completely unable to distinguish these
two cases.
They also have the same problem with the () and [] tokens (the NIL and
ANON terminals), but that can easily be solved by going from LL(1) to
LL(2).
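
The look-ahead point can be sketched the same way. The classify_less_than() helper below is hypothetical and works on raw characters. It inspects at most a given number of characters after '<' and reports whether that window is enough to decide. The set of characters forbidden inside an IRI is an assumption based on the grammar rule.

```python
# Characters assumed not to be allowed inside an IRI reference:
# '<', '"', '{', '}', '|', '^', '`', '\' and everything up to U+0020.
FORBIDDEN_IN_IRI = set('<"{}|^`\\') | {chr(c) for c in range(0x21)}


def classify_less_than(text, pos, lookahead=None):
    """Classify the '<' at text[pos] as 'IRI', 'LT', or 'UNDECIDED'.

    lookahead limits how many characters after '<' may be inspected;
    None means the scan may run to the end of the input.
    """
    assert text[pos] == "<"
    if lookahead is None:
        end = len(text)
    else:
        end = min(len(text), pos + 1 + lookahead)
    for i in range(pos + 1, end):
        ch = text[i]
        if ch == ">":
            return "IRI"  # closing bracket reached with no forbidden character
        if ch in FORBIDDEN_IN_IRI:
            return "LT"   # this cannot be an IRI, so '<' is an operator
    # Window exhausted without an answer; at end of input no '>' can follow.
    return "LT" if end == len(text) else "UNDECIDED"


for condition in ("?a<?b && ?c>?d", "?a<?b&&?c>?d"):
    start = condition.index("<")
    print(classify_less_than(condition, start, lookahead=1),
          classify_less_than(condition, start))
```

With one character of look-ahead both conditions show only '?' after the '<', so neither can be classified. The decision needs a scan of unbounded length, up to the first '>' or the first character that cannot appear in an IRI.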

Jiri Dokulil
Received on Friday, 18 August 2006 16:10:56 UTC
