- From: Dan Connolly <connolly@w3.org>
- Date: Mon, 29 Aug 2005 08:35:44 -0500
- To: andy.seaborne@hp.com
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
On Wed, 2005-08-24 at 16:23 +0100, Seaborne, Andy wrote:
> In response to comments on the grammar and escapes, I have updated the rq23
> v1.470 grammar section.

Ah... great...

> The grammar is LL(1),

I don't see that stated in the document.
http://www.w3.org/2001/sw/DataAccess/rq23/#grammar

I think it's very valuable to let people know.

It's the grammar excepting the (suggested) TERMINALS that's LL(1), yes?

> addressing Richard Newman's and Tim Berners-Lee's comments
> on the grammar.
>
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Aug/0055.html
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Aug/0067.html
>
> Also addressed is Walid Maalej's comment on variable names and leading digits.
> It does not make variable names full NCNAMEs because that would include "-" and "."
>
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Aug/0038.html

> Changes:
>
> 1/ Triples rule changes : this is the last thing that stopped it being LL(1)

When discussing this with yosi, I discovered that

  SELECT ?x WHERE {.}

was in the language of the LC grammar. Does this new grammar allow that?
I sorta prefer that it does not, but I think we owe the world a test case
to show that we made the change on purpose. Volunteers?

> 2/ There is an explicit rule for IRIRefOrFunction() in expressions to make it
> clearer about this case (Dave's comment)
>
> 3/ IRI references are: '<' ([^<>]-[#00-#20])* '>'
> that is, excludes some characters but is not a full IRI grammar.

Very well.

> There is also text in the grammar section to say that IRI must be valid so no
> <a###b>.

Hmm...

"Any IRI references in a SPARQL query string must valid according to
RFC 3987 [RFC3987] and RFC 3986 [RFC3986]."

So <a##b> makes it not a SPARQL query string, rather than saying
that it _is_ a SPARQL query string with an error and hence this spec
doesn't define anything else about it, like its abstract form or
what the corresponding results are.
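
To make the gap between the lexical rule in 3/ and the RFC validity
requirement concrete, here is a minimal Python sketch. It is not taken from
rq23 or from any implementation. The function names are hypothetical, and the
second check is only one necessary condition from RFC 3986 (a fragment cannot
itself contain '#'), not a full RFC 3986/3987 validator.

```python
import re

# Lexical rule quoted above: '<' ([^<>]-[#00-#20])* '>'
# i.e. any characters except '<', '>' and the code points #x00 to #x20.
IRI_REF = re.compile(r"<[^<>\x00-\x20]*>")


def is_iri_ref_token(text: str) -> bool:
    """True if the whole string matches the IRI reference terminal."""
    return IRI_REF.fullmatch(text) is not None


def has_at_most_one_hash(text: str) -> bool:
    """Partial validity check only (an assumption of this sketch):
    RFC 3986 does not allow '#' inside a fragment, so a valid
    reference contains at most one '#'."""
    return text.count("#") <= 1


if __name__ == "__main__":
    for candidate in ["<http://example.org/a#b>", "<a##b>", "<a b>"]:
        print(candidate,
              "token:", is_iri_ref_token(candidate),
              "one-hash check:", has_at_most_one_hash(candidate))
```

Under these assumptions, <a##b> is accepted by the terminal but fails the
extra check, which is exactly the case the quoted sentence rules out.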
I wonder what the protocol implications of that are. I think it
means servers have to check the spelling of URIs and must not
return a 200 OK in this case. Does anybody currently do that?

> 4/ Removed rule RDFTerm (again!) which was never used.
>
> 5/ Escapes: the grammar itself has rules for handling \t etc in strings but the
> Unicode codepoint escapes (\u and \U) are not included in the grammar because it
> would require enumerating everything twice, once for the plain character, once
> for the \u form.
>
> \u and \U are allowed in variable names, qnames, strings and IRIs.

Hmm... that seems to say that we're using a notation that's very
similar to the XML 1.1 grammar notation, but with a few tweaks.
The sections on comments, keywords, whitespace and escapes are
grammar notation tweaks.

> (A practical alternative would be to allow \u forms, not restrict the codepoint
> space, and have text to cover things like "don't put \u0020 in an IRI").
>
> This grammar has no local lookahead and has been checked for LA requirements
> with JavaCC, it has been fed to yacker (it's grammar "afs1"
> http://www.w3.org/2005/01/yacker/uploads/afs1/bnf?lang=perl
> except from (3) above the character class difference isn't supported so it is a
> slightly weaker '<' ([^<>])* '>' . Yacker produces bison, yacc and Perl-based
> parsers with no errors.

Please let's share that info with the world. Let's publish
those bison, yacc, and perl-based parsers as non-normative
linked files. And turtle, if it's not much trouble.

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541 0875 0F91 96DE 6E52 C29E
Received on Monday, 29 August 2005 13:35:47 UTC