- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Wed, 16 Mar 2005 18:34:50 +0000
- To: "Thompson, Bryan B." <BRYAN.B.THOMPSON@saic.com>
- Cc: 'Eric Prud'hommeaux ' <eric@w3.org>, "'public-rdf-dawg@w3.org '" <public-rdf-dawg@w3.org>
Thompson, Bryan B. wrote:
> Per Andy's request, I started on migration of the parser implementation
> to the Editor's Draft of SPARQL. I spent the morning on this and I have
> summarized some questions below that showed up during that time. However,
> I think that I am going to back off and continue with the last working
> draft as the basis for my continuing efforts since I am more interested
> in exploring SPARQL semantics, since migrating to the new grammar is
> probably best done by a re-write (if I was really going to vet the
> grammar in the Editor's Draft), and since I don't want to have to re-vet
> the grammar multiple times as the draft is edited.

The changes to the grammar should now be limited to anything coming out of
the sorting discussions. I hope you will continue to provide review and
feedback - early working group feedback is very helpful.

> Finally, from the
> perspective of semantics, most syntax changes (e.g., the turtle syntax)
> are not a big deal and it feels like a lot of effort to track a moving
> document.
>
> That said, I would be happy to do a migration to the Editor's draft
> once it gets into a "feature freeze" state and before it is released
> to last call. At that time I should be able to provide feedback not
> only on the grammar, but also on the semantics.
>
> Some questions on Editor's Draft.
>
> ? Production [3] specifies <SparqlParserBase>, which is not a defined
> lexical production.

Fixed - a side effect of running cpp over the grammar with -DBASE=... :-)
which makes sure UNSAID does not creep back in.

>
> ? Production [56] (Q_URIRef) appears to have a whitespace character in
> the [^> ] expression so that a whitespace character is not permitted
> within the production. However this is not clear on visual
> inspection of the production.

^ is the "not" character - that expression means "not space or >".
Spaces cannot appear in URIs.

>
> ? Production 57 (QNAME_NS) permits ":" as a valid QNAME_NS since the
> NCNAME_PREFIX is optional in the grammar. Is this an error? If
> not, it makes the PrefixDecl production ambiguous.

Simplified to just the first rule.

>
> ? Production 58 (QNAME) creates an ambiguity in the grammar since QNAME
> permits "<QNAME_NS> :" without any trailing context. This ambiguity
> can be resolved in several ways. For example, by making the "(
> NCNAME1 | NCNAME2 )" production non-optional for QNAME.

I think this is an ANTLR-ism. Tokenizing in the usual flex/javacc way, with
greedy consumption of input, does not have this problem as far as I know.
I have made a change that should remove it anyway [*]; see below.

Aside: as you are using ANTLR, you can do either syntactic or semantic
lookahead, but then you may wish to make more wholesale changes to the token
rules and reduce the number of token productions anyway.

>
> ? Production 58 (QNAME) would allow ":foo" as a QName. This is NOT a
> legal XML QName. If the intention is to permit such constructions,
> then the use of "QName" may prove confusing to implementors.

":foo" is legal, as are "foo:" and ":".

Yes, they are not XML QNames. But they are so widely referred to as qnames
in the semantic web community that it would also be confusing to invent a
new term.

>
> ? Production 51 (QName) This production causes conflicts in the
> grammar. I modified the production to "(NCNAME_PREFIX)? COLON (
> NCNAME1 | NCNAME2 )", which requires something after the COLON and
> which I believe supports the uses of QName in the grammar.

[*] This is related to the above. I modified QNAME (the token, not the
grammar rule QName) along the lines suggested. I defined a token NCNAME as
(NCNAME1 | NCNAME2) and used that throughout.

Aside: NCNAME1 and NCNAME2 are with and without a leading "_" because only
one kind is legal for prefixes, but both are legal as local names. qnames
can't start with _ because that looks like a blank node. There are other
fun and games to exclude trailing dots in qnames, as per the WG decision.

>
> ? Productions 59 (BNODE) and 60 (BNODE_LABEL) are identical. Note
> that production 59 (BNODE) is not used and should presumably be
> dropped.

Removed BNODE - I had changed the name and didn't remove the definition in
the formatting system.

>
> Thanks,
>
> -bryan
>

Thanks for the feedback. I'll need to go back and check, but with the
changes I described, the grammar passes the syntax tests I have.

Bryan (and anyone else) - do you have any syntax test cases? If so, I'd be
happy to collect them all together, or you can add them to the DAWG test
suite.

	Andy
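
Illustration of the token points discussed above. This is a rough sketch, not the draft's actual token definitions: the name character classes are ASCII-only approximations, PREFIX, LOCAL and longest_match are hypothetical names, and QNAME_NS is shown in the "optional prefix plus colon" reading that Bryan describes. It shows that [^> ] rejects a space inside a URI reference, and that a greedy, longest-match tokenizer picks QNAME over QNAME_NS whenever a local part follows the colon.

```python
import re

# Assumption: ASCII-only stand-ins for the draft's NCName character classes.
PREFIX = r"[A-Za-z][A-Za-z0-9\-]*"    # hypothetical: prefixes have no leading "_"
LOCAL = r"[A-Za-z_][A-Za-z0-9_\-]*"   # hypothetical: local names may start with "_"

# Q_URIRef: "<", then characters that are neither ">" nor a space, then ">".
Q_URIREF = re.compile(r"<[^> ]*>")

# QNAME along the lines suggested: optional prefix, colon, required local part.
QNAME = re.compile(rf"(?:{PREFIX})?:{LOCAL}")

# Namespace-only form (assumed reading): optional prefix followed by a colon.
QNAME_NS = re.compile(rf"(?:{PREFIX})?:")


def longest_match(text):
    """Return (token_name, lexeme) for the longest token at the start of text, or None."""
    candidates = [("Q_URIREF", Q_URIREF), ("QNAME", QNAME), ("QNAME_NS", QNAME_NS)]
    best = None
    for name, pattern in candidates:
        m = pattern.match(text)
        # Keep the candidate that consumes the most input (greedy tokenizing).
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best
```

With these assumed patterns, longest_match("foo:bar") yields a QNAME, while longest_match("foo: <http://example.org/>") yields only the QNAME_NS "foo:", because nothing that can start a local name follows the colon.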
Received on Wednesday, 16 March 2005 18:35:15 UTC