Re: possible syntax changes

On Tue, 31 May 2005 18:35:13 +0100, "Seaborne, Andy" <andy.seaborne@hp.com> wrote:

> 
> 
> Dave Beckett wrote:
> > I had a skim through
> > http://www.w3.org/2001/sw/DataAccess/rq23/#grammar
> > $Revision: 1.367 $ of $Date: 2005/05/31 08:31:36 $
> > 
> > 
> > and here are some comments
> > 
> > 1. Query verb tokens
> > 
> > tokens [5]-[8] SelectQuery...AskQuery have changed structure a lot.
> > I'm pretty unlikely to be able to implement them with my current
> > approach as this part seems to require a lot of look ahead.
> 
> Could you say where?
> 
> I parse these constructs with lookahead of one with an LL parser.  FROM/FROM 
> NAMED is a point that does need something but it is written the way it is to 
> just cover zero or one FROM and zero or more FROM NAMED.

I need to figure out what changed in those tokens since the last WD
and what it means, or you could tell me.  There seem to be a lot of
optional (0 or 1) tokens together in a construct.  Maybe I can use
the existing grammar I have but I can't tell without some further
investigation.
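
For what it's worth, here is a rough Python sketch of the one-token
lookahead Andy describes for FROM / FROM NAMED.  It is not taken from
either implementation: the function name, the pre-split token list and
the choice to accept the clauses in any order are all assumptions made
for illustration.

```python
def parse_dataset_clauses(tokens):
    """Parse zero or one FROM and zero or more FROM NAMED clauses.

    `tokens` is a hypothetical pre-split token list.  Returns
    (default_graph, named_graphs, next_position).
    """
    pos = 0
    default_graph = None
    named_graphs = []
    while pos < len(tokens) and tokens[pos].upper() == "FROM":
        pos += 1  # consume FROM
        # One token of lookahead decides between FROM and FROM NAMED.
        if pos < len(tokens) and tokens[pos].upper() == "NAMED":
            pos += 1  # consume NAMED
            if pos >= len(tokens):
                raise SyntaxError("expected a graph name after FROM NAMED")
            named_graphs.append(tokens[pos])
        else:
            if pos >= len(tokens):
                raise SyntaxError("expected a graph name after FROM")
            if default_graph is not None:
                raise SyntaxError("more than one FROM clause")
            default_graph = tokens[pos]
        pos += 1  # consume the graph name
    return default_graph, named_graphs, pos
```

The point is only that the decision is made on the single token after
FROM, so the run of optional clauses needs no deeper lookahead.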

> There is no necessity to use exactly the same grammar to get the same language. 
>   There are many grammars for one language.
> 
> [PS http://compilers.iecc.com/comparch/article/05-04-059]

Clearly.


> > 
> > 
> > 2. Optional WHERE
> > 
> > WHERE is still optional.  I'd prefer it were not, won't teach it as
> > optional, and will give syntax warnings when I recognise a query
> > without it.  It just ends up allowing bizarro queries.

Any comment?

> > 
> > 3. Optional '.'s
> > 
> > Too many of these; I can't recall if I've caught them all.  If
> > supporting one makes my grammar have shift/reduce conflicts or
> > require excessive lookahead, I probably won't support it.
> 
> What counts as excessive?  Is it unbounded?

I recall getting a large number of extra conflicts when I added
optional '.'s somewhere, so I backed out for now.  Nobody has
noticed so far.

> It is, of course, your choice as to what you implement.
> 
> > 
> > 
> > 4. Casting support
> > 
> > We lost the foo:bar() vs &foo:bar() distinction some grammars back;
> > it was the syntax separating a casting operation from an extension
> > function.
> > 
> > I think overloading these is a mistake.  The spec defines a set of
> > required casting operations named for their datatype like xsd:byte()
> > This means you can take any literal and make a datatype of that form,
> > with the value that the datatype defines.
> > 
> > In RDF itself, you can easily make a lexical form for any RDF
> > datatyped literal in RDF/XML or other syntax, and you can do that
> > in SPARQL if you only want to talk about a constant datatype:
> > "abc"^^dave:type.
> > 
> > However, if you want to use it in expressions you can't do much
> > except use constants.  So whereas:
> >   FILTER "abc"^^dave:type = "abc"^^dave:type
> > is true by RDF rules, you can't take a variable ?x with the literal
> > value "abc" and do:
> > 
> >   FILTER ?x^^dave:type = "abc"^^dave:type
> > 
> > however, if dave:type is one of the built-ins, you can:
> >   FILTER xsd:integer(?x) = "10"^^xsd:integer
> > 
> > but you could previously, when we had syntax for it:
> >   FILTER dave:type(?x) = "abc"^^dave:type
> > 
> > which does the *make an rdf datatyped literal* operation but is now
> > used solely for functions.
> > 
> > So, I'd like a new operator CAST
> >   CAST(URI u, Unicode string s) returning what would be written as s^^u
> 
> Does this include:
> 
> CAST(?x, "foobar")
> 
> that is, a dynamic cast?  

Of course.  A static cast would be of little use.
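
To make the request concrete, here is a minimal Python sketch of what
I mean by a dynamic CAST.  Everything in it is illustrative: the
TypedLiteral class, the cast() function and the example.org URI
standing in for dave:type are invented names, and equality is plain
term equality (same lexical form, same datatype URI) with no
value-space comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedLiteral:
    lexical: str   # lexical form, e.g. "abc"
    datatype: str  # datatype URI


def cast(datatype_uri: str, lexical: str) -> TypedLiteral:
    """Hypothetical CAST(u, s): build what would be written as s^^u."""
    return TypedLiteral(lexical=lexical, datatype=datatype_uri)


# Made-up URI standing in for dave:type.
DAVE_TYPE = "http://example.org/dave#type"

x = "abc"                                   # value bound to ?x at query time
constant = TypedLiteral("abc", DAVE_TYPE)   # "abc"^^dave:type

# Roughly: FILTER CAST(dave:type, ?x) = "abc"^^dave:type
matches = cast(DAVE_TYPE, x) == constant
```

The datatype does not have to be one of the built-ins; the literal is
built at query time from whatever ?x is bound to.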

> This seems more like a constructor than a cast as the 
> value is changed.  This would make compiling to SQL quite hard, I guess.
>
> What about language tags? Do you want
> 
> lang("chat", "fr")?
> 
> to complete the set?

Not particularly, somebody else might.
 
> > 
> > 
> > 5. Turtle parts of grammar 
> > 
> > Around tokens [28] to [44] I gave up trying to follow the grammar and
> > instead used the Turtle grammar itself and the lex+yacc code I
> > already had for that.  
> 
> OK - note that Turtle requires whitespace where N3 does not:
> 
> N3:: <s><p><o>.
> Turtle:: <s> <p> <o>.

Yes.  The design is that all Turtle is N3 as far as is possible, not the
other way around.

Whitespace is a very minor point compared to SPARQL grammar terms
such as PropertyList, PropertyListNotEmpty and PropertyListTail; I
have no idea why that structure exists.  The grammar has no discussion,
so either it has to be obvious what's going on and what structures to
make, or you lose.

> also SPARQL went for XML qnames as per the discussion in section 8
> not as per the grammar which limits to letters in a-z

My SPARQL grammar uses the SPARQL definitions of the names and var names
as they exist as tokens around [58] (depending on how you count the numbers).

This is also a minor point compared to how the tokens are used to
describe the triple and abbreviations, such as the GraphNode and
TriplesNode tokens in the SPARQL grammar.  Again, I'm not sure why that
is different from Turtle's EBNF.  Maybe you started from N3 and not
Turtle, which would probably allow things that aren't Turtle
even when you avoid $vars and []s in predicates.  I already gave one
example:

SPARQL and not Turtle:  [ :a :b ] .

this is writable in Turtle as
   [] :a :b .
and more clearly shows the triples.

> [re: @charset: We have had bad experiences with declaring charsets
> inside a file (e.g. it can disagree with the MIME type; it might be
> opened with a I/O stream that already does conversion (usually
> wrongly!)).]

Ok, that's a point for the Turtle document.  I wasn't planning to add
it and this gives a good reason why not to.

<snip/>

Dave

Received on Wednesday, 1 June 2005 10:09:45 UTC