W3C home > Mailing lists > Public > www-archive@w3.org > February 2003

RE: N3 paths : use of .

From: Seaborne, Andy <Andy_Seaborne@hplb.hpl.hp.com>
Date: Fri, 14 Feb 2003 16:59:15 -0000
Message-ID: <5E13A1874524D411A876006008CD059F061D6521@0-mail-1.hpl.hp.com>
To: "'Dan Connolly'" <connolly@w3.org>
Cc: "'timbl@w3.org'" <timbl@w3.org>, "'www-archive@w3.org'" <www-archive@w3.org>, "'Sean B. Palmer'" <sean@mysterylights.com>

Dan,

Yes - I used those grammars in creating mime for antlr :-)  Thanks for
saving me time.

> We were trying to replace the notation3.py hairball
> with that, but we ran into a snag; I can't remember
> what it was.

Do you still intend to replace notation3.py with a yapps system or is that
on hold? If you do remmber, I'd be interested.

> I have made my peace with some look-ahead/nastiness in the lexer.

Agreed - I have had to so some messing around in the lexer - it was either
that or have the  lexer produce very low level terms like "sequence of
chars".  One case was qname prefixes - I tried to make QName a lexer term
but keywords (this, of, a, is, has) and qnames and prefix declarations are
all bare words so I had to have some semantic lookahead.  @langtag and
@prefix clash in the same way.

I think yapps provides more context control that antlr in the set of legal
tokens (reading its web page).  antlr, by default, likes a lex-like split -
the choice of token for the same sequence of characters is fixed, and  can
be identified without grammar context.  I have tokens for keywords so "this"
and "thisisnt:" needs 5 chars of processing.  Default lookahead is 3 chars
to deal with start of """ strings.  I haven't been worried about speed.

At the grammar level, things went smoothly except for actually generating
the triples composing a path which does not heppen in parser traverse order.
No big deal.

	Andy 

-----Original Message-----
From: Dan Connolly [mailto:connolly@w3.org] 
Sent: 14 February 2003 14:44
To: Seaborne, Andy
Cc: 'timbl@w3.org'; www-archive@w3.org; Sean B. Palmer
Subject: Re: N3 paths : use of .


[+cc seanb, www-archivr]

On Tue, 2003-02-11 at 12:16, Seaborne, Andy wrote:
> Tim, Dan, 
> 
> I came across this while updating Jena's N3 parser:
> 
> http://www.w3.org/2000/10/swap/doc/Shortcuts.html
[...]


> 2/ Writing parsers: I do not claim to much experise with parser genrators
> but having a syntax that is whitespace sensitive makes it hard to use
> standard parser tools antlr, javacc, yacc/flex etc.  (Don't know yapps
well
> enough.)  Similarly, context sensitive tokenization is not so simple in
> these tools.

I have made my peace with some look-ahead/nastiness in the lexer.
I still have hope for a tool-generated parser. Have
you seen this?

  RDF Notation3 Grammar
  http://www.w3.org/2000/10/swap/rdfn3-gram
  http://www.w3.org/2000/10/swap/rdfn3-gram.html
  http://www.w3.org/2000/10/swap/rdfn3.g

We were trying to replace the notation3.py hairball
with that, but we ran into a snag; I can't remember
what it was.


-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Friday, 14 February 2003 11:59:54 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Monday, 7 July 2008 08:08:50 GMT