Re: CWM Tokenization Error

Sean B. Palmer wrote:

> Yosi Scharf wrote:
>
>> Is that actually expected?
>
>
> I'm fairly sure that it is, based on the documentation in n3.n3, but of
> course one can easily make the case that the expected result is whatever
> CWM does. Here's the relevant piece:
>
Here is the point I originally wanted to make: if you did in fact fix
this, then e.n3:
@keywords a_m . d a_m .
would still fail in n3p.py, because a_m is a valid barename but not a
valid keyword (and, besides, there is no triple). Running it through
n3pp.py would produce:
@keywords .
 :d @a_m .
which would parse by your reasoning below. So that would become another
case of n3pp.py "fixing" invalid n3.
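
To make that reasoning concrete, here is a rough Python sketch of a
tokenizer under the assumption Sean states below: underscore cannot
appear in an @-keyword but can start a barename. The TOKEN pattern and
the tokenize function are hypothetical, not the actual n3p.py, n3pp.py,
or CWM code, and they cover only the three token kinds needed here:

```python
import re

# Hypothetical token patterns, not taken from n3p.py or CWM.
# Assumption: an @-keyword is letters only (no underscore), while a
# barename may start with or contain an underscore.
TOKEN = re.compile(r"""
    (?P<atword>@[A-Za-z]+)
  | (?P<barename>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<dot>\.)
""", re.VERBOSE)


def tokenize(text):
    """Return (kind, lexeme) pairs, skipping whitespace between tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"cannot tokenize at offset {pos}: {text[pos:]!r}")
        # lastgroup names whichever alternative matched at this position.
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Under those assumptions, tokenize("@a_m") and tokenize("@a _m") give the
same two tokens, while a_m without the @ stays a single barename.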

Yosi

> [[[
> # tokenizing:
> # Absorb anything until end of regexp, then stil white space
> # [...]
> #  WS MUST be inserted between tokens where ambiguity would arise.
> #  (possible ending characters of one and beginning characters overlap)
> ]]] - http://www.w3.org/2000/10/swap/grammar/n3.n3
>
> Since underscore isn't allowed in declarations and language codes, I
> think that "@a_m" is unambiguously equivalent to "@a _m". Again,
> according to the RDF BNF, underscore can start a prefixless QName.
>
> This is also broken in n3p.py (and predictiveParser.py, from which it's
> derived) as you rightly point out; 'tis next on my n3p todo list. I'd
> race you to fix it, CWM vs. n3p, but I'm sure you'd win :-)
>
> Cheers,
>

Received on Monday, 17 January 2005 17:08:34 UTC