Re: Keeping PrEfIx and BaSe Proposals

On 30/05/13 03:28, Eric Prud'hommeaux wrote:
> * Sandro Hawke <sandro@w3.org> [2013-05-29 20:26-0400]
>> On 05/29/2013 12:29 PM, Gavin Carothers wrote:
...

>>> Example grammar change from gkellog:
>>>
>>> [4] prefixID ::= '@'? [Pp][Rr][Ee][Ff][Ii][Xx] PNAME_NS IRIREF "."?
>>> [5] base ::= '@'? [Bb][Aa][Ss][Ee] IRIREF "."?
>>>
>>
>> There's a lot to be said for that, yes.
>
> Is the intention that these all be valid?
>    prefix : <> PREfix : <>
>    prefix : <> . PREfix : <> .
>    @ prefix : <> @ PREfix : <>
>    @ prefix : <> . @
>    PREFIX : <>
>    .
>
> Grammar nit: I like that SPARQL separates tokenizing from parsing (as
> does Turtle). We could follow suit with:
>
>      prefixID ::= '@'? PREFIX PNAME_NS IRIREF "."?
>      base ::= '@'? BASE IRIREF "."?
>      Terminals:
>      PREFIX ::= [Pp][Rr][Ee][Ff][Ii][Xx]
>      BASE ::= [Bb][Aa][Ss][Ee]
>
> or we use our current approach:
>
>      Keywords in single quotes ('@base', '@prefix', 'a', 'true', 'false') are case-sensitive. Keywords in double quotes ("BASE", "PREFIX") are case-insensitive.
>
> by striking the '@base', '@prefix'.

1/ The wider range of valid input (Eric's point) + a worse case.
2/ Problems with token LANGTAG

1/ ==>
Eric - good catch.

It's more than a grammar nit.

Because this is a grammar rule and not a token rule, it allows whitespace
between @ and prefix, whereas with @prefix as a single token no whitespace
is possible. -1 to that; I'm not sure Gregg intended it.

I really don't like
---------
@
prefix : <http://example/> .
---------

Actually, as Gregg/Gavin originally wrote it, as a grammar rule, whitespace
can occur between any two tokens and each letter is a token, so it allows

@ p r e f i x : <http://example/>.

and

@ p r e
f i x : <http://example/>.

and I'm fairly confident that was not intended.
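
A minimal sketch, in Python, of the difference between the two readings.
It assumes only that whitespace (including newlines) may be skipped
between any two terminals of a grammar rule; the names WS, RULE_LEVEL,
TOKEN_LEVEL and accepts are made up for illustration and it is not a
Turtle parser.

```python
import re

# Assumption: whitespace, including newlines, may appear between any two
# terminals of a grammar rule.
WS = r"\s*"

# Grammar-rule reading: '@'? and each of [Pp], [Rr], ... is a terminal of
# its own, so optional whitespace is allowed after every one of them.
letters = ["[Pp]", "[Rr]", "[Ee]", "[Ff]", "[Ii]", "[Xx]"]
RULE_LEVEL = re.compile(WS + "@?" + WS + WS.join(letters))

# Token reading: '@prefix' is one terminal with no internal whitespace.
TOKEN_LEVEL = re.compile(WS + "@prefix")

def accepts(pattern, text):
    """Return True if the keyword pattern matches at the start of text."""
    return pattern.match(text) is not None

spaced = "@ p r e\nf i x : <http://example/>."
print(accepts(RULE_LEVEL, spaced))   # True
print(accepts(TOKEN_LEVEL, spaced))  # False
```

Both readings accept the ordinary "@prefix : <http://example/> ." form;
only the grammar-rule reading accepts the spaced-out ones.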

2/ ==>

Technical point:

LANGTAG is a token:

[144s] 	LANGTAG 	::= 	'@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*

and so tokenization will grab '@prefix'

The LC grammar called out '@prefix' as a specific token, which avoids
both problems: it does not allow internal whitespace, horizontal or
vertical, and '@prefix' is not picked up by the LANGTAG token.
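
A small Python sketch of the LANGTAG clash. The LANGTAG pattern is
transcribed from production [144s] above; the KEYWORD pattern, the
first_token helper and the "keywords are tried first" rule are
assumptions made for illustration, not how any particular Turtle
implementation tokenizes.

```python
import re

# LANGTAG production [144s], transcribed as a Python regular expression.
LANGTAG = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")

# Hypothetical keyword token, standing in for a grammar that names
# '@prefix' and '@base' as terminals of their own.
KEYWORD = re.compile(r"@(?:prefix|base)")

def first_token(text, keyword_tokens=True):
    """Return (token name, lexeme) for the token at the start of text."""
    if keyword_tokens:
        # Assumption: keyword tokens take priority over LANGTAG.
        m = KEYWORD.match(text)
        if m:
            return ("KEYWORD", m.group(0))
    m = LANGTAG.match(text)
    if m:
        return ("LANGTAG", m.group(0))
    return None

line = "@prefix : <http://example/> ."
print(first_token(line, keyword_tokens=False))  # ('LANGTAG', '@prefix')
print(first_token(line, keyword_tokens=True))   # ('KEYWORD', '@prefix')
```

Without a dedicated keyword token, a parser expecting '@' followed by
PREFIX never sees those two terminals, because the tokenizer has already
consumed "@prefix" as a LANGTAG.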

The grammar is supposed to be simple for easy implementation in 
handwritten, LL, and LALR styles.

Eric's existing design (token for @prefix and @base) is better.

	Andy

>
>
>>        -s
>>> Cheers,
>>> Gavin
>>
>>
>

Received on Thursday, 30 May 2013 09:54:24 UTC