Re: Escape sequences (SPARQL and Turtle)

>> PS I have prototyped [1] in SPARQL and nothing broke nor were any
>> tests affected.
>
> OK, good to know. Does it do anything to the class of parser
> required to tackle SPARQL? I don't know how lexers etc. tackle
> escapes.

No change to the features required.

It's just a tokenizer change, and it works just like escapes in strings,
which we already have, so the token rules already handle this in a
different situation.



Full details:

The change I made was to the rule for PN_LOCAL:

PN_LOCAL  ::=
   ( PN_CHARS_U | [0-9] ) ((PN_CHARS|'.')* PN_CHARS)?
==>
PN_LOCAL  ::=
   ( PN_CHARS_U | [0-9] | PLX ) ((PN_CHARS|'.'| PLX)* PN_CHARS | PLX)?

where

PLX ::=   PERCENT | PLNE

PERCENT is for adding real, unescaped and hex-checked %-encodings 
(variant 2a in the RDF-WG message)

PERCENT ::=  "%" HEX HEX
HEX ::= [0-9] | [A-F] | [a-f]

and PLNE is the \-escapes: a \ followed by one of the characters:

    '\'
    ('~' | '.' | '-' | '!' | '$' | '&' | "'" |
     '(' | ')' | '*' | '+' | ',' | ';' | '=' |
     ':' | '/' | '?' | '#' | '@' | '%' )

Tedious to write out but that's all.

The tokens would get better names if this were done for real.
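
As a rough illustration of the rule change above, here is a minimal Python
sketch of the new PN_LOCAL written as a regular expression. It is not the
prototype code. The ASCII-only character classes are stand-ins for the real
PN_CHARS_U and PN_CHARS, the PLNE listing is read as a backslash followed by
one character from the parenthesised set, the rule is transcribed literally,
and the name is_pn_local is made up for the example.

```python
import re

# ASCII-only stand-ins for the SPARQL character classes (assumption: the
# real PN_CHARS_U and PN_CHARS also cover many non-ASCII ranges).
PN_CHARS_U = r"[A-Za-z_]"
PN_CHARS = r"[A-Za-z0-9_-]"

# PERCENT ::= "%" HEX HEX
HEX = r"[0-9A-Fa-f]"
PERCENT = rf"%{HEX}{HEX}"

# PLNE: a backslash followed by one of these characters.
PLNE_CHARS = "~.-!$&'()*+,;=:/?#@%"
PLNE = r"\\[" + re.escape(PLNE_CHARS) + "]"

# PLX ::= PERCENT | PLNE
PLX = rf"(?:{PERCENT}|{PLNE})"

# Literal transcription of the new PN_LOCAL rule.
PN_LOCAL = re.compile(
    rf"(?:{PN_CHARS_U}|[0-9]|{PLX})"
    rf"(?:(?:{PN_CHARS}|\.|{PLX})*{PN_CHARS}|{PLX})?"
)


def is_pn_local(text: str) -> bool:
    """Return True if the whole of text matches the sketched PN_LOCAL."""
    return PN_LOCAL.fullmatch(text) is not None
```

With this sketch, r"a\:b" and "%41bc" are accepted, while "a:b" (bare colon)
and "a%4" (incomplete %-encoding) are rejected.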

>
> I imagine that adding a load of characters to what you're allowed to
> put right of the : in a qname in SPARQL will break things?

Putting them in without a \ will break SPARQL 1.0 (unless we assume
whitespace around prefixed names) and SPARQL 1.1 property paths.


e.g. in SPARQL 1.0, an extreme case is
    a:b:c:d.
which isn't a good idea but is legal - it parses as
    a:b  :c  :d .
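
The split can be reproduced with a toy tokenizer. This is only a sketch
under assumptions: the character classes are ASCII-only approximations of
the SPARQL 1.0 token rules, only prefixed names and '.' are recognised, and
tokenize and extra_local_chars are names invented for the example. Passing
":" as an extra local character shows how allowing it unescaped changes the
meaning of the same input.

```python
import re

# ASCII-only approximation of PN_PREFIX (assumption).
PN_PREFIX = r"[A-Za-z](?:[A-Za-z0-9_.-]*[A-Za-z0-9_-])?"


def tokenize(text: str, extra_local_chars: str = "") -> list[str]:
    """Split text into prefixed-name and '.' tokens, skipping anything else.

    extra_local_chars lists characters to allow, unescaped, in the local part.
    """
    extra = re.escape(extra_local_chars)
    first = rf"[A-Za-z0-9_{extra}]"
    inner = rf"[A-Za-z0-9_.{extra}-]"
    last = rf"[A-Za-z0-9_{extra}-]"
    # Old-style PN_LOCAL: first char, then optionally a run that must not
    # end in '.'.
    pn_local = rf"{first}(?:{inner}*{last})?"
    token = re.compile(rf"(?:{PN_PREFIX})?:(?:{pn_local})?|\.")
    return token.findall(text)


print(tokenize("a:b:c:d."))                         # three names and a dot
print(tokenize("a:b:c:d.", extra_local_chars=":"))  # one name and a dot
```

With the original local-name characters the input yields a:b, :c, :d and
the final dot. Once ':' is allowed bare in the local part, the same text
becomes the single name a:b:c:d followed by the dot.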

If the new characters are triggered by a preceding \, there's no problem,
because e.g. \: isn't legal anywhere at all (again, like "strings \" with
quotes").

The code needs to de-escape the string if your toolkit doesn't (JavaCC
doesn't, but the de-escaping code is the same as for strings, so it
already existed for me).
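
For illustration, a de-escaping step might look like the following Python
sketch. It is not the JavaCC-based code referred to above, and the name
unescape_local is hypothetical. It assumes the escapable set listed for
PLNE, drops the backslash from each escape, and leaves %-encodings exactly
as written.

```python
import re

# Characters that may follow the backslash in a PLNE escape.
PLNE_CHARS = "~.-!$&'()*+,;=:/?#@%"
_ESCAPE = re.compile(r"\\([" + re.escape(PLNE_CHARS) + r"])")


def unescape_local(local: str) -> str:
    """Remove the backslash from each \\-escape in a local name.

    %-encodings such as %41 are not decoded; they pass through unchanged.
    """
    return _ESCAPE.sub(r"\1", local)


print(unescape_local(r"a\:b"))
print(unescape_local("a%41b"))
```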

	Andy
>
> - Steve

Received on Wednesday, 30 November 2011 13:49:52 UTC