
Re: Escape sequences (SPARQL and Turtle)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Wed, 30 Nov 2011 13:49:19 +0000
Message-ID: <4ED6345F.4030001@epimorphics.com>
To: public-rdf-dawg@w3.org

>> PS I have prototyped [1] in SPARQL and nothing broke nor were any
>> tests affected.
>
> OK, good to know. Does it do anything to the class of parser
> required to tackle SPARQL? I don't know how lexers etc. tackle
> escapes.

No change to the features required.

It's just a tokenizer change, and it works just like escapes in strings,
which we already have, so the token rules already handle the same thing
in a different situation.



Full details:

The change I made was to the rule for PN_LOCAL:

PN_LOCAL  ::=
   ( PN_CHARS_U | [0-9] ) ((PN_CHARS|'.')* PN_CHARS)?
==>
PN_LOCAL  ::=
   ( PN_CHARS_U | [0-9] | PLX ) ((PN_CHARS|'.'|PLX)* (PN_CHARS|PLX))?

where

PLX ::=   PERCENT | PLNE

PERCENT is for adding real, unescaped and hex-checked %-encodings 
(variant 2a in the RDF-WG message)

PERCENT ::=  "%" HEX HEX
HEX ::= [0-9] | [A-F] | [a-f]

and PLNE is the \-escapes: a \ followed by one of the characters:

    '\'
    ('~' | '.' | '-' | '!' | '$' | '&' | "'" |
     '(' | ')' | '*' | '+' | ',' | ';' | '=' |
     ':' | '/' | '?' | '#' | '@' | '%' )

Tedious to write out but that's all.

Better token names if done for real.
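
A minimal sketch of the token rules above as a Python regular expression, for illustration only. It assumes ASCII-only stand-ins for PN_CHARS_U and PN_CHARS (the real classes cover many Unicode ranges) and reads the optional tail of PN_LOCAL as ending in either PN_CHARS or PLX. The helper name is_pn_local is hypothetical.

```python
import re

# ASCII-only stand-ins for the SPARQL character classes. This is an
# assumption to keep the sketch short; the real classes are much wider.
PN_CHARS_U = r"[A-Za-z_]"
PN_CHARS = r"[A-Za-z_0-9-]"

HEX = r"[0-9A-Fa-f]"
PERCENT = rf"%{HEX}{HEX}"
# PLNE: a backslash followed by one of the characters listed in the message.
PLNE = r"\\[\\~.\-!$&'()*+,;=:/?#@%]"
PLX = rf"(?:{PERCENT}|{PLNE})"

# PN_LOCAL with PLX allowed at the start, in the middle, and at the end.
PN_LOCAL = re.compile(
    rf"(?:{PN_CHARS_U}|[0-9]|{PLX})"
    rf"(?:(?:{PN_CHARS}|\.|{PLX})*(?:{PN_CHARS}|{PLX}))?"
)


def is_pn_local(text: str) -> bool:
    """Return True if the whole string matches the sketched PN_LOCAL rule."""
    return PN_LOCAL.fullmatch(text) is not None
```

Under these assumptions, is_pn_local(r"b\:c") and is_pn_local("x%2Fy") hold, while is_pn_local("b:c") and is_pn_local("x%2G") do not: the colon needs its backslash, and the %-encoding is hex-checked.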

>
> I imagine that adding a load of characters to what you're allowed to
> put right of the : in a qname in SPARQL will break things?

Putting them in without a \ will break SPARQL 1.0 (unless we assume
whitespace around prefixed names) and SPARQL 1.1 property paths.


e.g. in SPARQL 1.0 (an extreme case):
    a:b:c:d.
which isn't a good idea but is legal. It tokenizes as
    a:b  :c  :d .
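
The effect can be sketched with a toy tokenizer in Python. This is only an illustration under stated assumptions: the character classes are ASCII-only approximations of the SPARQL 1.0 rules, only prefixed names and '.' are recognised, and make_tokenizer with its extra_local_chars parameter is a hypothetical helper. Passing ":" simulates allowing an unescaped colon in the local part.

```python
import re


def make_tokenizer(extra_local_chars: str = ""):
    """Build a toy tokenizer; extra_local_chars widens the local-name class."""
    extra = re.escape(extra_local_chars)
    prefix_chars = r"[A-Za-z0-9_-]"
    local_first = rf"[A-Za-z0-9_{extra}]"
    local_chars = rf"[A-Za-z0-9_{extra}-]"
    # A '.' may appear inside a name but never as its last character.
    pn_prefix = rf"[A-Za-z](?:(?:{prefix_chars}|\.)*{prefix_chars})?"
    pn_local = rf"{local_first}(?:(?:{local_chars}|\.)*{local_chars})?"
    token = re.compile(rf"\s*((?:{pn_prefix})?:{pn_local}|\.)")

    def tokenize(text: str) -> list:
        text = text.rstrip()
        tokens, pos = [], 0
        while pos < len(text):
            m = token.match(text, pos)
            if m is None:
                raise ValueError(f"cannot tokenize at offset {pos}")
            tokens.append(m.group(1))
            pos = m.end()
        return tokens

    return tokenize


strict = make_tokenizer()      # approximates the SPARQL 1.0 rule
loose = make_tokenizer(":")    # unescaped ':' allowed in the local part

print(strict("a:b:c:d."))
print(loose("a:b:c:d."))
```

With the strict rules the input splits into three prefixed names and a dot. With the loosened rules the same text becomes a single prefixed name followed by a dot, so an existing legal query would change meaning.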

If the new characters are only triggered by a preceding \, there's no
problem, because e.g. \: isn't legal anywhere at all at the moment (again,
like "strings \" with quotes").

The code needs to de-escape the string if your toolkit doesn't (JavaCC
doesn't, but the escape code is the same as for strings, so it already
existed for me).
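
A minimal sketch of that de-escaping step in Python, assuming the tokenizer hands back the raw lexical form of the local part. The function name unescape_local_name is hypothetical. The sketch drops the backslash of each \-escape and leaves %-encodings untouched, on the reading that they are meant to appear in the IRI as real, unescaped %XX sequences.

```python
import re

# A backslash followed by one of the escapable characters from the message.
_PLNE = re.compile(r"\\([\\~.\-!$&'()*+,;=:/?#@%])")


def unescape_local_name(lexical: str) -> str:
    """Drop the backslash from each \\-escape; leave everything else as is."""
    return _PLNE.sub(lambda m: m.group(1), lexical)


print(unescape_local_name(r"b\:c"))
print(unescape_local_name("x%2Fy"))
```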

	Andy
>
> - Steve
Received on Wednesday, 30 November 2011 13:49:52 GMT
