Re: Turtle local-name escapes

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Thu, 02 Feb 2012 22:27:14 +0000
Message-ID: <4F2B0DC2.4000802@epimorphics.com>
To: public-rdf-wg@w3.org


On 02/02/12 18:59, Alex Hall wrote:
> I have a couple of questions about character escapes in the local part
> of Turtle prefixed names
> (http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/index.html#sec-grammar):
>
> [57] <PN_LOCAL>   ::=   ( PN_CHARS_U | [0-9] | PLX ) (( (( PN_CHARS |
> "." | PLX ))* ( PN_CHARS | PLX ) ))?
> [58] <PLX>   ::=   PERCENT | PN_LOCAL_ESC
> [59] <PERCENT>   ::= "%" HEX HEX
> [60] <HEX>   ::=   [0-9] | [A-F] | [a-f]
> [61] <PN_LOCAL_ESC>   ::= "\\" ( "_" | "~" | "." | "-" | "!" | "$" |
> "&" | "'" | "(" | ")" | "*" | "+" | "," | ";" | "=" | ":" | "/" | "?" |
> "#" | "@" | "%" )
>
> 1. I'm a bit confused about the inclusion of %-encoded octets. It's not
> intended that the pname expansion also decode these octets, is it?

(/me speaking as SPARQL grammar creator)

No. xyz:ab%20de puts the three characters %, 2, 0 into the IRI, never a
space. The resulting IRI still has to be legal, so no <http://example/#a#b>
even if it passes the token rule.

> e.g. given a prefix ':' for the namespace 'http://example.com/', the
> prefixed name ':foo%2Fbar' expands to the IRI
> 'http://example.com/foo%2Fbar' and not 'http://example.com/foo/bar',
> correct?

Correct (in SPARQL)
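
A minimal sketch of that expansion, in case it helps. The helper name
expand_pname, the dict of prefixes, and the 'xyz' namespace are assumptions
for illustration, not anything from the Turtle or SPARQL documents. It only
shows the point above: the local part is appended as written and %XX is
never decoded.

```python
def expand_pname(pname, prefixes):
    """Expand a prefixed name by plain concatenation (hypothetical helper)."""
    # Split on the first ':' into prefix label and local part.
    prefix, _, local = pname.partition(":")
    # %XX sequences in the local part are copied through untouched.
    return prefixes[prefix] + local


# Assumed prefix map; the empty label stands for the ':' prefix.
prefixes = {"": "http://example.com/", "xyz": "http://example/"}

print(expand_pname(":foo%2Fbar", prefixes))   # http://example.com/foo%2Fbar
print(expand_pname("xyz:ab%20de", prefixes))  # http://example/ab%20de
```

Checking that the result is a legal IRI would be a separate step and is not
attempted here.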

It's made complicated by RFC 3986 sec 2.3 (unreserved characters):
"""
URIs that differ in the replacement of an unreserved character with
    its corresponding percent-encoded US-ASCII octet are equivalent: they
    identify the same resource.
"""

"""
For consistency, percent-encoded octets in the ranges of ALPHA
    (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E),
    underscore (%5F), or tilde (%7E) should not be created by URI
    producers and, when found in a URI, should be decoded to their
    corresponding unreserved characters by URI normalizers.
"""

so that decoding is not Turtle's job, but a URI normalizer somewhere else
may do it.
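
For what it's worth, the normalization the RFC text describes could be
sketched like this. The function name normalize_unreserved is made up for
the example, and it covers only the unreserved-character rule quoted above,
nothing else from RFC 3986.

```python
import re

# ALPHA, DIGIT, hyphen, period, underscore, tilde (RFC 3986 sec 2.3).
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def normalize_unreserved(uri):
    """Decode %XX only where XX encodes an unreserved character."""
    def repl(match):
        ch = chr(int(match.group(1), 16))
        # Leave every other percent-encoded octet exactly as it was.
        return ch if ch in UNRESERVED else match.group(0)

    return re.sub(r"%([0-9A-Fa-f]{2})", repl, uri)


print(normalize_unreserved("http://example.com/foo%2Fbar%7Ebaz%41"))
# http://example.com/foo%2Fbar~bazA
```

Note that %2F stays encoded because "/" is reserved, so the two IRIs in the
question remain different even after this step.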


> 2. Is the use of "\\" at the start of a character escape sequence a
> formatting bug in the document? Character escapes use a single
> backslash, right?

Looks like copy-over from yacker. Eric!!!!!!
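
As an illustration of the single-backslash reading, here is a rough regex
for rule [61]. The character list follows the rule as quoted above, reading
the garbled entry after "!" as "$", which is an assumption; the name
PN_LOCAL_ESC for the Python variable is just borrowed from the grammar.

```python
import re

# One literal backslash followed by one of the listed characters.
# In a raw string, \\ is the regex for a single backslash.
PN_LOCAL_ESC = re.compile(r"\\[_~.\-!$&'()*+,;=:/?#@%]")

single = "\\~"    # two characters: backslash, tilde
double = "\\\\~"  # three characters: backslash, backslash, tilde

print(PN_LOCAL_ESC.fullmatch(single) is not None)  # True
print(PN_LOCAL_ESC.fullmatch(double) is not None)  # False
```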

> On an unrelated note, section 5.1 starts off with the sentence "Parsing
> Turtle requires a state of four items:" but the ensuing list has five items.

Open world assumption? :-)

>
> -Alex
>

	Andy
Received on Thursday, 2 February 2012 22:27:41 GMT