- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Thu, 04 Aug 2005 16:56:17 +0100
- To: Dan Connolly <connolly@w3.org>
- CC: Dave Beckett <dave.beckett@bristol.ac.uk>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Dan Connolly wrote:
> On Thu, 2005-08-04 at 14:36 +0100, Seaborne, Andy wrote:
> [...]
>
>>>Turtle follows N-Triples and picks just uppercase for hex \u & \U
>>>escapes (I think there was something in the older charmod drafts about
>>>having just one way to encode it). I'd prefer to follow that [0-9]
>>>[A-F].
>>
>>Can't find anything in current charmod.
>
>
> There are several requirements/guidelines near
> http://www.w3.org/TR/2005/REC-charmod-20050215/#def-char-escape
>
> e.g.
>
>
> C042 [S] Specifications SHOULD NOT invent a new escaping mechanism if
> an appropriate one already exists.
\u is reasonably common (N-Triples, Java, Python).
Within an XML protocol request, &...; applies anyway.
>
> and here's one that might bite a little harder:
>
> C046 ... In particular, if a character is acceptable in identifiers and
> comments, then its escaped form should also be acceptable.
>
> so charmod says this should work:
>
> SELECT ?foo\x0045bar WHERE { ?foo\x0045bar dc:title ?xyz }.
We can do one of:
1/ Extend \u and \U to apply to variables
2/ Make \u part of input processing (before tokenizing)
so it works everywhere, including comments, and is transparent to
parsing proper
1/ is probably clearer; 2/ is probably easier to implement.
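One test is "SELEC\u0054": by 2/ it reads as the keyword SELECT, by 1/ it
does not.
Another is the comment
# \u escapes need thinking about.
which is illegal by 2/ (the \u is not followed by hex digits) but legal by 1/.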
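To make 2/ concrete, here is a minimal sketch in Python of decoding \u and \U over the whole input before tokenizing. The function name decode_escapes is made up for illustration, hex digits are accepted in either case (the upper-case-only question is still open), and a doubled backslash before "u" is not treated specially, so this is a sketch under those assumptions rather than a proposed definition.

```python
import re

# Hypothetical helper for option 2/: decode \uXXXX and \UXXXXXXXX over the
# whole query text before any tokenizing happens.
# Assumption: hex digits are accepted in upper or lower case.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")
_ESCAPE_START = re.compile(r"\\[uU]")


def decode_escapes(query: str) -> str:
    """Return query with every \\u and \\U escape replaced by its character.

    Raises ValueError when \\u or \\U is not followed by the right number of
    hex digits, wherever it occurs (comments included).
    Simplification: a doubled backslash before 'u' is not treated specially.
    """
    out = []
    pos = 0
    for start in _ESCAPE_START.finditer(query):
        escape = _ESCAPE.match(query, start.start())
        if escape is None:
            raise ValueError(f"malformed escape at offset {start.start()}")
        out.append(query[pos:start.start()])
        # Exactly one of the two groups matched; both hold hex digits.
        out.append(chr(int(escape.group(1) or escape.group(2), 16)))
        pos = escape.end()
    out.append(query[pos:])
    return "".join(out)
```

With this, the first test decodes to SELECT before the tokenizer ever sees it, and the comment test is rejected because its \u has no hex digits after it.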
C046 also says "this does not preclude that syntax-significant characters, when
escaped, lose their significance in the syntax." so I think we don't have to do
it for keywords or things like "?", but 2/ would allow it.
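For comparison, a rough sketch of 1/ in Python: escapes are decoded only inside variable names, after the token has been recognised, so keywords and comments are left alone. The token classes and the name tokenize are invented for this illustration and are far cruder than the real grammar (anything that is not a comment or a variable is lumped into "other").

```python
import re

# Hypothetical, much-simplified tokenizer for option 1/.
# Assumption: hex digits are accepted in upper or lower case.
_UCHAR = r"\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}"
_TOKEN = re.compile(
    r"(?P<comment>#[^\n]*)"                           # comment runs to end of line
    r"|(?P<var>\?(?:[A-Za-z0-9_]|" + _UCHAR + r")+)"  # ?name, escapes allowed
    r"|(?P<other>\S+)"                                # everything else, untouched
)


def _decode(match):
    # Drop the leading backslash and 'u'/'U', then read the hex digits.
    return chr(int(match.group(0)[2:], 16))


def tokenize(query: str):
    """Return (kind, text) pairs; escapes are decoded in variable names only."""
    tokens = []
    for m in _TOKEN.finditer(query):
        kind, text = m.lastgroup, m.group(0)
        if kind == "var":
            # Strip the "?" (it stays syntax) and decode escapes in the name.
            text = re.sub(_UCHAR, _decode, text[1:])
        tokens.append((kind, text))
    return tokens
```

Here "?foo\u0045bar" becomes the variable fooEbar, "SELEC\u0054" stays an unrecognised word rather than a keyword, and the comment test passes through untouched.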
\t etc. only apply in strings. I guess comments are undefined, so in a comment
\t is ambiguous as to whether it is "\" and "t" or a tab.
Andy
>
> This is starting to look like a new issue... I was thinking of
> saying this is re-opening punctuationSyntax, but I think it's different.
>
>
>>Unless there is a single convention that will not catch people out, I prefer to
>>leave both in - it's not clear to me that there is a convention (I write mine in
>>upper case.)
>
>
Received on Thursday, 4 August 2005 15:58:14 UTC