- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Thu, 04 Aug 2005 16:56:17 +0100
- To: Dan Connolly <connolly@w3.org>
- CC: Dave Beckett <dave.beckett@bristol.ac.uk>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Dan Connolly wrote:
> On Thu, 2005-08-04 at 14:36 +0100, Seaborne, Andy wrote:
> [...]
>
>>>Turtle follows N-Triples and picks just uppercase for hex \u & \U
>>>escapes (I think there was something in the older charmod drafts about
>>>having just one way to encode it). I'd prefer to follow that [0-9]
>>>[A-F].
>>
>>Can't find anything in current charmod.
>
>
> There are several requirements/guidelines near
> http://www.w3.org/TR/2005/REC-charmod-20050215/#def-char-escape
>
> e.g.
>
>
> C042 [S] Specifications SHOULD NOT invent a new escaping mechanism if
> an appropriate one already exists.

\u is reasonably common (N-TRIPLES, Java, Python). When in an XML protocol
request &...; applies anyway.

>
> and here's one that might bite a little harder:
>
> C046 ... In particular, if a character is acceptable in identifiers and
> comments, then its escaped form should also be acceptable.
>
> so charmod says this should work:
>
> SELECT ?foo\x0045bar WHERE { ?foo\x0045bar dc:title ?xyz }.

We can do one of:

1/ Extend \u and \U to apply to variables
2/ Make it so \u is on input processing (before tokenizing) so it works
everywhere including comments and is transparent to parsing proper

1/ is probably clearer
2/ is probably easier to implement

A test is "SELEC\u0054"

Another is the comment

# \u escapes need thinking about.

which is illegal by 2 but legal by 1.

C046 also says "this does not preclude that syntax-significant characters, when
escaped, lose their significance in the syntax." so I think we don't have to do
it for keywords or things like "?" but 2) would allow it.

For \t etc it only applies in strings. I guess that comments are undefined so
\t is ambiguous as to whether it is "\" and "t" or a tab.

	Andy

>
> This is starting to look like a new issue... I was thinking of
> saying this is re-opening punctuationSyntax, but I think it's different.
>
>
>>Unless there is a single convention that will not catch people out, I prefer to
>>leave both in - it's not clear to me that there is a convention (I write mine in
>>upper case.)
>
>
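To make option 2/ concrete, here is a minimal sketch in Python of \u and \U handling as a pass over the raw query text before tokenizing. It is only an illustration under stated assumptions, not anything from a SPARQL draft: the function name unescape_query is hypothetical, both upper- and lower-case hex digits are accepted (the open question in this thread), and every \u or \U is treated as the start of an escape with no special handling of a preceding backslash. It uses \u0045 where the example quoted above writes \x0045.

```python
import re

# Assumption: \uXXXX takes 4 hex digits and \UXXXXXXXX takes 8, either case.
# The last alternative catches a \u or \U that is not followed by enough
# hex digits, so it can be reported as an error.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})|\\[uU]")


def unescape_query(text: str) -> str:
    """Replace \\u and \\U escapes everywhere in the query text.

    Runs before tokenizing, so keywords, variables, strings and comments
    are all treated alike (hypothetical helper for option 2/).
    """

    def decode(match: re.Match) -> str:
        digits = match.group(1) or match.group(2)
        if digits is None:
            # Malformed escape: under option 2/ this is illegal even
            # inside a comment.
            raise ValueError(f"malformed escape at offset {match.start()}")
        return chr(int(digits, 16))

    return _ESCAPE.sub(decode, text)


# The keyword test from the message: the escape is resolved before the
# tokenizer ever sees the text.
print(unescape_query("SELEC\\u0054"))

# The comment test: legal under option 1/, rejected under option 2/.
try:
    unescape_query("# \\u escapes need thinking about.")
except ValueError as err:
    print("rejected:", err)
```

Under option 1/ the same decoding would instead live inside the tokenizer rules for strings, IRIs and variable names, so the comment above would pass through untouched and "SELEC\u0054" would not be recognised as a keyword.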
Received on Thursday, 4 August 2005 15:58:14 UTC