- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Fri, 5 Aug 2005 15:28:03 -0400
- To: Dave Beckett <dave.beckett@bristol.ac.uk>
- Cc: Andy Seaborne <andy.seaborne@hp.com>, Steve Harris <S.W.Harris@ecs.soton.ac.uk>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
- Message-ID: <20050805192803.GA3162@w3.org>
On Fri, Aug 05, 2005 at 11:37:20AM +0100, Dave Beckett wrote:
>
> On Fri, 2005-08-05 at 10:03 +0100, Steve Harris wrote:
> > On Thu, Aug 04, 2005 at 04:56:17 +0100, Andy Seaborne wrote:
> > > >so charmod says this should work:
> > > >
> > > >   SELECT ?foo\x0045bar WHERE { ?foo\x0045bar dc:title ?xyz }.
> > >
> > > We can do one of:
> > > 1/ Extend \u and \U to apply to variables
> > > 2/ Make it so \u is on input processing (before tokenizing)
> > >    so it works everywhere including comments and is transparent to
> > >    parsing proper
> > >
> > > 1/ is probable clearer 2/ is probably easier to implement
>
> 2/ is way harder to implement for me.  The lexer and parser I've been
> using do not work on unicode code points so adding an extra layer in
> there will substantially complicate things.  I will likely ignore 2/ for
> some time if it was chosen and mark any tests for it as will-not-pass.
>
> > > A test is "SELEC\u0054"
> > >
> > > Another is the comment
> > >
> > > # \u escapes need thinking about.
> > >
> > > which is illegal by 2 but legal by 1.
> >
> > Unless comment processing is also done on input (ala the C pre-processor)
> > I dont know if I like the idea of
> > /* comment \u002A/
> > being a valid comment
>
> Since comments have no in-language interpretation, they can include any
> printable byte that matches the grammar.  If utf8 items or \uxxx are in
> comments, software doesn't care.

I think Steve meant that one of the comment boundaries was formed by
the escape sequence. For example:

  SELECT * # get it all\nWHERE {?x\u0020foo:bar\u0020?y}

(un-escapes to:

  SELECT * # get it all
  WHERE {?x foo:bar ?y}
)

This would be a possible side effect of taking Andy's option 2 (if
escape expansion is done prior to, rather than during, tokenizing).
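To make the ordering issue concrete, here is a rough Python sketch of the two orderings. It is only an illustration, not any implementation's actual lexer: the function names are made up, comment stripping is a naive "# to end of line" rule that ignores string literals, and the query uses \u000A as the escaped line break (my substitution, so that the comment boundary really is formed by a \u escape).

```python
import re

# Matches \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def expand_escapes(text):
    """Replace each \\uXXXX or \\UXXXXXXXX escape with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


def strip_comments(text):
    """Naively drop '#' through end of line (hypothetical; ignores string literals)."""
    return "\n".join(line.split("#", 1)[0].rstrip() for line in text.split("\n"))


def preprocess_then_lex(query):
    # Roughly option 2: expand escapes on the raw input, then recognize comments.
    return strip_comments(expand_escapes(query))


def lex_then_expand(query):
    # Roughly option 1: recognize comments on the raw input, expand escapes afterwards.
    return expand_escapes(strip_comments(query))


# A single physical line; the only line break is the escaped \u000A.
query = "SELECT * # get it all\\u000AWHERE {?x\\u0020foo:bar\\u0020?y}"

print(repr(preprocess_then_lex(query)))
print(repr(lex_then_expand(query)))
```

With expansion first, the escaped line break ends the comment and the WHERE clause survives. With comment recognition first, the whole remainder of the line, WHERE clause included, is treated as comment text. The same helper turns Steve's `/* comment \u002A/` into `/* comment */`.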
-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Friday, 5 August 2005 19:28:06 UTC