- From: Dave Beckett <dave.beckett@bristol.ac.uk>
- Date: Fri, 05 Aug 2005 11:37:20 +0100
- To: Andy Seaborne <andy.seaborne@hp.com>, Steve Harris <S.W.Harris@ecs.soton.ac.uk>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
On Fri, 2005-08-05 at 10:03 +0100, Steve Harris wrote:
> On Thu, Aug 04, 2005 at 04:56:17 +0100, Andy Seaborne wrote:
> > >so charmod says this should work:
> > >
> > > SELECT ?foo\x0045bar WHERE { ?foo\x0045bar dc:title ?xyz }.
> >
> > We can do one of:
> > 1/ Extend \u and \U to apply to variables
> > 2/ Make it so \u is on input processing (before tokenizing)
> > so it works everywhere including comments and is transparent to
> > parsing proper
> >
> > 1/ is probable clearer 2/ is probably easier to implement

2/ is way harder to implement for me. The lexer and parser I've been
using do not work on unicode code points so adding an extra layer in
there will substantially complicate things. I will likely ignore 2/
for some time if it was chosen and mark any tests for it as
will-not-pass.

> > A test is "SELEC\u0054"
> >
> > Another is the comment
> >
> > # \u escapes need thinking about.
> >
> > which is illegal by 2 but legal by 1.
>
> Unless comment processing is also done on input (ala the C pre-processor)
> I dont know if I like the idea of
> /* comment \u002A/
> being a valid comment

Since comments have no in-language interpretation, they can include
any printable byte that matches the grammar. If utf8 items or \uxxx
are in comments, software doesn't care.

Dave
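
For illustration only, here is a rough Python sketch of how the two options
in the thread differ on the quoted test cases. Nothing in it comes from any
actual SPARQL implementation: the names decode_escapes, option1_tokens and
option2_preprocess are hypothetical, the "lexer" is a whitespace split with
a naive '#' comment strip, and a malformed \u is assumed to be an error.

```python
import re

# Well-formed \uXXXX or \UXXXXXXXX, or (third group) a bare \u / \U that is
# not followed by the required number of hex digits.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|([uU]))")


def decode_escapes(text):
    # Replace each escape with the code point it names.
    # Assumption: a malformed escape is rejected rather than passed through.
    def _replace(match):
        if match.group(3) is not None:
            raise ValueError("malformed escape at offset %d" % match.start())
        return chr(int(match.group(1) or match.group(2), 16))

    return _ESCAPE.sub(_replace, text)


def option1_tokens(line):
    # Option 1/: escapes are handled per token, here only in variables.
    # The comment is dropped first, so its contents are never examined.
    # (Naive: a '#' inside an IRI or string literal would confuse this.)
    code = line.split("#", 1)[0]
    return [decode_escapes(t) if t.startswith("?") else t for t in code.split()]


def option2_preprocess(query):
    # Option 2/: escapes are decoded over the whole input before tokenizing,
    # so they apply to keywords and comments as well.
    return decode_escapes(query)


if __name__ == "__main__":
    comment = r"# \u escapes need thinking about."

    # Under 2/, the keyword test becomes an ordinary SELECT.
    print(option2_preprocess(r"SELEC\u0054"))  # SELECT

    # Under 2/, the comment is rejected because "\u " is not a valid escape.
    try:
        option2_preprocess(comment)
    except ValueError as err:
        print("option 2 rejects the comment:", err)

    # Under 1/, the same comment is skipped and the variable is decoded.
    print(option1_tokens(r"SELECT ?foo\u0045bar " + comment))
```

With decoding done on input, the C-style example from the thread,
/* comment \u002A/, would turn into a closed comment before the lexer ever
sees it, which is the behaviour being questioned above.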
Received on Friday, 5 August 2005 10:37:40 UTC