- From: Graham Klyne <Graham.Klyne@Baltimore.com>
- Date: Wed, 25 Jul 2001 18:47:30 +0100
- To: Dave Beckett <dave.beckett@bristol.ac.uk>
- Cc: w3c-rdfcore-wg@w3.org
At 12:22 PM 7/25/01 +0100, Dave Beckett wrote:
>One way to escape a character.  Check.
>
>  [[* Explicit end delimiters MUST be provided. Escapes such as
>    \uABCD where the end delimiter is a space or any character other
>    than [01-9A-F] SHOULD be avoided: it is not clear visually, and it
>    can cause an editor to insert spurious line-breaks when
>    word-wrapping on spaces. Forms like SPREAD's &UABCD; [SPREAD] or
>    XML's &#xhhhh;, where the escape is explicitly terminated by a
>    semicolon, are much better. Escaped characters SHOULD be
>    acceptable wherever unescaped characters are. In particular, they
>    SHOULD be acceptable in identifiers and comments.
>  ]]
>
>Oh dear; the python style things \uABCD are mentioned as should be
>avoided.  This is only a recommendation though.
>
>So I propose we provide one way to escape:
>   '\u' [A-Fa-f0-9]{1,8} ';'
>which generates the appropriate Unicode code point from 1-8 hex digits.

Which falls foul of another rule, i.e. inventing a new escaping
mechanism.  (I assert that adding the ';' terminator changes the escape
mechanism.)

It is not clear to me that a fixed-length form like \uxxxx or
\Uxxxxxxxx actually breaks the rule given above: it depends on one's
interpretation of "delimiter": counting is a well-established way of
delimiting values in some kinds of structure.

Just an observation: I don't really mind which way we go (but also note
that a compelling reason that N-triples is as it is is that existing N3
processors can read it).

#g

------------------------------------------------------------
Graham Klyne                    Baltimore Technologies
Strategic Research              Content Security Group
<Graham.Klyne@Baltimore.com>    <http://www.mimesweeper.com>
                                <http://www.baltimore.com>
------------------------------------------------------------
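
For comparison, the two escape styles discussed above can be sketched in a few lines of Python. This is only an illustration, not anything from N-Triples or N3 tooling: the function names `decode_terminated` and `decode_fixed` are made up here, and the sketch assumes that backslashes are not themselves escaped in the input.

```python
import re

# Proposed form from the quoted text: '\u', 1-8 hex digits, then ';'.
TERMINATED = re.compile(r"\\u([0-9A-Fa-f]{1,8});")

# Fixed-length forms: '\u' plus exactly 4 hex digits, or '\U' plus exactly 8.
# Here the digit count, not a terminator character, marks the end.
FIXED = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_terminated(text: str) -> str:
    # Hypothetical helper: expand semicolon-terminated escapes.
    # chr() raises ValueError for values above 0x10FFFF.
    return TERMINATED.sub(lambda m: chr(int(m.group(1), 16)), text)


def decode_fixed(text: str) -> str:
    # Hypothetical helper: expand escapes delimited by counting digits.
    return FIXED.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


if __name__ == "__main__":
    print(decode_terminated(r"caf\uE9;1"))   # ';' ends the escape
    print(decode_fixed(r"caf\u00E91"))       # the fourth digit ends the escape
```

In both calls the trailing "1" is left as literal text: in the first because the semicolon closes the escape, in the second because exactly four digits have been consumed.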
Received on Wednesday, 25 July 2001 14:20:55 UTC