- From: Graham Klyne <Graham.Klyne@Baltimore.com>
- Date: Wed, 25 Jul 2001 18:47:30 +0100
- To: Dave Beckett <dave.beckett@bristol.ac.uk>
- Cc: w3c-rdfcore-wg@w3.org
At 12:22 PM 7/25/01 +0100, Dave Beckett wrote:
>One way to escape a character. Check.
>
> [[* Explicit end delimiters MUST be provided. Escapes such as
> \uABCD where the end delimiter is a space or any character other
> than [01-9A-F] SHOULD be avoided: it is not clear visually, and it
> can cause an editor to insert spurious line-breaks when
> word-wrapping on spaces. Forms like SPREAD's &UABCD; [SPREAD] or
> XML's &#xhhhh;, where the escape is explicitly terminated by a
> semicolon, are much better. Escaped characters SHOULD be
> acceptable wherever unescaped characters are. In particular, they
> SHOULD be acceptable in identifiers and comments.
> ]]
>
>Oh dear; the python style things \uABCD are mentioned as should be
>avoided. This is only a recommendation though.
>
>So I propose we provide one way to escape:
> '\u' [A-Fa-f0-9]{1,8} ';'
>which generates the appropriate Unicode code point from 1-8 hex digits.
Which falls foul of another rule, i.e. inventing a new escaping
mechanism. (I assert that adding the ';' terminator changes the escape
mechanism.)
It is not clear to me that a fixed-length form like \uxxxx or \Uxxxxxxxx
actually breaks the rule given above: it depends on one's interpretation
of "delimiter": counting is a well-established way of delimiting values in
some kinds of structure.
Just an observation: I don't really mind which way we go (but also note
that a compelling reason that N-triples is as it is is that existing N3
processors can read it).
#g
------------------------------------------------------------
Graham Klyne Baltimore Technologies
Strategic Research Content Security Group
<Graham.Klyne@Baltimore.com> <http://www.mimesweeper.com>
<http://www.baltimore.com>
------------------------------------------------------------
Received on Wednesday, 25 July 2001 14:20:55 UTC