Re: I18N-ISSUE-187: escape syntax [TURTLE]

On 2012/09/08 2:17, Richard Cyganiak wrote:
> On 7 Sep 2012, at 17:37, Gavin Carothers wrote:
>>>> It's not clear why the \U form should take eight hex digits when the
>>>> first two are required to be 0.
>>>
>>> Because C++ did it and everybody follows them.  It's better if all languages
>>> have the same representation of strings, even if it's not a very good one.

Well, if it were *all* languages, I'd have to agree. But there are way 
too many programming languages for them to all agree on this :-(.
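
For readers who want the two fixed-width forms made concrete, here is a
minimal sketch in Python of decoding \uXXXX and \UXXXXXXXX. The function
name decode_fixed_escapes is made up for this illustration and is not taken
from any Turtle parser. It also ignores escaped backslashes, which a real
lexer would have to handle. The range check shows why the first two digits
of the \U form always end up as 00: nothing above 10FFFF is a code point.

```python
import re

# Matches \uXXXX (exactly four hex digits) or \UXXXXXXXX (exactly eight).
_FIXED_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_fixed_escapes(text):
    # Hypothetical helper: replaces each escape with the character it names.
    def replace(match):
        codepoint = int(match.group(1) or match.group(2), 16)
        # Unicode ends at 10FFFF, so larger eight-digit values are rejected.
        if codepoint > 0x10FFFF:
            raise ValueError("not a Unicode code point: %X" % codepoint)
        return chr(codepoint)

    return _FIXED_ESCAPE.sub(replace, text)


if __name__ == "__main__":
    print(decode_fixed_escapes(r"caf\u00E9 \U0001F600"))
```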

>> Turtle's is inherited from Python, but I believe Python's is from C++
>
> \uXXXX and \UXXXXXXXX are also in ISO C AFAIK.
>
> I like the \u{X} form (where X may be 1-6 hex digits) that seems to be under consideration for ECMAScript. I believe Ruby does this too.

Yes for Ruby. Indeed, Ruby is where this form originated. I was in the 
room when Matz (Ruby's creator) was working it out on a whiteboard; I 
can figure out the exact date if you need :-).

I had stimulated Matz's thoughts on the morning of the same day with a 
lesser version based on metaprogramming (see 
http://rubyforge.org/projects/charesc/), but the syntactic elegance of 
the \u{X} form is all his.

Actually, it allows several Unicode codepoints inside the {}, separated 
by spaces. E.g., \u{BC 378 ABCD 10FFFF}. A single codepoint can also be 
written without {} if you make sure there are exactly four hex digits 
(i.e., \uABCD).
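
To illustrate the two shapes just described, here is a rough sketch in
Python (not Ruby's actual lexer) that decodes the braced form with one or
more space-separated codepoints, plus the bare four-digit form. The name
decode_braced_escapes is hypothetical, and the assumption that codepoints
are separated by one or more plain spaces is mine; Ruby's exact whitespace
rules may differ.

```python
import re

# Either \u{...} holding space-separated code points of 1-6 hex digits each,
# or a bare \uXXXX with exactly four hex digits.
_BRACED_ESCAPE = re.compile(
    r"\\u\{([0-9A-Fa-f]{1,6}(?: +[0-9A-Fa-f]{1,6})*)\}"
    r"|\\u([0-9A-Fa-f]{4})"
)


def decode_braced_escapes(text):
    # Hypothetical helper: expands each escape into the characters it names.
    def replace(match):
        if match.group(1) is not None:
            digit_groups = match.group(1).split()
        else:
            digit_groups = [match.group(2)]
        chars = []
        for digits in digit_groups:
            codepoint = int(digits, 16)
            # Six hex digits can exceed 10FFFF, so check the upper bound.
            if codepoint > 0x10FFFF:
                raise ValueError("not a Unicode code point: %X" % codepoint)
            chars.append(chr(codepoint))
        return "".join(chars)

    return _BRACED_ESCAPE.sub(replace, text)


if __name__ == "__main__":
    decoded = decode_braced_escapes(r"\u{BC 378 ABCD 10FFFF}")
    print([hex(ord(c)) for c in decoded])
```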

Anyway, while I'm obviously very fond of this syntax, I don't think it 
makes any sense to change the well-established escaping syntax in Turtle 
at this point.

Regards,    Martin.


> But I feel that Turtle should not add anything new here unless it gets into SPARQL too.
>
> I feel that the \uxxxx and \UXXXXXXXX forms cannot be removed at this point due to existing implementations and deployed data. Both forms have been in N-Triples since 2004. N-Triples is defined in a W3C Recommendation [1], and Turtle is designed as a superset of N-Triples.
>
> Best,
> Richard
>
> [1] http://www.w3.org/TR/rdf-testcases/#ntrip_strings
>

Received on Saturday, 8 September 2012 04:50:20 UTC