- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Sat, 08 Sep 2012 13:49:42 +0900
- To: Richard Cyganiak <richard@cyganiak.de>
- CC: Gavin Carothers <gavin@carothers.name>, www-international@w3.org, Internationalization Core Working Group Issue Tracker <sysbot+tracker@w3.org>, public-rdf-comments Comments <public-rdf-comments@w3.org>
On 2012/09/08 2:17, Richard Cyganiak wrote:
> On 7 Sep 2012, at 17:37, Gavin Carothers wrote:
>>>> It's not clear why the \U form should take eight hex digits when the
>>>> first two are required to be 0.
>>>
>>> Because C++ did it and everybody follows them. It's better if all languages
>>> have the same representation of strings, even if it's not a very good one.

Well, if it were *all* languages, I'd have to agree. But there are way too many programming languages for them to all agree on this :-(.

>> Turtle's is inherited from Python, but I believe Python's is from C++
>
> \uXXXX and \UXXXXXXXX are also in ISO C AFAIK.
>
> I like the \u{X} form (where X may be 1-6 hex digits) that seems to be under consideration for ECMAScript. I believe Ruby does this too.

Yes for Ruby. Indeed, Ruby is where this form originated. I was in the room when Matz (Ruby's creator) was working it out on a whiteboard; I can figure out the exact date if you need :-). I had stimulated Matz's thoughts in the morning of the same day with a lesser version based on metaprogramming (see http://rubyforge.org/projects/charesc/), but the syntactic elegance of the \u{X} form is all his.

Actually, it allows several Unicode codepoints inside the {}, separated by spaces. E.g., \u{BC 378 ABCD 10FFFF}. A single codepoint can also be written without {} if you make sure there are exactly four hex digits (i.e., \uABCD).

Anyway, while I'm obviously very fond of this syntax, I don't think it makes any sense to change the well-established escaping syntax in TURTLE at this point.

Regards, Martin.

> But I feel that Turtle should not add anything new here unless it gets into SPARQL too.
>
> I feel that the \uxxxx and \UXXXXXXXX forms cannot be removed at this point due to existing implementations and deployed data. Both forms have been in N-Triples since 2004. N-Triples is defined in a W3C Recommendation [1], and Turtle is designed as a superset of N-Triples.
>
> Best,
> Richard
>
> [1] http://www.w3.org/TR/rdf-testcases/#ntrip_strings
>
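
To make the three escape forms discussed in this thread concrete, here is a minimal Python sketch that decodes \uXXXX, \UXXXXXXXX, and the braced \u{X ...} form with space-separated codepoints. It is an illustration only, not the parser of Ruby, Turtle, or N-Triples; the function name `decode_escapes` and the regular expression are assumptions made up for this example.

```python
import re

# Hypothetical pattern (an assumption for illustration, not taken from any spec):
#   group 1: \u{X ...}   one or more space-separated codepoints, 1-6 hex digits each
#   group 2: \UXXXXXXXX  exactly eight hex digits
#   group 3: \uXXXX      exactly four hex digits
_ESCAPE = re.compile(
    r"\\u\{([0-9A-Fa-f]{1,6}(?: +[0-9A-Fa-f]{1,6})*)\}"
    r"|\\U([0-9A-Fa-f]{8})"
    r"|\\u([0-9A-Fa-f]{4})"
)


def decode_escapes(text: str) -> str:
    """Replace the three escape forms in `text` with the characters they name."""

    def replace(match: re.Match) -> str:
        braced, long_form, short_form = match.groups()
        if braced is not None:
            # Braced form: each space-separated hex number is one codepoint.
            return "".join(chr(int(h, 16)) for h in braced.split())
        # Fixed-width forms: a single codepoint of four or eight hex digits.
        return chr(int(long_form or short_form, 16))

    return _ESCAPE.sub(replace, text)


if __name__ == "__main__":
    sample = r"\u{BC 378 ABCD 10FFFF}"
    print([hex(ord(c)) for c in decode_escapes(sample)])
```

Values above 10FFFF make `chr` raise `ValueError` in this sketch; a real parser would need to decide how to report such input.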
Received on Saturday, 8 September 2012 04:50:20 UTC