
Re: I18N-ISSUE-187: escape syntax [TURTLE]

From: Richard Cyganiak <richard@cyganiak.de>
Date: Fri, 7 Sep 2012 18:17:39 +0100
Cc: Internationalization Core Working Group Issue Tracker <sysbot+tracker@w3.org>, public-rdf-comments Comments <public-rdf-comments@w3.org>
Message-Id: <AF83DA34-2D30-4D89-858C-DC5BDA1CA44D@cyganiak.de>
To: Gavin Carothers <gavin@carothers.name>, www-international@w3.org
On 7 Sep 2012, at 17:37, Gavin Carothers wrote:
>>> It's not clear why the \U form should take eight hex digits when the
>>> first two are required to be 0.
>> 
>> Because C++ did it and everybody follows them.  It's better if all languages
>> have the same representation of strings, even if it's not a very good one.
> 
> Turtle's is inherited from Python, but I believe Python's is from C++

\uXXXX and \UXXXXXXXX are also in ISO C AFAIK.

I like the \u{X} form (where X may be 1-6 hex digits) that seems to be under consideration for ECMAScript. I believe Ruby does this too.
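
As a rough illustration of how the three forms compare, here is a minimal Python sketch of a decoder that accepts \uXXXX, \UXXXXXXXX and the braced \u{X} form side by side. The function name decode_escapes and the regular expression are assumptions made for this example and are not taken from any Turtle, SPARQL or ECMAScript specification. It also ignores escaped backslashes and every other escape sequence.

```python
import re

# Hypothetical pattern covering the three escape forms discussed in this thread.
# The braced alternative is listed first so it is tried before the 4-digit form.
_ESCAPE = re.compile(
    r"\\u\{([0-9A-Fa-f]{1,6})\}"   # \u{X}, 1-6 hex digits
    r"|\\u([0-9A-Fa-f]{4})"        # \uXXXX, exactly 4 hex digits
    r"|\\U([0-9A-Fa-f]{8})"        # \UXXXXXXXX, exactly 8 hex digits
)


def decode_escapes(text):
    """Replace Unicode escapes in text with the characters they denote."""

    def replace(match):
        # Exactly one of the three groups is set for any given match.
        hex_digits = match.group(1) or match.group(2) or match.group(3)
        code_point = int(hex_digits, 16)
        if code_point > 0x10FFFF:
            # Outside the Unicode code space; this is why the first two
            # digits of the 8-digit form always have to be 00.
            raise ValueError("code point out of range: " + hex_digits)
        return chr(code_point)

    return _ESCAPE.sub(replace, text)
```

With this sketch, decode_escapes(r"\U0001F600") and decode_escapes(r"\u{1F600}") yield the same single character, which shows that the braced form carries the same information in fewer digits.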

But I feel that Turtle should not add anything new here unless it gets into SPARQL too.

I feel that the \uXXXX and \UXXXXXXXX forms cannot be removed at this point due to existing implementations and deployed data. Both forms have been in N-Triples since 2004. N-Triples is defined in a W3C Recommendation [1], and Turtle is designed as a superset of N-Triples.
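
For readers unfamiliar with the two existing forms, the following Python sketch shows roughly how a serializer might choose between them. The function name escape_non_ascii is made up for this example, and the code is a simplification: it leaves ASCII untouched and does not handle backslashes, quotes or control characters, which a real N-Triples writer would also need to escape.

```python
def escape_non_ascii(text):
    """Escape non-ASCII characters using the \\uXXXX and \\UXXXXXXXX forms."""
    parts = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 0x80:
            # ASCII passes through unchanged in this simplified sketch.
            parts.append(ch)
        elif code_point <= 0xFFFF:
            # Fits in four hex digits.
            parts.append("\\u%04X" % code_point)
        else:
            # Needs more than four digits, so the fixed eight-digit form
            # is used; the two leading digits are always 00.
            parts.append("\\U%08X" % code_point)
    return "".join(parts)
```

Since Unicode code points end at U+10FFFF, the eight-digit output always begins with \U00, which is the redundancy raised at the top of this thread.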

Best,
Richard

[1] http://www.w3.org/TR/rdf-testcases/#ntrip_strings
Received on Friday, 7 September 2012 17:18:10 GMT
