- From: Andy Seaborne <andy@apache.org>
- Date: Fri, 20 Nov 2015 16:22:51 +0000
- To: public-rdf-comments@w3.org
On 17/11/15 23:29, Rob Stewart wrote:
> Hi,
>
> I have a question about "." being escaped in the
> localName_with_non_leading_extras turtle parser test case. The input
> .ttl file is:
>
> @prefix p: <http://a.example/>.
> p:a·̀ͯ‿.⁀ <http://a.example/p> <http://a.example/o> .
>
> In the expected case in the .nt file, this subject URI is translated to:
>
> <http://a.example/a\u00b7\u0300\u036f\u203f\u002e\u2040>
> <http://a.example/p> <http://a.example/o> .
>
> Why is the "." character escaped to \u002e ?
>
> I would expect the subject URI to be escaped to:
>
> <http://a.example/a\u00b7\u0300\u036f\u203f.\u2040>
>
> The input and expected output test cases are:
>
> http://www.w3.org/2013/TurtleTests/localName_with_non_leading_extras.ttl
> http://www.w3.org/2013/TurtleTests/localName_with_non_leading_extras.nt
>
> This question appears to have been asked before on this list, back in
> December 2013 by David Robillard:
>
> https://lists.w3.org/Archives/Public/public-rdf-comments/2013Dec/0115.html
>
> For this W3C RDF turtle test case, should "." be escaped to \u002e or
> should it not be escaped, as David thought so, and I think I agree.
>
> David's email was:
>
> %%%%%%%%%
> Hello,
>
> Why is the "." escaped as \u002e in
>
> http://www.w3.org/2013/TurtleTests/localName_with_non_leading_extras.nt
>
> My implementation does not escape this character since, even in the old
> NTriples spec,
>
> absoluteURI ::= ( character - ( '<' | '>' | space ) )+
> character ::= [#x20-#x7E] /* US-ASCII space to decimal 127 */
>
> Which includes ".", #x2E. Accordingly, my implementation does not
> escape this character. Should it?
>

My reading is that there is no requirement for it to be escaped, there 
is no requirement to escape any of the characters - N-Triples is defined 
using UTF-8 these days.

See section 4 on the canonical form which says not to use UCHAR.

Or did you mean N-triples as text/plain?

See section 6 where it says characters outside ASCII must be escaped for 
use in text/plain. Using ASCII+UCHAR is not the canonical form.

	Andy
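
To make the two forms concrete, here is a minimal Python sketch of the distinction as I read it. The function name escape_iri_ascii is hypothetical and not from any spec or library. It assumes the input is already a valid IRI, so it does not deal with ASCII characters that the IRIREF grammar disallows; it only replaces characters outside ASCII with UCHAR escapes and leaves "." untouched.

```python
def escape_iri_ascii(iri: str) -> str:
    """Return an ASCII-only spelling of an IRI for text/plain use.

    Assumption: only characters outside ASCII need a UCHAR escape
    (\\uXXXX, or \\UXXXXXXXX above the BMP). ASCII characters,
    including ".", are copied through unchanged.
    """
    out = []
    for ch in iri:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # plain ASCII: no escape needed
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04x}")     # 4-digit UCHAR
        else:
            out.append(f"\\U{cp:08x}")     # 8-digit UCHAR
    return "".join(out)


# Subject IRI from the test case, written with Python escapes.
subject = "http://a.example/a\u00b7\u0300\u036f\u203f.\u2040"

# UTF-8 form: the characters are written as-is, no UCHAR at all.
utf8_form = f"<{subject}>"

# ASCII+UCHAR form for text/plain: only the non-ASCII characters are escaped.
ascii_form = f"<{escape_iri_ascii(subject)}>"

print(ascii_form)
```

Under those assumptions the ASCII form comes out with a literal "." between \u203f and \u2040, which is the spelling Rob and David expected.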
Received on Friday, 20 November 2015 16:23:26 UTC