A question on Test 29 of Turtle Tests

Hi,

I'm trying to understand how the following file, which is part of the
Turtle compliance tests, is a valid N-Triples file:
http://www.w3.org/TeamSubmission/turtle/tests/test-29.out

The question is about the object in the triple starting with
"<scheme:\u0001", which, after N-Triples unescaping, is the URI reference
"scheme:" followed by the Unicode character U+0001.  That seems to be
an invalid URI.

It confused me because the manifest actually says "Escaping U+0001 to
U+007F in a URI".  It would seem simpler for this example to use a
literal rather than a URI.

The way to parse these files seems to be to perform N-Triples
unescaping and then parse the resulting string as an (absolute) URI.
That's how I get an invalid URI, so I must be doing something simple
wrong.
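
To make that concrete, here is a minimal Python sketch of the two steps as I understand them. The helper names are my own, and it only handles the \uXXXX and \UXXXXXXXX escapes (not \t, \n, \\ and so on), so treat it as an illustration rather than a real N-Triples parser.

```python
import re

# Matches the N-Triples numeric escapes \uXXXX and \UXXXXXXXX.
# Simplification: other escapes such as \t, \n and \\ are not handled here.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def unescape_ntriples(text):
    """Replace each numeric escape with the character it denotes."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


def has_control_chars(uri):
    """True if the string contains a C0 control character or DEL."""
    return any(ord(c) < 0x20 or ord(c) == 0x7F for c in uri)


# The raw string keeps the backslash, as it would appear in the .out file.
decoded = unescape_ntriples(r"scheme:\u0001")
print(repr(decoded))
print(has_control_chars(decoded))
```

After unescaping, the string contains a raw U+0001, which is the character that makes it look invalid to a URI parser.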

Maybe a solution is to go straight from N-Triples escaping to URI
percent-encoding (\u0001 -> %01)?  The exceptions, of course, would be
the characters that are "ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39),
hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E)" (from the
RFC).
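
A rough Python sketch of that idea follows. The function name is hypothetical, and the mapping is only my guess at a workaround, not something I have found in the Turtle or N-Triples documents. It relies on urllib.parse.quote, which leaves letters, digits, "-", ".", "_" and "~" alone (tilde since Python 3.7) and percent-encodes everything else as UTF-8 bytes.

```python
import re
from urllib.parse import quote

# Matches the N-Triples numeric escapes \uXXXX and \UXXXXXXXX.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def escape_to_percent(text):
    """Map each numeric escape directly to URI percent-encoding.

    Unreserved characters (ALPHA, DIGIT, "-", ".", "_", "~") are emitted
    literally; every other escaped character is percent-encoded.
    """
    def repl(match):
        ch = chr(int(match.group(1) or match.group(2), 16))
        # safe="" means only the unreserved set is left unencoded.
        return quote(ch, safe="")

    return _ESCAPE.sub(repl, text)


print(escape_to_percent(r"scheme:\u0001"))
print(escape_to_percent(r"scheme:\u0041\u007E"))
```

Note that this also percent-encodes escaped reserved characters such as "/" (\u002F), which may or may not be what is wanted.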

-Andrew

Received on Sunday, 3 October 2010 22:39:44 UTC