- From: Gregg Kellogg <gregg@greggkellogg.net>
- Date: Sat, 3 Nov 2012 19:33:56 -0400
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- CC: RDF WG <public-rdf-wg@w3.org>
The following tests in the Turtle Syntax Tests look for a parser error, but I think they're actually correct syntax:
syn-bad-uri-02 [1]
# Bad IRI : bad escape
<http://example/\u0020> <http://example/p> <http://example/o> .
syn-bad-uri-05 [2]
# Bad IRI : hex 3C is <
<http://example/\u003C> <http://example/p> <http://example/o> .
syn-bad-uri-06 [3]
# Bad IRI : hex 3E is >
<http://example/\u003E> <http://example/p> <http://example/o> .
The Turtle grammar allows any Unicode escape (UCHAR) to appear in an IRI, and it does not restrict escapes that denote characters which would be illegal if written unescaped.
[19] IRIREF ::= '<' ([^#x00-#x20<>\"{}|^`\] | UCHAR)* '>'
[27] UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
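To make the reading of these two productions concrete, here is a minimal sketch in Python that transcribes them into a regular expression and applies it to the IRIs from the three tests. The names HEX, UCHAR, IRI_CHAR, IRIREF and is_iriref are my own for illustration and are not taken from any parser. It checks only the token grammar quoted above; it does not unescape the IRI or validate the result.

```python
import re

# [27] UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
HEX = "[0-9A-Fa-f]"
UCHAR = r"\\u" + HEX * 4 + r"|\\U" + HEX * 8

# Character class from production [19]: anything except control characters,
# space, and the listed punctuation (including the backslash itself).
IRI_CHAR = r'[^\x00-\x20<>"{}|^`\\]'

# [19] IRIREF ::= '<' (IRI_CHAR | UCHAR)* '>'
IRIREF = re.compile("<(?:" + IRI_CHAR + "|" + UCHAR + ")*>")


def is_iriref(token: str) -> bool:
    """Return True if the whole token matches the IRIREF production."""
    return IRIREF.fullmatch(token) is not None


if __name__ == "__main__":
    # Raw strings keep the backslash escapes exactly as they appear in the tests.
    for token in (r"<http://example/\u0020>",
                  r"<http://example/\u003C>",
                  r"<http://example/\u003E>",
                  "<http://example/ >"):
        print(token, is_iriref(token))
```

Under this transcription the three escaped forms match IRIREF, while the same characters written literally (a space, `<`, `>`) do not.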
I think these should be good syntax tests. If that is the case, my processor now passes all of the RIOT Turtle and TurtleSubm tests except the following:
test-29.ttl [4] includes illegal characters in IRIs: ", {, |, and }
Tests 14-16 either take too long to run to be useful or put too much stress on my implementation. I would be happy if they were excluded.
Gregg Kellogg
gregg@greggkellogg.net
[1] http://svn.apache.org/repos/asf/jena/Experimental/riot-reader/testing/RIOT/Lang/Turtle/syn-bad-uri-02.ttl
[2] http://svn.apache.org/repos/asf/jena/Experimental/riot-reader/testing/RIOT/Lang/Turtle/syn-bad-uri-05.ttl
[3] http://svn.apache.org/repos/asf/jena/Experimental/riot-reader/testing/RIOT/Lang/Turtle/syn-bad-uri-06.ttl
[4] http://svn.apache.org/repos/asf/jena/Experimental/riot-reader/testing/RIOT/Lang/TurtleSubm/test-29.ttl
Received on Saturday, 3 November 2012 23:34:39 UTC