- From: Wouter Beek <w.g.j.beek@vu.nl>
- Date: Wed, 28 Jun 2017 17:14:33 +0200
- To: SW-forum Web <semantic-web@w3.org>
Hi Semantic Web community,
Is [1] a valid N-Triples 1.1 statement (notice there are no spaces
between the terms)? I can find evidence for "yes" as well as
evidence for "no". I think "yes" is correct here, but the
specification can mislead a casual reader into believing "no",
several state-of-the-art tools currently implement "no", and the
test case is ambiguous (see below).
[1] <x:y><x:y><x:y>.
No, [1] is not valid:
* The N-Triples 1.1 specification says that "The simplest triple
statement is a sequence of (subject, predicate, object) terms,
separated by whitespace and terminated by '.' after each triple."
Under a certain interpretation of the word 'simplest' this means
that whitespace is required, because if a statement S that is
expressed by a triple T can also be expressed by a triple T' which
only differs from T in that it contains no whitespace, it follows
that T is not the simplest triple expressing S.
* The N-Triples specification says of canonical N-Triples that "The
whitespace following subject, predicate, and object must be a
single space, (U+0020)." This implies that whitespace indeed
follows the subject, predicate, and object terms ("the whitespace"
does not refer to the empty string).
* rdflib 4.2.2 gives an error when parsing [1].
* SWI-Prolog Semweb library 7.5.10 gives an error when parsing [1].
* There is a test case called `#minimal_whitespace' which has value
`rdft:approval rdft:Proposed', and according to the test suite
vocabulary this means that the test is "proposed but not
approved", so minimal whitespace is a proposal that is not yet
part of N-Triples 1.1.
* It should always be possible to easily split an N-Triples statement
based on whitespace characters.
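To make the last point concrete, here is a minimal Python sketch of
what naive whitespace splitting does with [1] compared with a spaced
version of the same triple. The function name split_on_whitespace is
made up for illustration and is not taken from any of the tools
mentioned here.

```python
def split_on_whitespace(line):
    """Naively split a statement into tokens on runs of spaces and tabs."""
    return line.split()

spaced = "<x:y> <x:y> <x:y> ."   # same triple as [1], with spaces
compact = "<x:y><x:y><x:y>."     # statement [1]

spaced_terms = split_on_whitespace(spaced)
compact_terms = split_on_whitespace(compact)

print(spaced_terms)    # four tokens: three terms and the final '.'
print(compact_terms)   # one token: the whole statement
```

A consumer that relies on this kind of splitting gets four tokens for
the spaced form but only a single token for [1].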
Yes, [1] is valid:
* The N-Triples 1.1 specification clearly states that "triples are a
sequence of RDF terms representing the subject, predicate and
object of an RDF Triple. These may be separated by white space
(spaces U+0020 or tabs U+0009)."
* Serd 0.26.0 correctly parses [1].
* Jena 3.0 correctly parses [1].
* Raptor 2.0.15 correctly parses [1].
* There is a test case called `#minimal_whitespace' which is of type
`rdft:TestNTriplesPositiveSyntax'.
* The N-Triples 1.1 specification says that "White space (tab U+0009
or space U+0020) is used to separate two terminals which would
otherwise be (mis-)recognized as one terminal." This may imply
that whitespace is not required for separating non-terminals?
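The "yes" reading can be sketched in Python as follows. This is only
an illustration under stated assumptions: it handles triples made of
three IRIs, it uses a simplified IRIREF pattern (the real grammar
excludes more characters and allows escapes), and it assumes that
spaces and tabs between terms are optional. The names IRIREF, WS,
TRIPLE and parse_iri_triple are my own.

```python
import re

# Assumption: simplified IRIREF, i.e. '<', then any characters other
# than '<', '>' or whitespace, then '>'.
IRIREF = r"<[^<>\s]*>"
# Assumption ("yes" reading): zero or more spaces/tabs between terms.
WS = r"[ \t]*"
TRIPLE = re.compile(
    rf"{WS}({IRIREF}){WS}({IRIREF}){WS}({IRIREF}){WS}\.{WS}"
)

def parse_iri_triple(line):
    """Return (subject, predicate, object) or None if the line does not match."""
    match = TRIPLE.fullmatch(line)
    return match.groups() if match else None

compact_result = parse_iri_triple("<x:y><x:y><x:y>.")       # statement [1]
spaced_result = parse_iri_triple("<x:y> <x:y>\t<x:y> .")
broken_result = parse_iri_triple("<x:y><x:y>.")              # only two terms
```

Because the angle brackets delimit each IRI, the three terms in [1]
are recovered without any whitespace; under the "no" reading one
would replace the `*` in WS by `+` between the terms.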
What can we do to clear up the situation regarding the use of
whitespace in N-Triples? I can file bugs for rdflib and SWI-Prolog
semweb. Can someone improve the specification and/or test case?
Or... am I wrong, and is "no" the correct answer after all?
---
Best regards,
Wouter Beek.
Received on Wednesday, 28 June 2017 15:15:50 UTC