- From: Wouter Beek <w.g.j.beek@vu.nl>
- Date: Wed, 28 Jun 2017 17:14:33 +0200
- To: SW-forum Web <semantic-web@w3.org>
Hi Semantic Web community,

Is [1] a valid N-Triples 1.1 statement (notice there are no spaces between the terms)? I can find evidence that says "yes" and evidence that says "no". I think "yes" is correct, but the specification document can trick a casual reader into believing "no", several state-of-the-art tools currently implement "no", and the test case is ambiguous (see below).

[1] <x:y><x:y><x:y>.

No, [1] is not valid:

* The N-Triples 1.1 specification says that "The simplest triple statement is a sequence of (subject, predicate, object) terms, separated by whitespace and terminated by '.' after each triple." Under a certain interpretation of the word 'simplest' this means that whitespace is required: if a statement S that is expressed by a triple T can also be expressed by a triple T' which differs from T only in that it contains no whitespace, it follows that T is not the simplest triple expressing S.
* The N-Triples specification defines canonical N-Triples as "The whitespace following subject, predicate, and object must be a single space, (U+0020)." This implies that whitespace does indeed follow the subject, predicate, and object terms ("the whitespace" does not refer to the empty string).
* rdflib 4.2.2 gives an error when parsing [1].
* SWI-Prolog Semweb library 7.5.10 gives an error when parsing [1].
* There is a test case called `#minimal_whitespace' which has value `rdft:approval rdft:Proposed', and according to the test suite vocabulary this means that the test is "proposed but not approved", so minimal whitespace is a proposal that is not yet part of N-Triples 1.1.
* It should always be possible to easily split an N-Triples statement on whitespace characters.

Yes, [1] is valid:

* The N-Triples 1.1 specification clearly states that "triples are a sequence of RDF terms representing the subject, predicate and object of an RDF Triple. These may be separated by white space (spaces U+0020 or tabs U+0009)."
* Serd 0.26.0 correctly parses [1].
* Jena 3.0 correctly parses [1].
* Raptor 2.0.15 correctly parses [1].
* There is a test case called `#minimal_whitespace' which is of type `rdft:TestNTriplesPositiveSyntax'.
* The N-Triples 1.1 specification says that "White space (tab U+0009 or space U+0020) is used to separate two terminals which would otherwise be (mis-)recognized as one terminal." This may imply that whitespace is not required between terminals that cannot be misrecognized as one, such as two adjacent IRIs.

What can we do to clear up the situation regarding the use of whitespace in N-Triples? I can file bugs for rdflib and SWI-Prolog semweb. Can someone improve the specification and/or the test case? Or... am I wrong, and is "no" the correct answer after all?

---
Best regards,
Wouter Beek.
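
P.S. To make the two readings concrete, here is a minimal Python sketch of both. It is not taken from any of the tools mentioned above and it is not the N-Triples grammar: it only handles triples whose three terms are all IRIs, and its IRIREF pattern is a simplification (no UCHAR escapes, fewer excluded characters). The names triple_pattern, OPTIONAL_WS, REQUIRED_WS and is_triple are made up for this illustration.

```python
import re

# Simplified IRIREF: '<', then any characters except '<', '>' and whitespace, then '>'.
# Assumption: the real grammar excludes more characters and allows UCHAR escapes;
# literals and blank nodes are left out entirely.
IRIREF = r"<[^<>\s]*>"
WS = r"[ \t]"  # space (U+0020) or tab (U+0009)


def triple_pattern(separator):
    """Build a pattern for an IRI-only triple, with `separator` between the terms."""
    return re.compile(
        rf"{WS}*({IRIREF}){separator}({IRIREF}){separator}({IRIREF}){WS}*\.{WS}*"
    )


OPTIONAL_WS = triple_pattern(rf"{WS}*")  # the "yes" reading: whitespace may separate terms
REQUIRED_WS = triple_pattern(rf"{WS}+")  # the "no" reading: whitespace must separate terms


def is_triple(line, pattern):
    """Return True if the whole line matches the given triple pattern."""
    return pattern.fullmatch(line) is not None


if __name__ == "__main__":
    for line in ("<x:y><x:y><x:y>.", "<x:y> <x:y> <x:y> ."):
        print(line, is_triple(line, OPTIONAL_WS), is_triple(line, REQUIRED_WS))
```

Under these assumptions, [1] is accepted by the optional-whitespace pattern and rejected by the required-whitespace one, while the conventionally spaced form is accepted by both. Since '<' and '>' cannot occur inside the simplified IRIREF, adjacent IRIs can be told apart without any separator, which is the intuition behind the "yes" reading.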
Received on Wednesday, 28 June 2017 15:15:50 UTC