- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Wed, 28 Jun 2017 18:38:44 -0700
- To: Wouter Beek <w.g.j.beek@vu.nl>, SW-forum Web <semantic-web@w3.org>
Is there any place in N-Triples where inter-terminal white space is required? I don't think so.

As triples are self-terminating, white space is not required after a triple.

Comments consume the remainder of a line but leave any EOL in place. They can thus only occur where an EOL is allowed and thus cannot occur inside a triple.

IRIREFs and STRING_LITERAL_QUOTEs are self-terminating so white space is never required after them.

Literals with a datatype end with an IRIREF so white space is never required after them.

Language tags are not self-terminating but the next token after a language tag has to be a '.' which cannot be part of the language tag. So white space is never required after a language tag and thus white space is never required after a literal with a language tag.

All variants and pieces of literals are covered above so white space is never required after a literal.

BLANK_NODE_LABELs are not self-terminating but the next token after a blank node is either an IRIREF or a '.'. BLANK_NODE_LABELs cannot end with a '.'. IRIREFs start with a '<' which cannot be part of a BLANK_NODE_LABEL. So white space is never required after a BLANK_NODE_LABEL.

That covers all the cases, I think, so there is no place where inter-terminal white space is required in N-Triples.

Peter F. Patel-Schneider
Nuance Communications

On 06/28/2017 08:14 AM, Wouter Beek wrote:
> Hi Semantic Web community,
>
> Is [1] a valid N-Triples 1.1 statement (notice there are no spaces
> between the terms)? I'm able to find evidence that says "yes" and
> able to find evidence that says "no". I think "yes" is correct here,
> but the specification document is able to trick a casual reader into
> believing "no", several SotA tools currently implement according to
> "no", and the test case is ambiguous (see below).
>
> [1] <x:y><x:y><x:y>.
>
> No, [1] is not valid:
>
>   * The N-Triples 1.1 specification says that "The simplest triple
>     statement is a sequence of (subject, predicate, object) terms,
>     separated by whitespace and terminated by '.' after each triple."
>     Under a certain interpretation of the word 'simplest' this means
>     that whitespace is required, because if a statement S that is
>     expressed by a triple T can also be expressed by a triple T' which
>     only differs from T in that it contains no whitespace, it follows
>     that T is not the simplest triple expressing S.
>
>   * The N-Triples specification defines canonical N-Triples as "The
>     whitespace following subject, predicate, and object must be a
>     single space, (U+0020)." This implies that whitespace indeed
>     follows the subject, predicate, and object terms ("the whitespace"
>     does not refer to the empty string).
>
>   * rdflib 4.2.2 gives an error when parsing [1].
>
>   * SWI-Prolog Semweb library 7.5.10 gives an error when parsing [1].
>
>   * There is a test case called `#minimal_whitespace' which has value
>     `rdft:approval rdft:Proposed', and according to the test suite
>     vocabulary this means that the test is "proposed but not
>     approved", so minimal whitespace is a proposal that is not yet
>     part of N-Triples 1.1.
>
>   * It should always be possible to easily split an N-Triple statement
>     based on whitespace characters.
>
> Yes, [1] is valid:
>
>   * The N-Triples 1.1 specification clearly states that "triples are a
>     sequence of RDF terms representing the subject, predicate and
>     object of an RDF Triple. These may be separated by white space
>     (spaces U+0020 or tabs U+0009)."
>
>   * Serd 0.26.0 correctly parses [1].
>
>   * Jena 3.0 correctly parses [1].
>
>   * Raptor 2.0.15 correctly parses [1].
>
>   * There is a test case called `#minimal_whitespace' which is of type
>     `rdft:TestNTriplesPositiveSyntax'.
>
>   * The N-Triples 1.1 specification says that "White space (tab U+0009
>     or space U+0020) is used to separate two terminals which would
>     otherwise be (mis-)recognized as one terminal." This may imply
>     that whitespace is not required for separating non-terminals?
>
> What can we do to clear up the situation regarding the use of
> whitespace in N-Triples? I can file bugs for rdflib and SWI-Prolog
> semweb. Can someone improve the specification and/or test case?
> Or... am I wrong and is "no" the correct answer after all?
>
> ---
> Best regards,
> Wouter Beek.
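
The case analysis in the reply at the top of this message can be tried out with a small recognizer. What follows is only a rough sketch, not the grammar from the specification: the terminal patterns are ASCII-only approximations of IRIREF, STRING_LITERAL_QUOTE, BLANK_NODE_LABEL and LANGTAG, escape sequences are not validated, white space inside a literal is not handled, and the names `TRIPLE` and `parse_triple` are made up for illustration. White space between terminals is optional everywhere (`[ \t]*`), and the examples still split into the expected terms.

```python
import re

# Simplified, ASCII-only approximations of the N-Triples terminals.
# These are assumptions for illustration, not the full grammar productions.
IRIREF = r'<[^\x00-\x20<>"{}|^`\\]*>'                     # ends at the first '>'
STRING = r'"(?:[^"\\\n\r]|\\.)*"'                         # ends at the closing quote
BNODE = r'_:[A-Za-z0-9_](?:[A-Za-z0-9_.-]*[A-Za-z0-9_-])?'  # cannot end with '.'
LANGTAG = r'@[A-Za-z]+(?:-[A-Za-z0-9]+)*'                 # cannot contain '.'
WS = r'[ \t]*'                                            # optional, never required

LITERAL = STRING + r'(?:\^\^' + IRIREF + '|' + LANGTAG + ')?'

TRIPLE = re.compile(
    WS + '(?P<s>' + IRIREF + '|' + BNODE + ')'
    + WS + '(?P<p>' + IRIREF + ')'
    + WS + '(?P<o>' + IRIREF + '|' + BNODE + '|' + LITERAL + ')'
    + WS + r'\.'
    + WS + r'(?:#.*)?'                                    # optional trailing comment
)


def parse_triple(line):
    """Return (subject, predicate, object) for one line without its EOL, or None."""
    m = TRIPLE.fullmatch(line)
    return m.group('s', 'p', 'o') if m else None


if __name__ == '__main__':
    for line in ['<x:y><x:y><x:y>.',
                 '_:a<x:p>_:b.',
                 '<x:s><x:p>"hi"@en.',
                 '<x:s><x:p>"1"^^<x:int>.']:
        print(line, '->', parse_triple(line))
```

Each example exercises one case from the argument: an IRIREF followed by an IRIREF, a blank node label followed by '<' or by '.', a language tag followed by '.', and a datatyped literal ending in an IRIREF.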
Received on Thursday, 29 June 2017 01:39:23 UTC