Re: Are spaces allowed between terms in N-Triples 1.1?

> On 29 Jun 2017, at 16:23, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> 
> Even if spaces are optional in N-Triples, N-Triples documents are still easy
> to parse, so I don't see that the quote you put forward has anything to say
> about white space in N-Triples.

Canonical N-Triples, for example, let you count predicates this easily:

cut -d' ' -f2 < input.nt | sort | uniq -c
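
For those who prefer a script over a pipeline, here is roughly the same thing as a minimal Python sketch. It assumes canonical input (exactly one space between terms), and the function name and sample IRIs are made up for illustration:

```python
from collections import Counter

def count_predicates_canonical(lines):
    """Count predicates, assuming a single space separates the terms."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Same idea as cut -d' ' -f2: take the second space-separated field.
        fields = line.split(" ")
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts

# Hypothetical sample data, not taken from any real dataset.
sample = [
    '<http://example.org/s> <http://example.org/p> "o" .',
    '_:b0 <http://example.org/p> <http://example.org/o> .',
    '_:b0 <http://example.org/q> "x y" .',
]
counts = count_predicates_canonical(sample)
for predicate, n in counts.most_common():
    print(n, predicate)
```

This only works because neither subjects nor predicates can contain a space, so the second field is always the predicate as long as the separators are there.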

One could of course split at '<' or '>' instead and, depending on whether the subject is a BNode, take the first or second of the resulting fields. Oh, and objects become even more fun...

So sorry, but I don't think that is "still easy".


> The second and third examples you present are not valid in N-Triples,
> regardless of white space, so they certainly don't have anything to say about
> white space in N-Triples.

You're right, sorry for sloppily jumping ahead and upgrading the discussion to Turtle...

An example for the above is:
_:a <b> "c" .
_:d<e>"f".
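
To make the difference concrete, here is a rough Python sketch that runs both lines through a naive split on spaces and through a small regex tokenizer. The regex is a simplification I made up for this mail (ASCII-only blank node labels, no validation of IRIs or escapes), not the grammar from the spec, and the names TERM, terms and predicate_by_split are hypothetical:

```python
import re

# Simplified term patterns, NOT the full N-Triples grammar (assumption).
TERM = re.compile(r'''
    <[^<>\s]*>                                          # IRI
  | _:[A-Za-z0-9_](?:[A-Za-z0-9_.-]*[A-Za-z0-9_-])?     # blank node label
  | "(?:[^"\\]|\\.)*"                                   # literal body
    (?:\^\^<[^<>\s]*>|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?    # datatype or language tag
''', re.VERBOSE)

def terms(line):
    """Return the terms of one line, whether or not they are space-separated."""
    return TERM.findall(line)

def predicate_by_split(line):
    """The cut -d' ' -f2 approach: second space-separated field, or None."""
    fields = line.split(" ")
    return fields[1] if len(fields) >= 2 else None

spaced = '_:a <b> "c" .'
packed = '_:d<e>"f".'

print(predicate_by_split(spaced), terms(spaced)[1])
print(predicate_by_split(packed), terms(packed)[1])
```

The split finds the predicate only in the first line; for the second it has nothing to split on. The tokenizer handles both, but it is clearly more than a one-liner, which is my point.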


> It might have been better at the beginning to require white space after the
> subject, predicate, and object of a triple in N-Triples, but given that that
> wasn't required I don't see that the costs of requiring them now are worth any
> minor benefits in human readability that might ensue.  Note that nothing
> (except the badly written grammar for N-Triples) prevents tools from putting
> single spaces after subjects, predicates, and objects.

I agree with the cost consideration, but I have never actually seen nt/ttl serialized without whitespace.
Can you point me to a serializer that does this?

IMO this whole discussion is mostly theoretical, doesn't really lead us anywhere, and could easily be concluded:
I wouldn't update the spec to prohibit parsing N-Triples without whitespace, as this would be backwards incompatible.
I would, however, slightly update the grammar prelude to more explicitly allow (and encourage) whitespace after each of the terms (even where it is not strictly necessary) by replacing the following sentence:

> White space (tab U+0009 or space U+0020) is used to separate two terminals which would otherwise be (mis-)recognized as one terminal.

with:

> White space (tab U+0009 or space U+0020) SHOULD be used to separate terminals and MUST be used to separate two terminals which would otherwise be (mis-)recognized as one terminal.


Best,
Jörn

Received on Thursday, 29 June 2017 18:35:15 UTC