Re: Proposed fixed version of N-Triples https://www.w3.org/TR/n-triples/ Section 7 from Jan Wielemaker on 2017-07-03 (public-rdf-comments@w3.org from July 2017)

From: Jan Wielemaker <J.Wielemaker@vu.nl>
Date: Mon, 3 Jul 2017 16:24:17 +0200
To: Richard Cyganiak <richard@cyganiak.de>, Wouter Beek <wouter@triply.cc>
CC: Gregg Kellogg <gregg@greggkellogg.net>, Dan Brickley <danbri@google.com>, Ivan Herman <ivan@w3.org>, public-rdf-comments Comments <public-rdf-comments@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Message-ID: <560f2baa-bb10-4884-92b9-c35b0e9287a6@vu.nl>

On 07/03/2017 04:08 PM, Richard Cyganiak wrote:
>
>> On 3 Jul 2017, at 14:54, Wouter Beek <wouter@triply.cc> wrote:
>>
>> Hi Richard, others,
>>
>>     The N-Triples document defines two languages: “N-Triples” and
>>     “Canonical N-Triples”. The latter requires a single space between
>>     RDF terms and does not permit comments, and is reasonably
>>     well-suited to processing with line-based text tools. Producers
>>     are encouraged to produce Canonical N-Triples.
>>
>>
>> True, but for a data consumer it is not possible to determine whether
>> a document is formatted in canonical or in non-canonical N-Triples
>> (except by fully parsing the document).
>
> Yes. To find out whether a document is formatted in some language it is
> necessary to fully parse the document. That’s true of every language.

Not really. There have actually been nice competitions for writing
programs that are valid syntax in multiple languages at once. If such a
program does not have the same semantics in each language, you would
rather know which language it was meant to be in. If the language is
declared somewhere (file extension, media type, header declaration,
etc.), you can process the input according to that declaration. True,
you should normally still check that the document is indeed valid
syntax and raise an error if it is not.

>> Canonical and non-canonical N-Triples advertise the same Media Type in
>> HTTP Content-Type headers and have the same extension in file names.
>> It's nice when data publishers use the canonical N-Triples format, but
>> since the data consumer cannot anticipate that this is actually the
>> case, this does not make the situation easier for her in practice.
>
> My experience is exactly the opposite: When publishers and tools produce
> Canonical N-Triples, it makes my situation as a data consumer very much
> easier in practice, while in theory it doesn’t make a difference.

Good, but how do you even get a hint that it may be Canonical
N-Triples? I guess because it says so in the accompanying docs? So this
is useful metadata that IMO should have been machine-readable.
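
To illustrate what "machine readable" could look like, here is a minimal
sketch in Python. It assumes a purely hypothetical `profile=canonical`
parameter on the media type; no such parameter is defined for
application/n-triples, and the function names and return labels are
made up for this example. A consumer could use such a hint to pick a
line-based reader, and should still validate the input as it goes.

```python
from email.message import Message


def parse_content_type(header):
    """Split a Content-Type header into (media_type, params)."""
    msg = Message()
    msg["Content-Type"] = header
    # get_params() returns [(media_type, ''), (name, value), ...]
    params = dict(msg.get_params()[1:])
    return msg.get_content_type(), params


def choose_reader(header):
    """Pick a reading strategy from the declared media type.

    ASSUMPTION: the 'profile' parameter is hypothetical and is not
    part of the N-Triples media type registration.
    """
    media_type, params = parse_content_type(header)
    if media_type != "application/n-triples":
        raise ValueError("not declared as N-Triples: " + media_type)
    if params.get("profile") == "canonical":
        # Declared canonical: line-based tools are a reasonable choice.
        return "line-based"
    # No declaration: fall back to a full N-Triples parser.
    return "full-parser"
```

With this sketch, `choose_reader("application/n-triples; profile=canonical")`
selects the line-based path, while a plain `application/n-triples`
header falls back to the full parser.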

 Cheers --- Jan

Received on Monday, 3 July 2017 14:24:53 UTC