Re: N-triples white space question

On 18 May 2012, at 11:27, Andy Seaborne wrote:
> Maybe we could define a canonical form of N-triples:

+1, this would be very useful.

> . No comments.
> . No blank lines.
> . CR+LF
> . Single space between S/P, P/O.
>    (a raw tab is also good - it can't appear in a valid literal)

My subjective impression is that single space is very common in existing N-Triples files. The more the canonical form resembles common practice, the better.

> . No use of \u or \U

+1! Very important. (Although common practice at the moment would dictate: “randomly fuck up Unicode characters”)

> . Resolved IRIs
>    avoid <http://example/a/./b/../c> or <http://example.org:80/a>

The formal way to state this is: “Only IRIs that are normalized according to Section 5 of [IRI].” A link to this Note in RDF Concepts would help to explain what this means:
http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#note-iri-interop
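
As a rough illustration of what "resolved" means for the two IRIs quoted above, here is a minimal Python sketch. It is not a full implementation of the normalization in [IRI]: it covers only dot-segment removal, lower-casing of scheme and host, and dropping a default port. The names normalize_iri, remove_dot_segments and DEFAULT_PORTS are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table for this sketch; only two schemes are covered.
DEFAULT_PORTS = {"http": 80, "https": 443}


def remove_dot_segments(path):
    """Drop "." segments and resolve ".." segments in a hierarchical path."""
    output = []
    for segment in path.split("/"):
        if segment == ".":
            continue
        if segment == "..":
            # Never pop the leading empty segment that stands for the root "/".
            if len(output) > 1:
                output.pop()
            continue
        output.append(segment)
    # A trailing "." or ".." leaves the path ending in "/".
    if path.endswith(("/.", "/..")):
        output.append("")
    return "/".join(output)


def normalize_iri(iri):
    """Partial normalization only; userinfo and IPv6 hosts are not handled."""
    parts = urlsplit(iri)
    scheme = parts.scheme.lower()
    netloc = parts.hostname or ""  # .hostname is already lower-cased
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{netloc}:{parts.port}"
    path = remove_dot_segments(parts.path)
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


print(normalize_iri("http://example/a/./b/../c"))  # http://example/a/c
print(normalize_iri("http://example.org:80/a"))    # http://example.org/a
```

A canonical-form checker could then reject any IRI for which normalize_iri(iri) != iri, at least for the cases this sketch handles.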

> . Last line has a CR+LF

I'd add:

  • No additional horizontal whitespace (HWS) before or after CR+LF
  • No whitespace between O and the triple-ending period (although “single space” might be closer to current common practice and would work equally well; it's just ugly to my eyes)
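
The rules collected in this thread could be checked mechanically along the following lines. This is only a sketch of the proposal as discussed here, not of any published specification. It assumes a single space (not a tab) between terms and no whitespace before the period, and its regular expressions are a simplification of the real N-Triples grammar. It does not check IRI normalization. The names check_canonical, IRI, BNODE, LITERAL and TRIPLE are invented for the example.

```python
import re

# Simplified term patterns; assumptions for this sketch, not the full grammar.
IRI = r"<[^<>\s]*>"
BNODE = r"_:[A-Za-z0-9]+"
LITERAL = r'"(?:[^"\\\r\n]|\\.)*"(?:\^\^' + IRI + r"|@[A-Za-z0-9-]+)?"

# Exactly one space between S/P and P/O, no whitespace before the period.
TRIPLE = re.compile(
    rf"(?:{IRI}|{BNODE}) {IRI} (?:{IRI}|{BNODE}|{LITERAL})\."
)


def check_canonical(text):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not text.endswith("\r\n"):
        problems.append("last line does not end with CR+LF")
    lines = text.split("\r\n")
    if lines and lines[-1] == "":
        lines.pop()  # the empty piece after the final CR+LF is not a line
    for number, line in enumerate(lines, start=1):
        if "\r" in line or "\n" in line:
            problems.append(f"line {number}: bare CR or LF")
        elif line.strip(" \t") == "":
            problems.append(f"line {number}: blank line")
        elif line.lstrip(" \t").startswith("#"):
            problems.append(f"line {number}: comment")
        elif not TRIPLE.fullmatch(line):
            problems.append(f"line {number}: spacing or syntax not canonical")
        # Scan escapes left to right so that "\\u" (an escaped backslash
        # followed by a plain "u") is not mistaken for a \u escape.
        elif any(m.group(1) in "uU" for m in re.finditer(r"\\(.)", line)):
            problems.append(f"line {number}: \\u or \\U escape")
    return problems
```

For example, check_canonical('<http://example/s> <http://example/p> "x".\r\n') returns an empty list, while the same triple written with a space before the period is reported as non-canonical.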

Richard

Received on Friday, 18 May 2012 11:32:42 UTC