- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Fri, 18 May 2012 14:08:26 +0100
- To: public-rdf-wg@w3.org
On 18/05/12 13:12, Eric Prud'hommeaux wrote:
> * Richard Cyganiak<richard@cyganiak.de> [2012-05-18 12:35+0100]
>>
>> On 18 May 2012, at 11:34, Eric Prud'hommeaux wrote:
>>> Does the existing body of N-Triples permit a grammar with no default whitespace rules?
>>>
>>> triples: triple (LF triple)* LF?
>>> triple: subject HWS predicate HWS object '.'
>>>
>>> I.e., do all the N-Triples out there look like "<s> <p> <o>."?
>>
>> This is what N-Triples as currently defined requires. Isn't that sufficient?
>>
>> ntripleDoc ::= line*
>> line ::= ws* ( comment | triple )? eoln
>> triple ::= subject ws+ predicate ws+ object ws* '.' ws*
>> ws ::= space | tab
>> eoln ::= cr | lf | cr lf
>
> I was just interested to see how much your SHOULD:
> [[
> * Richard Cyganiak<richard@cyganiak.de> [2012-05-18 11:06+0100]
>> I would even go one step further and add some SHOULD-level guidance on where to put what whitespace. Perhaps something like: exactly one space between s and p; exactly one space between p and o; no WS before or after the period; no WS at the start of a line; CR+LF as EOL.
> ]]
> could be turned into a MUST.

I don't think a MUST is a good idea, partly because it's too late, but also because, despite being a dump format, it's not pure binary. Blank lines and comments do have a role here, and the CR+LF is a mild inconvenience in some text tools.

There is variance in IRIs, so from that point alone NT has enough variation to rule out blind processing with line-based tools. I've seen the :80 thing in messy data. Processing based on appearance needs an extra step to be safe at scale (i.e. to not need checking afterwards).

What a canonical form is good for is as a target for simple tools to process and output. Hopefully, tool makers will then provide it by user demand.

	Andy

>
>
>> Richard
>>
>>
>>
>>> I note that Oracle has been vigilant about preserving backwards-compatibility. Souri, do you have a sense of what Oracle has been using?
>>>
>>>> I also note that RDF 2004 N-Triples allows comments (only at the start of a line). This makes sense for the use as a test case format, but not much sense for the use as a dump format.
>>>>
>>>> Best,
>>>> Richard
>>>>
>>>>
>>>> [1] http://www.w3.org/TR/rdf-testcases/#ntriples
>>>>
>>>>
>>>>
>>>> On 18 May 2012, at 10:04, Andy Seaborne wrote:
>>>>
>>>>> Gavin, Eric,
>>>>>
>>>>> rdf-turtle says:
>>>>>
>>>>> [1] ntriplesDoc ::= (triple)? (EOL triple)* (EOL)?
>>>>> [2] triple ::= subject predicate object '.'
>>>>> [8] EOL ::= ([#xD#xA])+
>>>>>
>>>>> What are the white space rules?
>>>>>
>>>>> Does it inherit white space processing from the rest of Turtle? Comments seem to come from Turtle.
>>>>>
>>>>> If it does not inherit white space rules,
>>>>> what about horizontal white space inside triples?
>>>>>
>>>>> If it does inherit white space rules,
>>>>> that includes newlines within triples between S/P or P/O.
>>>>>
>>>>> The simplest solution is to add text in section 12.3 to say that horizontal white space outside tokens is discarded (which is different to Turtle).
>>>>>
>>>>> Andy
>>>>>
>>>>
>>>>
>>>
>>> --
>>> -ericP
>>>
>>
>
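For illustration only, the idea of a canonical form as a target for simple tools might look something like the Python sketch below. It accepts a line under the 2004-style `triple ::= subject ws+ predicate ws+ object ws* '.' ws*` rule quoted in the thread and rewrites it with the spacing from the quoted SHOULD-level suggestion (one space between terms, no whitespace around the period). The names `TERM`, `TRIPLE` and `canonical_line` are hypothetical, the term pattern is a simplification that allows any term in any position and does no IRI normalisation (so the ":80" kind of variation is untouched), and the choice of line ending is left as a parameter because the thread does not settle it.

```python
import re

# Simplified N-Triples term (an assumption, not the normative grammar):
# an IRI, a blank node label, or a quoted literal with an optional
# language tag or datatype IRI.
TERM = r'''(?:<[^>]*>|_:[A-Za-z][A-Za-z0-9]*|"(?:[^"\\]|\\.)*"(?:@[A-Za-z]+(?:-[A-Za-z0-9]+)*|\^\^<[^>]*>)?)'''

# triple ::= subject ws+ predicate ws+ object ws* '.' ws*   (ws = space | tab)
# Leading ws* comes from the enclosing "line" rule.
TRIPLE = re.compile(
    rf'[ \t]*({TERM})[ \t]+({TERM})[ \t]+({TERM})[ \t]*\.[ \t]*'
)


def canonical_line(line, eol="\n"):
    """Return the triple with canonical spacing, or None for blank/comment lines.

    Raises ValueError if the line is neither blank, a comment, nor a triple.
    """
    body = line.rstrip("\r\n")          # drop whichever EOL the input used
    stripped = body.strip(" \t")
    if not stripped or stripped.startswith("#"):
        return None                      # blank lines and comments are dropped
    match = TRIPLE.fullmatch(body)
    if match is None:
        raise ValueError(f"not an N-Triples triple: {line!r}")
    s, p, o = match.groups()
    # One space between terms, no whitespace before or after the period.
    return f"{s} {p} {o}.{eol}"
```

A tool could map this over the lines of a dump and write out the non-None results; whether comments should be dropped or passed through is a design choice the sketch makes arbitrarily.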
Received on Friday, 18 May 2012 13:08:59 UTC