Re: N3 and N-Triples (was: RDF in HTML: Approaches) from Dave Beckett on 2002-06-06 (www-rdf-interest@w3.org from June 2002)

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Fri, 07 Jun 2002 00:28:16 +0100
To: Joshua Allen <joshuaa@microsoft.com>
cc: www-rdf-interest <www-rdf-interest@w3.org>
Message-ID: <16489.1023406096@tatooine.ilrt.bris.ac.uk>

>>>Joshua Allen said:
> 
> This is +1 to Patrick's n-triples as RDF-Subset

We considered this in RDF Core at the start of the WG (before Patrick
joined) but, if I recall correctly, decided that it would be a new XML
syntax, which the WG charter forbids.  See also below re simplicity.

> I think the internationalization issues go away if you just serialize
> the triples as XML (i.e. RDFSubset+XML as Patrick just proposed).
> The non-XML n-triples syntax has various ways of escaping things so
> that you can stuff Unicode into a 7-bit file, escaping for
> whitespace, since whitespace is also used to distinguish items, and
> so on -- so the n-triples files get really ugly with anything other
> than Western European stuff.

Some of the issues will go away, but XML adds a load of its own, such
as canonical XML forms and normalisation.

> And FWIW, I think this is a major strike *against* the current
> n-triple serialization as a good test tool.  In order to gain broad
> acceptance, RDF will have to handle languages like Chinese at least
> as good as XML (and XML is no paragon).  Imagine merging and testing
> graphs of mixed Chinese, Arabic, and other Unicode languages.  If I
> were the test lead for such a project, I would be more worried about
> debugging the n-triples syntax than my program.  At least if the data
> is stuffed in an XML file, I know that there are a wide range of
> parsers available that have got it pretty much right.

N-Triples parsing for testing (i.e. do the N-Triples match?) is quite
simple.  In particular, you don't have to parse the "strings" and
<uri-refs>, except for handling \\ and \".  It was carefully
designed so that they can be compared as byte sequences, since for
each Unicode character there is only one way to escape it.
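
To illustrate the point above, here is a minimal sketch in Python (not
the NTC program, and not a full N-Triples parser).  The names
escape_ntriples, parse_line and same_triples are hypothetical.  It
assumes the escaping rules as I understand them from the test cases
draft: printable ASCII as-is, \t \n \r \" \\ for those characters, and
uppercase-hex \uXXXX or \UXXXXXXXX for everything else.  Any suffix
attached directly to a literal is kept verbatim, and blank node labels
are compared literally, which is stricter than real graph equivalence.

```python
import re

def escape_ntriples(text):
    """Escape a Unicode string into its single 7-bit N-Triples form (assumed rules)."""
    named = {"\\": "\\\\", '"': '\\"', "\t": "\\t", "\n": "\\n", "\r": "\\r"}
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in named:
            out.append(named[ch])
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)                 # printable ASCII is written as-is
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)     # uppercase hex, exactly 4 digits
        else:
            out.append("\\U%08X" % cp)     # uppercase hex, exactly 8 digits
    return "".join(out)

# A term is a <uri-ref>, a _:label, or a "string" plus any attached suffix.
# Inside a string only backslash pairs matter (this covers \\ and \"),
# so the content never has to be decoded.
TERM = re.compile(r'<[^>]*>|_:\w+|"(?:[^"\\]|\\.)*"\S*')

def parse_line(line):
    """Return (subject, predicate, object) as raw strings, or None for blank/comment lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    if not line.endswith("."):
        raise ValueError("missing terminating '.': %r" % line)
    terms = TERM.findall(line[:-1].rstrip())
    if len(terms) != 3:
        raise ValueError("expected three terms: %r" % line)
    return tuple(terms)

def same_triples(doc_a, doc_b):
    """True if both documents contain the same set of triples, compared as raw text."""
    def triples(doc):
        parsed = (parse_line(line) for line in doc.splitlines())
        return {t for t in parsed if t is not None}
    return triples(doc_a) == triples(doc_b)
```

Because each character has only one escaped form, same_triples never
decodes anything: two files that differ only in whitespace between
terms or in comment lines give equal sets, and everything else is a
plain string comparison.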

And the RDF test suite comes with a tiny C++ program that implements
it for you, linked from the spec:
  http://www.w3.org/TR/rdf-testcases/#tc_running

NTC: http://www.w3.org/2000/10/rdf-tests/rdfcore/utils/ntc/

I'm not against an XML test format; however, experience has shown that
simple text formats are easy for people to read, discuss and
approve.  In future, the two could perhaps relate the way RELAX NG
Compact (text) relates to RELAX NG (XML syntax).

I'd hope that some future group with a charter to improve on RDF
would include new XML formats.  I've got several ideas myself on
that, given my work on expressing the current syntax, implementing an
RDF/XML parser and serializer, and using the syntax.

Dave

Received on Thursday, 6 June 2002 19:28:20 UTC