Re: N3 and N-Triples (was: RDF in HTML: Approaches) from Dave Beckett on 2002-06-07 (www-rdf-interest@w3.org from June 2002)

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Fri, 07 Jun 2002 12:23:29 +0100
To: Joshua Allen <joshuaa@microsoft.com>
cc: www-rdf-interest <www-rdf-interest@w3.org>
Message-ID: <604.1023449009@tatooine.ilrt.bris.ac.uk>

>>>Joshua Allen said:
> > <uri-refs> - except for handling \\ and \".  It was carefully
> > designed so that they can be compared as byte-sequences, since for
> > each Unicode character there is only one way to escape it.
> 
> I have no doubt it was carefully designed, but one could question the
> wisdom of spending RDF WG resources on carefully designing yet another
> Unicode canonicalization scheme....

This format does not address (XML) canonicalization; that is handled
in the XML syntax.  N-Triples has been demonstrated many times in the
WG to be minimal, appropriate and sufficient both for defining the
mappings from RDF/XML to the RDF graph and for working out issues with
the RDF model theory in terms of entailment.

With respect to the internationalisation issues, RDF Core has spent a
lot of time considering these aspects for XML canonicalisation, the
proposed Character Model WD and IRIs.  See our extensive discussions
on the WG mailing list, our discussions with the I18N and C14N groups
and lists, and our comments on the charmod draft and canonicalization
specifications.  We are folding an appropriate amount of this work
into our WDs, where by "appropriate" we mean not trying to go too far
beyond best current practice.

> ...  I have no doubts that the WG would be
> very responsive to bug reports and would work with anyone who contacted
> them with problems in the n-triples format.  But, if I were the test
> lead for a platforms project using RDF, I would want more than just the
> assurance and good will of someone who worked on the spec.  I would
> want:
> 	* Proof that the format was widely used, tested, and reliable
> for transporting international data
> 	* Endorsement from some groups who work specifically on
> international data processing

This is not an end-user format; it does not address that and does not
address I18N issues such as those raised in Charmod.  The RDF/XML
syntax, with its use of early normalization to Normal Form C and
Exclusive XML Canonicalization, along with all the rest of XML,
addresses them.  N-Triples allows the encoding of such normalized
XML, and possibly IRIs (we don't emphasise that much), but as we are
talking of early normalization, this format doesn't need to describe
it.
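
To make the early-normalization point concrete, here is a minimal
Python sketch.  It is an illustration only, not part of any RDF Core
deliverable.  The function name escape_ntriples is hypothetical, and
the escaping rules it applies (printable ASCII kept as-is, \uXXXX and
\UXXXXXXXX with uppercase hex for everything else) are my reading of
the N-Triples string escapes and should be checked against the spec.
It shows that the escaped forms compare equal only if the input was
already normalized to NFC before it reached N-Triples.

```python
import unicodedata


def escape_ntriples(text):
    """Escape a string using assumed N-Triples rules (hypothetical helper)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)                # printable ASCII passes through
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)    # one fixed spelling per character
        else:
            out.append("\\U%08X" % cp)
    return "".join(out)


composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# Without normalization the two spellings escape differently.
raw_equal = escape_ntriples(composed) == escape_ntriples(decomposed)

# With early normalization to NFC they become the same byte sequence.
nfc_composed = escape_ntriples(unicodedata.normalize("NFC", composed))
nfc_decomposed = escape_ntriples(unicodedata.normalize("NFC", decomposed))
```

The escaping step only guarantees one spelling per character; making
equivalent character sequences identical is left to normalization
done earlier, which is why the format itself says nothing about it.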

I agree we don't have a lot of experience from experts in
international data processing in the group.  But hey, the standard
serialization format is XML, and RDF uses that.

> Considering that there already *are* formats which meet those two
> criteria, and since n-triples at this point meets neither, it would be
> hard for me to justify choosing n-triples as a testing format.

It probably takes 30 minutes to write something that handles
N-Triples in any language (NTC is ~700 lines of C++) and, as I say,
it really works for the job it was designed for: mapping RDF syntax
<-> RDF graph and writing RDF graphs.
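
As a rough idea of how little is involved, here is a minimal
line-at-a-time reader sketched in Python.  This is not the NTC code.
The names TRIPLE, unescape, decode_term and parse_line are all
hypothetical, and the grammar is deliberately simplified: it assumes
URI refs, blank nodes and plain double-quoted literals only, with no
language tags or XML literals.

```python
import re

# Simplified pattern for one triple: subject, predicate, object, then "."
TRIPLE = re.compile(
    r'^\s*(<[^>]*>|_:[A-Za-z][A-Za-z0-9]*)'                   # subject
    r'\s+(<[^>]*>)'                                           # predicate
    r'\s+(<[^>]*>|_:[A-Za-z][A-Za-z0-9]*|"(?:[^"\\]|\\.)*")'  # object
    r'\s*\.\s*$'
)

ESCAPE = re.compile(r'\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.))')
SIMPLE = {"t": "\t", "n": "\n", "r": "\r", '"': '"', "\\": "\\"}


def unescape(text):
    """Turn backslash escapes back into characters."""
    def repl(match):
        hex_digits = match.group(1) or match.group(2)
        if hex_digits:
            return chr(int(hex_digits, 16))
        if match.group(3) not in SIMPLE:
            raise ValueError("unknown escape: \\" + match.group(3))
        return SIMPLE[match.group(3)]
    return ESCAPE.sub(repl, text)


def decode_term(term):
    """Classify one term as a (kind, value) pair."""
    if term.startswith("<"):
        return ("uri", unescape(term[1:-1]))
    if term.startswith("_:"):
        return ("bnode", term[2:])
    return ("literal", unescape(term[1:-1]))


def parse_line(line):
    """Return a (subject, predicate, object) tuple, or None for blank/comment lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = TRIPLE.match(line)
    if match is None:
        raise ValueError("not a valid triple: " + line)
    return tuple(decode_term(term) for term in match.groups())


triple = parse_line('<http://example.org/a> <http://example.org/p> "caf\\u00E9" .')
```

Feeding a file through parse_line one line at a time and collecting
the non-None results gives the set of triples, which is all the test
cases need in order to compare graphs.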

*If* we were designing a simple end-user format, N-Triples would have
alternate abbreviation mechanisms, consider encoding issues, and so on.
 
> (And FWIW, if I were developing conformance tests for vendors to use to
> determine conformance to the spec, these issues would, IMO be even more
> important.)

You seem to have a problem with simple text formats!


Here are some of our test cases related to I18N and C14N:

  http://www.w3.org/2000/10/rdf-tests/rdfcore/rdf-charmod-literals/

  http://www.w3.org/2000/10/rdf-tests/rdfcore/rdf-charmod-uris/

(some aren't yet approved)

Dave

Received on Friday, 7 June 2002 07:23:31 UTC