- From: Alan Wu <ALAN.WU@oracle.com>
- Date: Tue, 23 Aug 2011 10:29:51 -0700 (PDT)
- To: <gavin@topquadrant.com>
- Cc: <public-rdf-wg@w3.org>
Hi Gavin,

If that file is claimed to be N-TRIPLE, I would say it is a bug ;)

Thanks,

Zhe

----- Original Message -----
From: gavin@topquadrant.com
To: alan.wu@oracle.com
Cc: public-rdf-wg@w3.org
Sent: Monday, August 22, 2011 2:05:11 PM GMT -08:00 US/Canada Pacific
Subject: Re: looks like it should be turtle Re: Oracle's stand regarding N-TRIPLES

On Mon, Aug 22, 2011 at 1:49 PM, Zhe Wu <alan.wu@oracle.com> wrote:
> Hi Gavin,
>
> I just did a quick test against that
>
> http://id.loc.gov/vocabulary/iso639-1/nn.nt
>
> If we read the file as NTRIPLES, then raptor complains.
>
> raptor2-1.9.0/utils/rapper -i ntriples ./tests/iso639-1-nn.nt -o ntriples >
> /tmp/rapper.nt_readAsNTRIPLES
> lt-rapper: Parsing URI file:///...iso639-1-nn.nt with parser ntriples
> lt-rapper: Serializing with serializer ntriples
> lt-rapper: Error - URI file:///...iso639-1-nn.nt:5 column 101 -
> Non-printable ASCII character 195 (0xC3) found.

Correct, raptor does not implement UTF-8 handling of N-Triples.

> lt-rapper: Parsing returned 16 triples
>
>
> If we read the file as Turtle, everything seems fine.
>
> raptor2-1.9.0/utils/rapper -i turtle ./tests/iso639-1-nn.nt -o ntriples >
> /tmp/rapper.nt_readAsTurtle
> lt-rapper: Parsing URI file:///...iso639-1-nn.nt with parser turtle
> lt-rapper: Serializing with serializer ntriples
> lt-rapper: Parsing returned 76 triples
>
> As far as I can tell, LOC is serving turtle. That filename is slightly
> confusing.

Nope, the mime type is clearly text/plain and if we look at the HTML
version of that resource http://id.loc.gov/vocabulary/iso639-1/nn.html
we see it naming the link N-Triples. Of course as you point out an
N-Triples (UTF-8) file can be considered to be a subset of Turtle.

--Gavin

>
> Thanks,
>
> Zhe
>
>
> On 8/22/2011 11:53 AM, Gavin Carothers wrote:
>
> On Mon, Aug 22, 2011 at 11:14 AM, Zhe Wu <alan.wu@oracle.com> wrote:
>
> Hi Pat,
>
> Actually, no. It is just plain better for all but a tiny fraction of human
> readers, anywhere on the planet. This tiny fraction includes some software
> engineers. I personally will simply ignore any string that contains \u
> escapes, and immediately cease using any software that shows them to me. And
> I suspect that more people share my instincts than share yours.
>
> I don't think N-TRIPLES is an end user oriented format. It's originally
> designed for Test cases as pointed out by Jeremy. It
> happens to be used (quite well actually) by large-scale machine to machine
> communication as pointed out by Richard. I would
> dare say that the chance to see \u from a User Interface of a semantic web
> application is very low.
>
> The chances of coming across UTF-8 N-Triples is rather high.
>
> http://id.loc.gov/vocabulary/iso639-1/nn.nt
>
> In fact all of the Library of Congress N-Triple documents are served
> in a perfectly reasonable
>
> Content-type: text/plain; charset=UTF-8
>
> If a vendor expects to work with the LOC Subject Headings or any other
> ontology published by the LOC and wants to use N-Triples they will
> need to support UTF-8.
>
> Cheers,
> Gavin
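
A minimal Python sketch of the encoding issue discussed in this thread. The rapper error reports byte 195 (0xC3), which is the lead byte of a two-byte UTF-8 sequence, so the file carries raw UTF-8 rather than \u escapes. The helper names (first_non_ascii, escape_non_ascii) and the sample triple are illustrative assumptions; they are not taken from the LOC file or from raptor.

```python
from typing import Optional, Tuple


def first_non_ascii(data: bytes) -> Optional[Tuple[int, int, int]]:
    """Return (line, column, byte) of the first byte above 0x7F, or None.

    Line and column are 1-based and counted in bytes, which is only an
    assumption about how an ASCII-only parser might report the position.
    """
    for lineno, line in enumerate(data.splitlines(), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                return lineno, col, byte
    return None


def escape_non_ascii(text: str) -> str:
    """Rewrite non-ASCII characters as \\uXXXX or \\UXXXXXXXX escapes.

    This produces the ASCII-only form that a strict ASCII N-Triples
    parser accepts. ASCII characters are passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)
        else:
            out.append("\\U%08X" % cp)
    return "".join(out)


# Hypothetical triple, made up for illustration only.
sample = '<http://example.org/s> <http://example.org/p> "norv\u00e9gien"@fr .'

hit = first_non_ascii(sample.encode("utf-8"))
escaped = escape_non_ascii(sample)
print(hit)
print(escaped)
```

Running first_non_ascii over a file served as UTF-8 shows where an ASCII-only parser would stop, and escape_non_ascii gives a form that both kinds of parser should read.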
Received on Tuesday, 23 August 2011 17:30:36 UTC