- From: Gavin Carothers <gavin@topquadrant.com>
- Date: Mon, 22 Aug 2011 14:04:30 -0700
- To: Zhe Wu <alan.wu@oracle.com>
- Cc: public-rdf-wg@w3.org
On Mon, Aug 22, 2011 at 1:49 PM, Zhe Wu <alan.wu@oracle.com> wrote:
> Hi Gavin,
>
> I just did a quick test against that
>
> http://id.loc.gov/vocabulary/iso639-1/nn.nt
>
> If we read the file as NTRIPLES, then raptor complains.
>
> raptor2-1.9.0/utils/rapper -i ntriples ./tests/iso639-1-nn.nt -o ntriples > /tmp/rapper.nt_readAsNTRIPLES
> lt-rapper: Parsing URI file:///...iso639-1-nn.nt with parser ntriples
> lt-rapper: Serializing with serializer ntriples
> lt-rapper: Error - URI file:///...iso639-1-nn.nt:5 column 101 - Non-printable ASCII character 195 (0xC3) found.

Correct, raptor does not implement UTF-8 handling of N-Triples.

> lt-rapper: Parsing returned 16 triples
>
> If we read the file as Turtle, everything seems fine.
>
> raptor2-1.9.0/utils/rapper -i turtle ./tests/iso639-1-nn.nt -o ntriples > /tmp/rapper.nt_readAsTurtle
> lt-rapper: Parsing URI file:///...iso639-1-nn.nt with parser turtle
> lt-rapper: Serializing with serializer ntriples
> lt-rapper: Parsing returned 76 triples
>
> As far as I can tell, LOC is serving turtle. That filename is slightly
> confusing.

Nope, the mime type is clearly text/plain and if we look at the HTML
version of that resource

http://id.loc.gov/vocabulary/iso639-1/nn.html

we see it naming the link N-Triples. Of course as you point out an
N-Triples (UTF-8) file can be considered to be a subset of Turtle.

--Gavin

> Thanks,
>
> Zhe
>
> On 8/22/2011 11:53 AM, Gavin Carothers wrote:
>
> On Mon, Aug 22, 2011 at 11:14 AM, Zhe Wu <alan.wu@oracle.com> wrote:
>
> Hi Pat,
>
> Actually, no. It is just plain better for all but a tiny fraction of
> human readers, anywhere on the planet. This tiny fraction includes some
> software engineers. I personally will simply ignore any string that
> contains \u escapes, and immediately cease using any software that shows
> them to me. And I suspect that more people share my instincts than share
> yours.
>
> I don't think N-TRIPLES is an end user oriented format. It's originally
> designed for Test cases as pointed out by Jeremy. It happens to be used
> (quite well actually) by large-scale machine to machine communication as
> pointed out by Richard. I would dare say that the chance to see \u from
> a User Interface of a semantic web application is very low.
>
> The chances of coming across UTF-8 N-Triples is rather high.
>
> http://id.loc.gov/vocabulary/iso639-1/nn.nt
>
> In fact all of the Library of Congress N-Triple documents are served
> in a perfectly reasonable
>
> Content-type: text/plain; charset=UTF-8
>
> If a vendor expects to work with the LOC Subject Headings or any other
> ontology published by the LOC and wants to use N-Triples they will
> need to support UTF-8.
>
> Cheers,
> Gavin
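
For anyone following along, here is a minimal Python sketch of the two behaviours discussed in this thread. It is not raptor's implementation and uses no raptor API. The function names and the sample triple are hypothetical, made up for illustration. The first function reports bytes above 0x7F the way an ASCII-only N-Triples reader might (0xC3 is the lead byte of the two-byte UTF-8 encodings of U+00C0 through U+00FF). The second rewrites non-ASCII characters as \u / \U escapes so that such a reader would accept the line.

```python
def find_non_ascii(data: bytes):
    """Yield (line, column, byte) for every byte above 0x7F.

    Line and column are 1-based and counted in bytes, not characters.
    """
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                yield line_no, col, byte


def escape_to_ascii(text: str) -> str:
    """Rewrite each non-ASCII character as a \\uXXXX or \\UXXXXXXXX escape."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)               # plain ASCII passes through unchanged
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)   # Basic Multilingual Plane
        else:
            out.append("\\U%08X" % cp)   # supplementary planes
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical triple, not copied from the LOC file. "\u00e9" is e-acute.
    sample = '<http://example.org/nn> <http://example.org/label> "norv\u00e9gien"@fr .\n'
    for line_no, col, byte in find_non_ascii(sample.encode("utf-8")):
        print("line %d column %d: byte %d (0x%02X)" % (line_no, col, byte, byte))
    print(escape_to_ascii(sample), end="")
```

The escaped form is what the ASCII-only reading of N-Triples expects, while the unescaped UTF-8 form is what a Content-type of text/plain; charset=UTF-8 would deliver. A consumer that wants to handle both would need to decode UTF-8 and also interpret the escapes.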
Received on Monday, 22 August 2011 21:04:57 UTC