- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Wed, 27 Mar 2013 09:00:35 -0400
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: public-rdf-wg@w3.org
* Andy Seaborne <andy.seaborne@epimorphics.com> [2013-03-27 08:40+0000]
>
>
> On 26/03/13 21:01, Eric Prud'hommeaux wrote:
> >The Turtle spec says that parsing the PNAME_NS and PNAME_LN terminals
> >produces an IRI as defined in RDF Concepts.
> >  http://www.w3.org/TR/turtle/#handle-IRI
> >  http://www.w3.org/TR/turtle/#handle-PNAME_LN
> >  http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/#dfn-iri
> >RDF Concepts says that IRI is "a Unicode string [UNICODE] that
> >conforms to the syntax defined in RFC 3987 [RFC3987]." In sum, we
> >provide a pretty liberal grammar and then point to a hilariously
> >complex grammar, but don't expect anyone to enforce it.
>
> Don't we? :-)

I may be wrong. You could create some negative IRI evaluation tests to
make the conversation more concrete. I'm not psyched to up the bar,
but maybe others are. I understand that Jena warns about IRIs outside
of NFC, which pushes users to produce IRIs which are more predictable,
but does it 3987:validate IRIs?

  positive: my-scheme://::@-._~ :1?/?#/?%00日本
  negative: my-scheme://:@-._~ :1?/?#/?%00日本

> >Comments c23 "IRIREF production less restrictive than RFC3987" and c26
> >"PN_CHARS_BASE outside of IRI range" indicate some frustration with our
> >grammar which permits characters which aren't allowed anywhere in IRIs.
> >
> >  <http://www.w3.org/2011/rdf-wg/wiki/Turtle_Candidate_Recommendation_Comments#c23>
> >  <http://www.w3.org/2011/rdf-wg/wiki/Turtle_Candidate_Recommendation_Comments#c26>
> >
> >One approach would be to trim the bogus chars off of PN_CHARS_BASE and
> >include a note below the grammar which points directly at 3987 and
> >states that the IRIs constructed by either IRIREF or PNAME_LN are 3987
> >IRIs. This would supplement the note about valid literal ranges
> >proposed to address c27.
> >
> >  <http://www.w3.org/2011/rdf-wg/wiki/Turtle_Candidate_Recommendation_Comments#c27>
> >  <http://www.w3.org/mid/20130324145153.GN14139@w3.org>
> >
> >I have spoken to those acting as W3C director. They consider this to
> >be a clarification and nothing that would require another LC.
>
> The PN_CHARS_BASE rule is the same as the XML rule for NameStartChar
> without the ':'
>
> If we alter PN_CHARS_BASE won't there be ways to write in RDF/XML
> something that can't be written in the Turtle grammar? Sure - it
> may lead to an illegal IRI but it means we already depend on IRI
> checking for that if it is "not enforced" we have IRI strings via
> RDF/XML that can't be written in the similar way in Turtle.

Fair point, worth considering. Let's look at an RDF/XML doc which
passes the grammar but doesn't produce a valid RDF graph:

  <?xml version="1.0"?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
    <rdf:Description rdf:about="http://www.w3.org/￾"> <!-- ends with U+FFFE -->
      <dc:title>World Wide Web Consortium</dc:title>
    </rdf:Description>
  </rdf:RDF>

You could write it in Turtle, but you'd need an escape:

Illegal by this proposal:
  @prefix dc: <http://purl.org/dc/elements/1.1/> .
  <http://purl.org/dc/elements/1.1/￾> dc:title "World Wide Web Consortium" .

Legal by this proposal:
  @prefix dc: <http://purl.org/dc/elements/1.1/> .
  <http://purl.org/dc/elements/1.1/\uFFFE> dc:title "World Wide Web Consortium" .

I'd argue that Turtle and SPARQL shouldn't be beholden to reproducing
invalid RDF graphs anyways. (Well, I'm sure we both think that, but our
intuitions about the optimal degree may differ.)

> (I'm not averse to a change - including filing a SPARQL errata -
> but we do have to fit everything together. SPARQL 1.0 took the
> character ranges because of RDF/XML.)

And that made a lot of sense. It wasn't until the comments that I ever
thought to perform the arithmetic to surface the low-hanging 3987
restrictions.
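
For anyone who wants to redo that arithmetic, here is a rough Python
sketch. It assumes the non-ASCII PN_CHARS_BASE ranges from the Turtle
grammar and the ucschar ranges from RFC 3987, and it ignores iprivate
(which 3987 only allows in the query part). The names PN_CHARS_BASE,
UCSCHAR and subtract are my own; this is an illustration of the
comparison behind c26, not a validator.

```python
# Non-ASCII ranges of Turtle's PN_CHARS_BASE (inclusive code points).
PN_CHARS_BASE = [
    (0x00C0, 0x00D6), (0x00D8, 0x00F6), (0x00F8, 0x02FF),
    (0x0370, 0x037D), (0x037F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# RFC 3987 ucschar ranges (inclusive), in ascending order.
UCSCHAR = [(0x00A0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
# Planes 1..13 each allow xx0000-xxFFFD.
UCSCHAR += [(p << 16, (p << 16) | 0xFFFD) for p in range(1, 14)]
# Plane 14 only allows E1000-EFFFD.
UCSCHAR.append((0xE1000, 0xEFFFD))


def subtract(ranges, allowed):
    """Return the parts of `ranges` not covered by `allowed`.

    Both arguments are lists of inclusive (lo, hi) pairs; `allowed`
    must be sorted and non-overlapping.
    """
    gaps = []
    for lo, hi in ranges:
        cur = lo
        for a_lo, a_hi in allowed:
            if a_hi < cur or a_lo > hi:
                continue  # no overlap with what is left of this range
            if a_lo > cur:
                gaps.append((cur, a_lo - 1))
            cur = a_hi + 1
            if cur > hi:
                break
        if cur <= hi:
            gaps.append((cur, hi))
    return gaps


gaps = subtract(PN_CHARS_BASE, UCSCHAR)

if __name__ == "__main__":
    for lo, hi in gaps:
        print(f"U+{lo:04X}-U+{hi:04X}")
```

The ranges it prints are the code points that the grammar admits in a
prefixed name but that ucschar excludes, i.e. the candidates for
trimming from PN_CHARS_BASE.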

If we adopt this quickly, we might be able to set a record for the
shortest time at REC without an errata.

Apart from competitive interests, there's no real hurry. I was trying
to get through the test-related comments ASAP; this kind of bubbled up
with them.

-- 
-ericP
Received on Wednesday, 27 March 2013 13:01:05 UTC