- From: Dave Beckett <dave@dajobe.org>
- Date: Sat, 02 Nov 2013 23:51:44 -0700
- To: Eric Prud'hommeaux <eric@w3.org>
- CC: public-rdf-comments@w3.org
On 11/2/13 4:10 PM, Eric Prud'hommeaux wrote:
> * Dave Beckett <dave@dajobe.org> [2013-03-04 09:40-0800]
>> http://www.w3.org/TR/2013/CR-turtle-20130219/#grammar-production-IRIREF
>>
>> What characters (Unicode code points) are allowed in an IRIREF in turtle?
>>
>> the IRIREF grammar rule is: [^#x00-#x20<>\"{}|^`\] | UCHAR)
>>
>> implies that for example U+007F is allowed since it's not in the
>> escaped range. Taking a look at the IRI RFC 3987 it has a more
>> restricted range and taking the example U+007F is not allowed.
>> There are many other Unicode codepoints that are not allowed.
>>
>> See the RFC3987 rule 'ipchar' and its expansion to 'ucschar'
>>
>> This rule should probably be completed so either it lists all the
>> allowed characters or lists all the excluded ones (if the [^...]
>> form remains)
>
> The actual RFC3987 grammar is quite complex and the WG was unwilling
> to copy that grammar (which is not LALR(1)/LL(1)) into the Turtle
> grammar. I proposed some changes to surface in Turtle some of the
> characters prohibited by RFC3987 but the WG never reached consensus
> on that.
> <http://lists.w3.org/Archives/Public/public-rdf-wg/2013Mar/thread#msg244>
>
> Any measure to restrict IRIREF would be incomplete. Finally, on 30
> Oct, we resolved "WG will not copy the RFC3987 production for IRIs
> into Turtle"
> <https://www.w3.org/2013/meeting/rdf-wg/2013-10-30#resolution_5>
>
> If you feel that we've addressed this comment, please reply with
> "[RESOLVED]" in the subject.
>
>
>> Dave
>>
>>
>
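
The mismatch raised in the quoted comment can be illustrated with a small sketch. It is not part of the original exchange and makes two assumptions: the character class is taken exactly as quoted above (UCHAR escapes are ignored), and the RFC 3987 side checks only the 'ucschar' ranges from section 2.2 of the RFC, not the full 'ipchar' rule. The names `turtle_class_allows` and `is_ucschar` are hypothetical helpers chosen for this example.

```python
import re

# Character class from the IRIREF rule as quoted in the message
# (assumption: UCHAR escapes are left out of this sketch).
TURTLE_IRIREF_CHAR = re.compile(r'[^\x00-\x20<>"{}|^`\\]')

# 'ucschar' ranges from RFC 3987, section 2.2.
UCSCHAR_RANGES = [
    (0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF),
    (0x10000, 0x1FFFD), (0x20000, 0x2FFFD), (0x30000, 0x3FFFD),
    (0x40000, 0x4FFFD), (0x50000, 0x5FFFD), (0x60000, 0x6FFFD),
    (0x70000, 0x7FFFD), (0x80000, 0x8FFFD), (0x90000, 0x9FFFD),
    (0xA0000, 0xAFFFD), (0xB0000, 0xBFFFD), (0xC0000, 0xCFFFD),
    (0xD0000, 0xDFFFD), (0xE1000, 0xEFFFD),
]


def turtle_class_allows(code_point: int) -> bool:
    """True if the quoted IRIREF character class matches this code point."""
    return TURTLE_IRIREF_CHAR.fullmatch(chr(code_point)) is not None


def is_ucschar(code_point: int) -> bool:
    """True if the code point falls in one of the RFC 3987 ucschar ranges."""
    return any(lo <= code_point <= hi for lo, hi in UCSCHAR_RANGES)


if __name__ == "__main__":
    # U+007F (DEL) is the example from the comment; U+00E9 and U+FFFE are
    # extra sample points picked for this sketch.
    for cp in (0x7F, 0xE9, 0xFFFE):
        print(f"U+{cp:04X}: turtle class={turtle_class_allows(cp)}, "
              f"ucschar={is_ucschar(cp)}")
```

U+007F is ASCII, so it could only be permitted by RFC 3987 through one of the ASCII productions (unreserved, sub-delims and so on), none of which include it; the sketch only shows that it is accepted by the quoted character class while lying outside 'ucschar'.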
Received on Sunday, 3 November 2013 06:52:11 UTC