[RESOLVED] Re: Which characters are allowed in IRIREF in Turtle 2013?

From: Dave Beckett <dave@dajobe.org>
Date: Sat, 02 Nov 2013 23:51:44 -0700
Message-ID: <5275F280.303@dajobe.org>
To: Eric Prud'hommeaux <eric@w3.org>
CC: public-rdf-comments@w3.org
On 11/2/13 4:10 PM, Eric Prud'hommeaux wrote:
> * Dave Beckett <dave@dajobe.org> [2013-03-04 09:40-0800]
>> http://www.w3.org/TR/2013/CR-turtle-20130219/#grammar-production-IRIREF
>>
>> What characters (Unicode code points) are allowed in an IRIREF in turtle?
>>
>> The IRIREF grammar rule is:
>>
>>   '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
>>
>> This implies that, for example, U+007F is allowed, since it is not in
>> the excluded range. RFC 3987 (IRIs) has a more restricted range, under
>> which U+007F is not allowed. There are many other Unicode code points
>> that it does not allow either.
>>
>> See the RFC 3987 rule 'ipchar' and its expansion to 'ucschar'.
>>
>> This rule should probably be completed so either it lists all the
>> allowed characters or lists all the excluded ones (if the [^...]
>> form remains)
> 
> The actual RFC3987 grammar is quite complex and the WG was unwilling
> to copy that grammar (which is not LALR(1)/LL(1)) into the Turtle
> grammar. I proposed some changes to surface in Turtle some of the
> characters prohibited by RFC3987 but the WG never reached consensus
> on that.
> <http://lists.w3.org/Archives/Public/public-rdf-wg/2013Mar/thread#msg244>
> 
> Any measure to restrict IRIREF would be incomplete. Finally, on 30
> Oct, we resolved "WG will not copy the RFC3987 production for IRIs
> into Turtle"
> <https://www.w3.org/2013/meeting/rdf-wg/2013-10-30#resolution_5>
> 
> If you feel that we've addressed this comment, please reply with
> "[RESOLVED]" in the subject.
> 
> 
>> Dave
>>
>>
> 
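For readers who want to see the gap concretely, here is a rough Python sketch of the comparison discussed in this thread. It is only an illustration under stated assumptions: the helper names (turtle_allows, rfc3987_allows) are made up for this example, the Turtle side covers only the literal character class (UCHAR escapes are left out), and the RFC 3987 side is a coarse "may this character appear anywhere in an IRI reference" test that ignores where in the IRI the character sits.

```python
import re
import string

# Literal character class from the Turtle IRIREF production.
# UCHAR escapes (\uXXXX, \UXXXXXXXX) are deliberately not handled here.
TURTLE_IRIREF_CHAR = re.compile(r'[^\x00-\x20<>"{}|^`\\]')

# ASCII characters that RFC 3987 allows somewhere in an IRI reference:
# unreserved, gen-delims, sub-delims, and "%" (for pct-encoded triples).
ASCII_ALLOWED = set(
    string.ascii_letters + string.digits + "-._~" + ":/?#[]@" + "!$&'()*+,;=" + "%"
)

# ucschar ranges from the RFC 3987 ABNF.
UCSCHAR = [
    (0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF),
    (0x10000, 0x1FFFD), (0x20000, 0x2FFFD), (0x30000, 0x3FFFD),
    (0x40000, 0x4FFFD), (0x50000, 0x5FFFD), (0x60000, 0x6FFFD),
    (0x70000, 0x7FFFD), (0x80000, 0x8FFFD), (0x90000, 0x9FFFD),
    (0xA0000, 0xAFFFD), (0xB0000, 0xBFFFD), (0xC0000, 0xCFFFD),
    (0xD0000, 0xDFFFD), (0xE1000, 0xEFFFD),
]

# iprivate ranges (RFC 3987 allows these in the query component only).
IPRIVATE = [(0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD)]


def turtle_allows(ch: str) -> bool:
    """True if ch may appear unescaped between '<' and '>' in a Turtle IRIREF."""
    return TURTLE_IRIREF_CHAR.fullmatch(ch) is not None


def rfc3987_allows(ch: str) -> bool:
    """Coarse check: True if ch can appear somewhere in an RFC 3987 IRI reference."""
    cp = ord(ch)
    return ch in ASCII_ALLOWED or any(lo <= cp <= hi for lo, hi in UCSCHAR + IPRIVATE)


if __name__ == "__main__":
    # Report a few characters on which the two definitions can be compared.
    for ch in ["a", "\x7f", "\ufffe", " ", "<"]:
        print(f"U+{ord(ch):04X} turtle={turtle_allows(ch)} rfc3987={rfc3987_allows(ch)}")
```

With these definitions, U+007F passes the Turtle check but fails the RFC 3987 one, which is the case raised in the original comment.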
Received on Sunday, 3 November 2013 06:52:11 UTC
