Re: RDF-ISSUE-123 (localName chars): PN_CHARS_BASE permits up to U+EFFFF but RFC-3987 stops at U+EFFFD [RDF Turtle]

Just an FYI, there were some more tests that had characters outside the range allowed by RFC 3987, in particular:

localName_with_PN_CHARS_BASE_character_boundaries.ttl
localName_with_assigned_nfc_PN_CHARS_BASE_character_boundaries.ttl
localName_with_assigned_nfc_bmp_PN_CHARS_BASE_character_boundaries.ttl
localName_with_nfc_PN_CHARS_BASE_character_boundaries.ttl

These files used characters after #FFEF. I took the liberty of updating the test files accordingly.
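
For anyone who wants to check a code point themselves, here is a rough Python sketch (not part of the test suite). The range tables are transcribed from the PN_CHARS_BASE production in the Turtle grammar and the ucschar rule in RFC 3987; the names in_ranges and outside_rfc3987 are made up for illustration, and ASCII letters are treated as always legal in an IRI.

```python
# PN_CHARS_BASE ranges from the Turtle grammar (inclusive code points).
PN_CHARS_BASE = [
    (0x41, 0x5A), (0x61, 0x7A), (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]

# RFC 3987 ucschar ranges: planes 1-13 each stop at xFFFD,
# and plane 14 only covers E1000-EFFFD.
UCSCHAR = [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
UCSCHAR += [(p << 16, (p << 16) | 0xFFFD) for p in range(1, 14)]
UCSCHAR.append((0xE1000, 0xEFFFD))


def in_ranges(cp, ranges):
    """True if code point cp falls inside any inclusive (lo, hi) range."""
    return any(lo <= cp <= hi for lo, hi in ranges)


def outside_rfc3987(cp):
    """True if cp matches PN_CHARS_BASE but is neither an ASCII letter nor a ucschar."""
    if not in_ranges(cp, PN_CHARS_BASE):
        return False
    # Within PN_CHARS_BASE the only ASCII members are A-Z and a-z.
    is_ascii_letter = cp < 0x80
    return not (is_ascii_letter or in_ranges(cp, UCSCHAR))
```

With these tables, outside_rfc3987(0xFFF0) and outside_rfc3987(0xEFFFF) are true, while outside_rfc3987(0xFFEF) and outside_rfc3987(0xEFFFD) are false.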

Gregg Kellogg
gregg@greggkellogg.net

On Mar 24, 2013, at 4:26 AM, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:

> 
> 
> On 24/03/13 05:40, RDF Working Group Issue Tracker wrote:
>> RDF-ISSUE-123 (localName chars): PN_CHARS_BASE permits up to U+EFFFF but RFC-3987 stops at U+EFFFD [RDF Turtle]
>> 
>> http://www.w3.org/2011/rdf-wg/track/issues/123
>> 
>> Raised by: Eric Prud'hommeaux
>> On product: RDF Turtle
>> 
>> Gregg Kellogg pointed out in http://www.w3.org/mid/49EB390E-BCA6-401B-98EC-F4DD6A44AD0B@greggkellogg.net that Turtle's localNames overrun RFC 3987 IRIs by two characters. These two Unicode characters are reserved for process-internal use and thus don't make sense in a global identification scheme.
>> 
>> Should we shave PN_CHARS_BASE down to [#x10000-#xEFFFD]? If this is a bug fix, can we do that without another LC?
>> 
>> 
>> 
> 
> I prefer Gregg's solution of making the IRIs in tests legal by RFC 3987.  The grammar may be wider - it is anyway because we don't include an RFC 3986/3987 parser (or scheme specific rules).
> 
> 	Andy
> 

Received on Sunday, 24 March 2013 19:34:36 UTC