
Re: claimed completion on "ACTION-233: Publish the consolidated test suite"

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Wed, 20 Mar 2013 17:36:39 +0000
Message-ID: <5149F3A7.3080200@epimorphics.com>
To: public-rdf-wg@w3.org
The TTL has U+037E, but PN_CHARS_BASE has a hole specifically for that
code point:

[#x0370-#x037D] | [#x037F-#x1FFF]

=> not a legal char.
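
For illustration, the hole can be checked mechanically. This is a minimal sketch, not part of any test tooling: the ranges are transcribed from the PN_CHARS_BASE production in the Turtle grammar, and the helper name in_pn_chars_base is hypothetical.

```python
# Ranges of the Turtle PN_CHARS_BASE production, as (low, high) inclusive pairs.
PN_CHARS_BASE_RANGES = [
    (0x41, 0x5A), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF),   # note the gap at U+037E
    (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]


def in_pn_chars_base(ch):
    """Return True if the single character ch falls in PN_CHARS_BASE."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PN_CHARS_BASE_RANGES)


if __name__ == "__main__":
    for ch in ("\u037d", "\u037e", "\u037f"):
        print(f"U+{ord(ch):04X}", in_pn_chars_base(ch))
```

The neighbours U+037D and U+037F are accepted, while U+037E falls in the gap between the two ranges.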

Removing it (Greek question mark), I then get:

WARN  [line: 2, col: 43] Bad IRI: <http://a.example/AZaz???????????????????????> Code: 46/NOT_NFC in PATH: The IRI is not in Unicode Normal Form C.
WARN  [line: 2, col: 43] Bad IRI: <http://a.example/AZaz???????????????????????> Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC.
WARN  [line: 2, col: 43] Bad IRI: <http://a.example/AZaz???????????????????????> Code: 

with or without the last char.
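
One way to narrow down which characters trip the NFC warning is to normalize each character on its own and report those that change. This is only a sketch using Python's standard unicodedata module, not the checker that produced the warnings above; the helper name non_nfc_chars is hypothetical.

```python
import unicodedata


def non_nfc_chars(text):
    """Return 'U+XXXX' labels for characters that NFC would rewrite."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if unicodedata.normalize("NFC", ch) != ch
    ]


if __name__ == "__main__":
    # U+037E has a canonical decomposition to U+003B, so NFC replaces it.
    print(non_nfc_chars("AZaz\u037e"))
```

A per-character check only finds characters that are unstable by themselves. It will not catch a sequence such as a base letter followed by a combining mark, where each character is fine alone but the pair composes under NFC.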

> I poked around looking for composing characters in the PN_CHARS_BASE
> character ranges. \u02ff MODIFIER LETTER LOW LEFT ARROW seemed like it
> could be a culprit, but fileformat.info claims it's not in a combining
> class. Likewise \ufffd REPLACEMENT CHARACTER.
> There are a bunch of yet-unassigned characters which could be confusing
> a vigilant IRI checker. I've mapped those to the highest currently-assigned
> characters in their respective range (per fileformat.info):
>      \u037f   37e
>      \u1fff  1ffe
>      \u218f  2189
>      \u2fef  2fd5
>      \ud7ff  d7fb
>      \ufdcf  fdc7
> \U000effff e01ef
> Attached is a variant of
>    localName_with_PN_CHARS_BASE_character_boundaries.{nt,ttl}
> with the values substituted. (I pass this modified test so there
> shouldn't be any typos in it.) If it still doesn't work, try chopping
> off the last character 'cause it's a variation selector which ostensibly
> is NF{,K}{C,D} valid, but may not have been when jjc wrote your checker.
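
The substitution described in the quoted text (replace an unassigned upper boundary with the highest assigned character in the same range) can be sketched as below. This is an assumption-laden illustration: it uses the Unicode tables bundled with the running Python rather than fileformat.info, so the results depend on the Unicode version and need not match the 2013 values listed above. The helper name highest_assigned is hypothetical.

```python
import unicodedata


def highest_assigned(lo, hi):
    """Return the highest code point in [lo, hi] whose general category
    is not 'Cn' (unassigned), or None if the whole range is unassigned."""
    for cp in range(hi, lo - 1, -1):
        if unicodedata.category(chr(cp)) != "Cn":
            return cp
    return None


if __name__ == "__main__":
    # Upper boundaries from the quoted list, paired with their range starts
    # from the PN_CHARS_BASE production.
    for lo, hi in [(0x37F, 0x1FFF), (0x2070, 0x218F), (0x2C00, 0x2FEF),
                   (0x3001, 0xD7FF), (0xF900, 0xFDCF)]:
        cp = highest_assigned(lo, hi)
        print(f"{hi:04x} -> {cp:04x}" if cp is not None else f"{hi:04x} -> none")
```

Category 'Cn' also covers noncharacters, so those are skipped along with reserved code points.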
Received on Wednesday, 20 March 2013 17:37:11 UTC
