- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Wed, 20 Mar 2013 17:36:39 +0000
- To: public-rdf-wg@w3.org
The TTL has U+037E, but PN_CHARS_BASE has a hole specifically for that:
[#x0370-#x037D] | [#x037F-#x1FFF]
=> not a legal char.
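
For anyone wanting to check this mechanically, here is a minimal Python sketch. The ranges are copied from the PN_CHARS_BASE production in the Turtle grammar; the name in_pn_chars_base is only illustrative and is not taken from any parser.

```python
# Code point ranges of the PN_CHARS_BASE production (inclusive bounds).
PN_CHARS_BASE_RANGES = [
    (0x0041, 0x005A), (0x0061, 0x007A),
    (0x00C0, 0x00D6), (0x00D8, 0x00F6), (0x00F8, 0x02FF),
    (0x0370, 0x037D), (0x037F, 0x1FFF),   # note the hole at U+037E
    (0x200C, 0x200D), (0x2070, 0x218F), (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]


def in_pn_chars_base(ch: str) -> bool:
    """Return True if the single character ch matches PN_CHARS_BASE."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PN_CHARS_BASE_RANGES)


if __name__ == "__main__":
    for ch in ("\u037d", "\u037e", "\u037f"):
        print(f"U+{ord(ch):04X}", in_pn_chars_base(ch))
```

The neighbours U+037D and U+037F are accepted, while U+037E falls in the gap between the two ranges.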
Removing it (Greek question mark), I then get:
WARN [line: 2, col: 43] Bad IRI:
<http://a.example/AZaz???????????????????????> Code: 46/NOT_NFC in PATH:
The IRI is not in Unicode Normal Form C.
WARN [line: 2, col: 43] Bad IRI:
<http://a.example/AZaz???????????????????????> Code: 47/NOT_NFKC in
PATH: The IRI is not in Unicode Normal Form KC.
WARN [line: 2, col: 43] Bad IRI:
<http://a.example/AZaz???????????????????????> Code:
56/COMPATIBILITY_CHARACTER in PATH: TODO
with or without the last char.
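
To narrow down which characters trigger the NFC/NFKC warnings, something like the following Python sketch may help. It uses only the standard unicodedata module and tests each character on its own, so it will find characters that are themselves rewritten by normalization but will not catch problems that only arise from the ordering of combining sequences. The function name non_normalized_chars is hypothetical.

```python
import unicodedata


def non_normalized_chars(text: str, form: str) -> list:
    """Return the characters of text that change under the given form.

    form is one of "NFC", "NFD", "NFKC", "NFKD". Each character is
    normalized in isolation, so this is only a first-pass diagnostic.
    """
    return [ch for ch in text if unicodedata.normalize(form, ch) != ch]


if __name__ == "__main__":
    # Placeholder sample; substitute the local name from the test file.
    sample = "AZaz\u037e"
    for form in ("NFC", "NFKC"):
        hits = non_normalized_chars(sample, form)
        print(form, [f"U+{ord(ch):04X}" for ch in hits])
```

U+037E itself is reported under both forms, since it has a canonical decomposition to U+003B (semicolon).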
> I poked around looking for composing characters in the PN_CHARS_BASE
> character ranges. \u02ff MODIFIER LETTER LOW LEFT ARROW seemed like it
> could be a culprit, but fileformat.info claims it's not in a combining
> class. Likewise \ufffd REPLACEMENT CHARACTER.
>
> There are a bunch of yet-unassigned characters which could be confusing
> a vigilant IRI checker. I've mapped those to the highest
> currently-assigned characters in their respective range (per fileformat.info):
>
> \u037f 37e
> \u1fff 1ffe
> \u218f 2189
> \u2fef 2fd5
> \ud7ff d7fb
> \ufdcf fdc7
> \U000effff e01ef
>
> attached is a variant of
> localName_with_PN_CHARS_BASE_character_boundaries.{nt,ttl}
> with the values substituted. (I pass this modified test so there
> shouldn't be any typos in it.) If it still doesn't work, try chopping
> off the last character 'cause it's a variation selector which ostensibly
> is NF{,K}{C,D} valid, but may not have been when jjc wrote your checker.
>
>
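
The mapping from range boundaries to the highest assigned character, described in the quoted message above, can be approximated in Python as sketched below. This is an assumption-laden sketch: it treats general category "Cn" as "unassigned" (which also covers noncharacters), and the answers depend on the Unicode version bundled with the Python interpreter, so they may differ from the fileformat.info values listed in the message. The function names are hypothetical.

```python
import unicodedata

# Upper boundaries taken from the list in the quoted message.
BOUNDARIES = [0x037F, 0x1FFF, 0x218F, 0x2FEF, 0xD7FF, 0xFDCF, 0xEFFFF]


def is_unassigned(cp: int) -> bool:
    """True if the code point has general category Cn in this Python's Unicode data."""
    return unicodedata.category(chr(cp)) == "Cn"


def highest_assigned_at_or_below(cp: int) -> int:
    """Walk downwards from cp until a code point that is not Cn is found."""
    while cp > 0 and is_unassigned(cp):
        cp -= 1
    return cp


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for cp in BOUNDARIES:
        print(f"U+{cp:04X} -> U+{highest_assigned_at_or_below(cp):04X}")
```

Printing unicodedata.unidata_version alongside the results makes it clear which Unicode version the mapping reflects.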
Received on Wednesday, 20 March 2013 17:37:11 UTC