- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Wed, 20 Mar 2013 17:36:39 +0000
- To: public-rdf-wg@w3.org
The TTL has U+037E but ...

PN_CHARS_BASE has a hole specifically for that:

    [#x0370-#x037D] | [#x037F-#x1FFF]

=> not a legal char.

Removing it (Greek question mark), I then get:

WARN [line: 2, col: 43] Bad IRI: <http://a.example/AZaz???????????????????????> Code: 46/NOT_NFC in PATH: The IRI is not in Unicode Normal Form C.
WARN [line: 2, col: 43] Bad IRI: <http://a.example/AZaz???????????????????????> Code: 47/NOT_NFKC in PATH: The IRI is not in Unicode Normal Form KC.
WARN [line: 2, col: 43] Bad IRI: <http://a.example/AZaz???????????????????????> Code: 56/COMPATIBILITY_CHARACTER in PATH: TODO

with or without the last char.

> I poked around looking for composing characters in the PN_CHARS_BASE
> character ranges. \u02ff MODIFIER LETTER LOW LEFT ARROW seemed like it
> could be a culprit, but fileformat.info claims it's not in a combining
> class. Likewise \ufffd REPLACEMENT CHARACTER.
>
> There are a bunch of yet-unassigned characters which could be confusing
> a vigilant IRI checker. I've mapped those to the highest
> currently-assigned characters in their respective range (per
> fileformat.info):
>
>   \u037f      37e
>   \u1fff      1ffe
>   \u218f      2189
>   \u2fef      2fd5
>   \ud7ff      d7fb
>   \ufdcf      fdc7
>   \U000effff  e01ef
>
> Attached is a variant of
> localName_with_PN_CHARS_BASE_character_boundaries.{nt,ttl}
> with the values substituted. (I pass this modified test so there
> shouldn't be any typos in it.) If it still doesn't work, try chopping
> off the last character 'cause it's a variation selector which ostensibly
> is NF{,K}{C,D} valid, but may not have been when jjc wrote your checker.
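For anyone wanting to reproduce the checks discussed above without an IRI library, here is a minimal Python sketch. It is not the checker that produced the warnings: the function names (`in_pn_chars_base`, `describe`) are made up for illustration, the range table is transcribed from the Turtle PN_CHARS_BASE production, and the normalization answers come from whatever Unicode version the local Python build ships, which may differ from the one the checker was written against.

```python
import unicodedata

# Code point ranges of the Turtle PN_CHARS_BASE production.
# Note the hole at U+037E between the two Greek ranges.
PN_CHARS_BASE_RANGES = [
    (0x41, 0x5A), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF),
    (0x200C, 0x200D), (0x2070, 0x218F), (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]


def in_pn_chars_base(ch):
    """Return True if the single character ch falls in PN_CHARS_BASE."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PN_CHARS_BASE_RANGES)


def describe(ch):
    """Collect the properties discussed in this thread for one character."""
    return {
        "codepoint": "U+%04X" % ord(ch),
        "in_pn_chars_base": in_pn_chars_base(ch),
        "is_nfc": unicodedata.normalize("NFC", ch) == ch,
        "is_nfkc": unicodedata.normalize("NFKC", ch) == ch,
        "combining_class": unicodedata.combining(ch),
        # "Cn" means unassigned in this Python build's Unicode tables.
        "assigned": unicodedata.category(ch) != "Cn",
    }


if __name__ == "__main__":
    # Print the properties of the first and last character of every range.
    for lo, hi in PN_CHARS_BASE_RANGES:
        for cp in (lo, hi):
            print(describe(chr(cp)))
```

Running it prints one record per range boundary, so any boundary character whose `is_nfc` or `is_nfkc` is False is a candidate for the NOT_NFC / NOT_NFKC warnings. For example, U+F900 (a CJK compatibility ideograph at the start of one range) is not NFC-stable, and U+037E itself normalizes to an ordinary semicolon.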
Received on Wednesday, 20 March 2013 17:37:11 UTC