- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Wed, 20 Mar 2013 09:59:30 -0400
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: public-rdf-wg@w3.org
- Message-ID: <20130320135924.GC9440@w3.org>
* Andy Seaborne <andy.seaborne@epimorphics.com> [2013-03-20 10:31+0000]
>
>
> On 20/03/13 07:12, Eric Prud'hommeaux wrote:
> >I pushed the ~100 atomic tests into the test-ttl test suite.
> >
> >huh, i thought the action was on me until i checked tracker.
>
> Eric,
>
> I'm getting some warnings from use of:
>
> <http://a.example/AZaz\u00c0\u00d6\u00d8\u00f6\u00f8\u02ff\u0370\u037d\u037f\u1fff\u200c\u200d\u2070\u218f\u2c00\u2fef\u3001\ud7ff\uf900\ufdcf\ufdf0\ufffd\U00010000\U000effff>
>
> as not being Normal Form KC and not Normal Form C (presumably
> different characters causing those two warnings).
>
> (it's not related to the \U characters - I tried without them as well.)
>
> I'm not clear what RDF Concept says here. It's directly stating
> literals are NFC, but any impact on IRIs comes indirectly from "it's
> a legal IRI"
>
> RFC 3987:
> [[ 3.1. Mapping of IRIs to URIs
>
> c. If the IRI is in a Unicode-based character encoding (for
> example, UTF-8 or UTF-16), do not normalize (see section
> 5.3.2.2 for details). Apply step 2 directly to the
> encoded Unicode character sequence.
> ]]
>
> 5.3.2.2 says:
> [[
> To avoid false negatives and problems with
> transcoding, IRIs SHOULD be created by using NFC.
> ]]
>
> so it's a SHOULD in RFC 3987 on creation.
I poked around looking for composing characters in the PN_CHARS_BASE
character ranges. \u02ff MODIFIER LETTER LOW LEFT ARROW seemed like it
could be a culprit, but fileformat.info claims it's not in a combining
class. Likewise for \ufffd REPLACEMENT CHARACTER.
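The search for a culprit can also be done mechanically with Python's standard unicodedata module. What follows is a minimal sketch, not the checker Andy is running: the names BOUNDARY_CODE_POINTS, changed_by and report are hypothetical, the results depend on the Unicode version bundled with the Python build, and testing each character in isolation only finds characters that are unstable on their own, not sequences that would compose with a neighbour.

```python
import unicodedata

# Code points from the IRI in the report (everything after "AZaz").
BOUNDARY_CODE_POINTS = [
    0x00C0, 0x00D6, 0x00D8, 0x00F6, 0x00F8, 0x02FF, 0x0370, 0x037D,
    0x037F, 0x1FFF, 0x200C, 0x200D, 0x2070, 0x218F, 0x2C00, 0x2FEF,
    0x3001, 0xD7FF, 0xF900, 0xFDCF, 0xFDF0, 0xFFFD, 0x10000, 0xEFFFF,
]


def changed_by(form, code_points):
    """Return the code points whose one-character string is altered by `form`."""
    return [cp for cp in code_points
            if unicodedata.normalize(form, chr(cp)) != chr(cp)]


def report(code_points=BOUNDARY_CODE_POINTS):
    """Print each flagged code point with its name and combining class."""
    for form in ("NFC", "NFKC"):
        for cp in changed_by(form, code_points):
            ch = chr(cp)
            # unassigned code points have no name, so supply a default
            name = unicodedata.name(ch, "<unnamed>")
            print(f"{form}: U+{cp:04X} {name} "
                  f"combining={unicodedata.combining(ch)}")


if __name__ == "__main__":
    report()
```

The report prints the canonical combining class next to each flagged code point, which shows whether a character that trips the check is in a combining class at all.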
There are a bunch of yet-unassigned characters which could be confusing
a vigilant IRI checker. I've mapped those to the highest
currently-assigned characters in their respective range (per fileformat.info):
  \u037f     -> \u037e
  \u1fff     -> \u1ffe
  \u218f     -> \u2189
  \u2fef     -> \u2fd5
  \ud7ff     -> \ud7fb
  \ufdcf     -> \ufdc7
  \U000effff -> \U000e01ef
Attached is a variant of
localName_with_PN_CHARS_BASE_character_boundaries.{nt,ttl}
with the values substituted. (I pass this modified test, so there
shouldn't be any typos in it.) If it still doesn't work, try chopping
off the last character 'cause it's a variation selector which ostensibly
is NF{,K}{C,D} valid, but may not have been when jjc wrote your checker.
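The substitution described above amounts to a code point translation, which can be sketched as follows. This is only an illustration, and the attached files remain the authoritative version of the modified test. The names SUBSTITUTIONS, ORIGINAL_SUFFIX, substitute and escape are invented here.

```python
# Mapping from the message: unassigned range boundary -> highest code point
# that fileformat.info listed as assigned in the same range.
SUBSTITUTIONS = {
    0x037F: 0x037E,
    0x1FFF: 0x1FFE,
    0x218F: 0x2189,
    0x2FEF: 0x2FD5,
    0xD7FF: 0xD7FB,
    0xFDCF: 0xFDC7,
    0xEFFFF: 0xE01EF,
}

# The part of the reported IRI that follows "http://a.example/".
ORIGINAL_SUFFIX = (
    "AZaz\u00c0\u00d6\u00d8\u00f6\u00f8\u02ff\u0370\u037d\u037f\u1fff"
    "\u200c\u200d\u2070\u218f\u2c00\u2fef\u3001\ud7ff\uf900\ufdcf"
    "\ufdf0\ufffd\U00010000\U000effff"
)


def substitute(text, table=SUBSTITUTIONS):
    """Replace each mapped code point; everything else passes through."""
    return text.translate(table)


def escape(text):
    """Render non-ASCII characters as \\uXXXX or \\UXXXXXXXX escapes."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)
        elif cp <= 0xFFFF:
            parts.append(f"\\u{cp:04x}")
        else:
            parts.append(f"\\U{cp:08x}")
    return "".join(parts)


if __name__ == "__main__":
    modified = substitute(ORIGINAL_SUFFIX)
    print("<http://a.example/" + escape(modified) + ">")
    # Fallback suggested in the message: drop the trailing variation selector.
    print("<http://a.example/" + escape(modified[:-1]) + ">")
```

Python strings are indexed by code point, so modified[:-1] removes exactly the final \U000e01ef and nothing else.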
> Andy
>
>
> >
> >I think we should quickly change the
> > @prefix rdft: <http://www.w3.org/ns/rdftest#> .
> >namespace to
> > <https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/tests-ttl/ns>
> >as it is entirely Turtle-specific (used for the following types:
> > rdft:TestTurtleEval
> > rdft:TestTurtlePositiveSyntax
> > rdft:TestTurtleNegativeSyntax
> > rdft:TestTurtleNegativeEval
> >
> >). I expect that the general notion of an RDF manifest-driven test
> >suite will some day use <http://www.w3.org/ns/rdftest#> but would look
> >like <http://www.w3.org/2001/sw/DataAccess/tests/test-manifest#> .
> >
> >While curating the CR comments, I saw that 21 proposes some additional tests.
> >
> >http://www.w3.org/2011/rdf-wg/wiki/Turtle_Candidate_Recommendation_Comments#c21
> >
>
--
-ericP
Attachments
- text/turtle attachment: localName_with_PN_CHARS_BASE_character_boundaries.ttl
- text/plain attachment: localName_with_PN_CHARS_BASE_character_boundaries.nt
Received on Wednesday, 20 March 2013 14:00:00 UTC