- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Wed, 20 Mar 2013 09:59:30 -0400
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: public-rdf-wg@w3.org
- Message-ID: <20130320135924.GC9440@w3.org>
* Andy Seaborne <andy.seaborne@epimorphics.com> [2013-03-20 10:31+0000]
>
>
> On 20/03/13 07:12, Eric Prud'hommeaux wrote:
> >I pushed the ~100 atomic tests into the test-ttl test suite.
> >
> >huh, i thought the action was on me until i checked tracker.
>
> Eric,
>
> I'm getting some warnings from use of:
>
> <http://a.example/AZaz\u00c0\u00d6\u00d8\u00f6\u00f8\u02ff\u0370\u037d\u037f\u1fff\u200c\u200d\u2070\u218f\u2c00\u2fef\u3001\ud7ff\uf900\ufdcf\ufdf0\ufffd\U00010000\U000effff>
>
> as not being Normal Form KC and not Normal Form C (presumably
> different characters causing those two warnings).
>
> (it's not related to the \U characters - I tried without them as well.)
>
> I'm not clear what RDF Concept says here. It's directly stating
> literals are NFC, but any impact on IRIs comes indirectly from "it's
> a legal IRI"
>
> RFC 3987:
> [[ 3.1. Mapping of IRIs to URIs
>
>    c. If the IRI is in a Unicode-based character encoding (for
>       example, UTF-8 or UTF-16), do not normalize (see section
>       5.3.2.2 for details). Apply step 2 directly to the
>       encoded Unicode character sequence.
> ]]
>
> 5.3.2.2 says:
> [[
> To avoid false negatives and problems with
> transcoding, IRIs SHOULD be created by using NFC.
> ]]
>
> so it's a SHOULD in RFC 3987 on creation.

I poked around looking for composing characters in the PN_CHARS_BASE
character ranges. \u02ff MODIFIER LETTER LOW LEFT ARROW seemed like it
could be a culprit, but fileformat.info claims it's not in a combining
class. Likewise \ufffd REPLACEMENT CHARACTER.

There are a bunch of yet-unassigned characters which could be
confusing a vigilant IRI checker. I've mapped those to the highest
currently-assigned characters in their respective range (per
fileformat.info):

  \u037f      37e
  \u1fff      1ffe
  \u218f      2189
  \u2fef      2fd5
  \ud7ff      d7fb
  \ufdcf      fdc7
  \U000effff  e01ef

Attached is a variant of
localName_with_PN_CHARS_BASE_character_boundaries.{nt,ttl} with the
values substituted. (I pass this modified test so there shouldn't be
any typos in it.) If it still doesn't work, try chopping off the last
character 'cause it's a variation selector which ostensibly is
NF{,K}{C,D} valid, but may not have been when jjc wrote your checker.

> Andy
>
>
>
> >
> >I think we should quickly change the
> >  @prefix rdft: <http://www.w3.org/ns/rdftest#> .
> >namespace to
> >  <https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/tests-ttl/ns>
> >as it is entirely Turtle-specific (used for the following types:
> >  rdft:TestTurtleEval
> >  rdft:TestTurtlePositiveSyntax
> >  rdft:TestTurtleNegativeSyntax
> >  rdft:TestTurtleNegativeEval
> >
> >). I expect that the general notion of an RDF manifest-driven test
> >suite will some day use <http://www.w3.org/ns/rdftest#> but would look
> >like <http://www.w3.org/2001/sw/DataAccess/tests/test-manifest#> .
> >
> >While curating the CR comments, I saw that 21 proposes some additional tests.
> >
> >http://www.w3.org/2011/rdf-wg/wiki/Turtle_Candidate_Recommendation_Comments#c21
>
>

--
-ericP
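
One way to narrow down which characters trigger the NFC and NFKC warnings is to normalize each code point of the test IRI on its own and list the ones that change. The sketch below does that with Python's standard unicodedata module; the helper names changed_by and unassigned are made up for illustration, and a per-character check is only an approximation, since a string can also fail NFC because of a base-plus-combining sequence that no single character reveals. Results depend on the Unicode version bundled with the Python build, so the "unassigned" list in particular will differ from what a 2013-era checker saw.

```python
import unicodedata

# The IRI from the boundary test, written with Python string escapes.
IRI = (
    "http://a.example/AZaz\u00c0\u00d6\u00d8\u00f6\u00f8\u02ff\u0370\u037d"
    "\u037f\u1fff\u200c\u200d\u2070\u218f\u2c00\u2fef\u3001\ud7ff\uf900"
    "\ufdcf\ufdf0\ufffd\U00010000\U000effff"
)


def changed_by(form, text):
    """Code points that differ from their own normalization in `form`."""
    return [ord(ch) for ch in text if unicodedata.normalize(form, ch) != ch]


def unassigned(text):
    """Code points with general category Cn in this Python's Unicode data."""
    return [ord(ch) for ch in text if unicodedata.category(ch) == "Cn"]


if __name__ == "__main__":
    for form in ("NFC", "NFKC"):
        hits = changed_by(form, IRI)
        print(form, "changes:", " ".join(f"U+{cp:04X}" for cp in hits))
    print("unassigned:", " ".join(f"U+{cp:04X}" for cp in unassigned(IRI)))
```

With current Unicode data I would expect \uf900 (a CJK compatibility ideograph with a canonical singleton decomposition) to be reported under NFC, and \u2070 (SUPERSCRIPT ZERO, compatibility decomposition only) to appear additionally under NFKC, which would be consistent with different characters causing the two warnings.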
Attachments
- text/turtle attachment: localName_with_PN_CHARS_BASE_character_boundaries.ttl
- text/plain attachment: localName_with_PN_CHARS_BASE_character_boundaries.nt
Received on Wednesday, 20 March 2013 14:00:00 UTC