- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 28 Jun 2005 06:37:20 -0400
- To: public-rdf-dawg@w3.org
- Message-ID: <20050628103720.GB15269@w3.org>
Some Unicode characters are called "combining characters". NCCHAR permits some of these, e.g. [#x0300-#x036F]:

  NCCHAR ::= NCCHAR1 | '_' | '-' | "." | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]

At issue is whether RDF data may have both of these predicates:

  HR:resumé (normalized EACUTE, U+00E9)
  HR:resumé (latin 'e' with COMBINING ACUTE ACCENT, U+0065 U+0301)

Some time between Jan [200301] and Sep [200309], the RDF Core WG dropped the following text from the RDF URI References definition:

[[
[A URI reference] is in Normal Form C [NFC]
and ..
Note: RDF URI references are compatible with the anyURI datatype as defined by XML schema datatypes [XML-SCHEMA2], constrained to be an absolute rather than a relative URI reference, and constrained to be in Unicode Normal Form C [NFC] (for compatibility with [CHARMOD]).
]]

The reason [why] given is:

[[
Given changes in advice from I18N, we deleted the normalization form C constraint from RDF URI references definition.
]]

The spec references RFC2396 but not RFC3987 (published later, in Jan 2005). [RFC3987] has this to say about normalization:

[[
a. If the IRI is written on paper, read aloud, or otherwise represented as a sequence of characters independent of any character encoding, represent the IRI as a sequence of characters from the UCS normalized according to Normalization Form C (NFC, [UTR15]).
]]

If RDF data is s'posed to be normalized, then we should do the same with SPARQL Query. Still researching.
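To make the two spellings concrete, here is a minimal sketch using Python's standard unicodedata module. It is only an illustration of the question, not anything the RDF or SPARQL specs prescribe; the helper names (is_nfc, in_ncchar_combining_range, same_after_nfc) are hypothetical, and the range check covers only the [#x0300-#x036F] part of NCCHAR.

```python
import unicodedata

# The two spellings of the local name from the example above.
precomposed = "resum\u00E9"   # ends in U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "resume\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT


def is_nfc(s: str) -> bool:
    """True if s is unchanged by normalization to Normal Form C."""
    return unicodedata.normalize("NFC", s) == s


def in_ncchar_combining_range(ch: str) -> bool:
    """True if ch falls in the [#x0300-#x036F] range that NCCHAR permits."""
    return 0x0300 <= ord(ch) <= 0x036F


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # Compared code point by code point, the names are different.
    print(precomposed == decomposed)                  # False
    # The grammar nevertheless admits the combining accent in a name.
    print(in_ncchar_combining_range(decomposed[-1]))  # True
    # Only the precomposed spelling is already in NFC.
    print(is_nfc(precomposed), is_nfc(decomposed))    # True False
    # After NFC normalization the two names coincide.
    print(same_after_nfc(precomposed, decomposed))    # True
```

So without a normalization requirement the two predicates are distinct names; with NFC applied to both data and query they would collapse into one.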
Relevant test case:
  http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization-01.rq
on
  http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization.ttl

[200301] http://www.w3.org/TR/2003/WD-rdf-concepts-20030123/#dfn-URI-reference
[200309] http://www.w3.org/TR/2003/WD-rdf-concepts-20030905/#dfn-URI-reference
[why] http://www.w3.org/TR/2003/WD-rdf-concepts-20030905/#section-substantive-Revisions
[RFC3987] http://www.ietf.org/rfc/rfc3987.txt

-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Tuesday, 28 June 2005 10:37:23 UTC