Some Unicode characters are called "combining characters". NCCHAR permits some of these, e.g. [#x0300-#x036F]:

  NCCHAR ::= NCCHAR1 | '_' | '-' | "." | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]

At issue is whether RDF data may have both of these predicates:

  HR:resumé (normalized EACUTE)
  HR:resumé (latin 'e' with COMBINING ACUTE ACCENT)

Some time between Jan [200301] and Sep [200309], the RDF Core WG dropped the following text from the RDF URI References definition:

[[
[A URI reference] is in Normal Form C [NFC] and ..

Note: RDF URI references are compatible with the anyURI datatype as defined by XML schema datatypes [XML-SCHEMA2], constrained to be an absolute rather than a relative URI reference, and constrained to be in Unicode Normal Form C [NFC] (for compatibility with [CHARMOD]).
]]

The reason [why] given is:

[[
Given changes in advice from I18N, we deleted the normalization form C constraint from RDF URI references definition.
]]

The spec references RFC2396 but not RFC3987 (published later, in Jan 2005). [RFC3987] has this to say about normalization:

[[
a. If the IRI is written on paper, read aloud, or otherwise represented as a sequence of characters independent of any character encoding, represent the IRI as a sequence of characters from the UCS normalized according to Normalization Form C (NFC, [UTR15]).
]]

If RDF data is supposed to be normalized, then we should do the same with SPARQL Query. Still researching.
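To make the two HR:resumé spellings concrete, here is a minimal sketch using Python's standard unicodedata module. The variable names and the same_after_nfc helper are illustrative assumptions, not part of any spec or test suite; the sketch only shows that the two local names differ code point by code point yet become identical once both are put into Normalization Form C.

```python
import unicodedata

# Two spellings of the local name from the HR:resumé example.
precomposed = "resum\u00e9"  # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "resume\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT


def same_after_nfc(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# U+0301 falls inside the [#x0300-#x036F] range that NCCHAR permits.
in_ncchar_range = 0x0300 <= ord("\u0301") <= 0x036F

print(precomposed == decomposed)                 # False: different code points
print(len(precomposed), len(decomposed))         # 6 7
print(in_ncchar_range)                           # True
print(unicodedata.combining("\u0301") != 0)      # True: it is a combining mark
print(same_after_nfc(precomposed, decomposed))   # True: equal once normalized
```

Without a normalization requirement, a processor comparing predicates code point by code point would treat these as two distinct names.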
Relevant test case:
  http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization-01.rq
on
  http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization.ttl

[200301] http://www.w3.org/TR/2003/WD-rdf-concepts-20030123/#dfn-URI-reference
[200309] http://www.w3.org/TR/2003/WD-rdf-concepts-20030905/#dfn-URI-reference
[why] http://www.w3.org/TR/2003/WD-rdf-concepts-20030905/#section-substantive-Revisions
[RFC3987] http://www.ietf.org/rfc/rfc3987.txt

-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
        Shonan Fujisawa Campus, Keio University,
        5322 Endo, Fujisawa, Kanagawa 252-8520
        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Tuesday, 28 June 2005 10:37:23 GMT