- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 28 Jun 2005 06:37:20 -0400
- To: public-rdf-dawg@w3.org
- Message-ID: <20050628103720.GB15269@w3.org>
Some Unicode characters are called "combining characters". NCCHAR
permits some of these, e.g. [#x0300-#x036F]:
NCCHAR ::= NCCHAR1 | '_' | '-' | "." | [0-9] | #x00B7
| [#x0300-#x036F] | [#x203F-#x2040]
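As a quick check of the claim above, here is a minimal Python sketch that tests whether a character falls in the [#x0300-#x036F] range admitted by NCCHAR. The helper name in_ncchar_combining_range is made up for illustration; only the range itself comes from the production.

```python
import unicodedata

# Combining-mark range admitted by NCCHAR, copied from the production above.
# range() excludes its upper bound, so 0x0370 covers up to #x036F.
COMBINING_RANGE = range(0x0300, 0x0370)


def in_ncchar_combining_range(ch: str) -> bool:
    """Hypothetical helper: True if ch lies in [#x0300-#x036F]."""
    return ord(ch) in COMBINING_RANGE


acute = "\u0301"
print(unicodedata.name(acute))            # COMBINING ACUTE ACCENT
print(in_ncchar_combining_range(acute))   # True
# A nonzero canonical combining class marks a combining character.
print(unicodedata.combining(acute) != 0)  # True
```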
At issue is whether RDF data may have both of these predicates:
HR:resumé (normalized EACUTE, U+00E9)
HR:resumé (latin 'e', U+0065, with COMBINING ACUTE ACCENT, U+0301)
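To make the distinction concrete, the following Python sketch builds both spellings with explicit escapes and compares them before and after NFC normalization. It assumes nothing beyond the standard unicodedata module; the variable names and the nfc helper are illustrative.

```python
import unicodedata

# The two spellings of the predicate, written with explicit escapes
# so the difference survives copy and paste.
precomposed = "HR:resum\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "HR:resume\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT


def nfc(s: str) -> str:
    """Return s in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s)


# Compared code point by code point, these are different names.
print(precomposed == decomposed)            # False
print(len(precomposed), len(decomposed))    # 9 10

# After NFC normalization they are the same name.
print(nfc(precomposed) == nfc(decomposed))  # True

# A store that does not normalize holds two predicates; one that does holds one.
predicates = {precomposed, decomposed}
print(len(predicates))                      # 2
print(len({nfc(p) for p in predicates}))    # 1
```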
Some time between Jan [200301] and Sep [200309], the RDF Core WG
dropped the following text from the RDF URI References definition:
[[
[A URI reference] is in Normal Form C [NFC] and
..
Note: RDF URI references are compatible with the anyURI datatype as
defined by XML schema datatypes [XML-SCHEMA2], constrained to be an
absolute rather than a relative URI reference, and constrained to be
in Unicode Normal Form C [NFC] (for compatibility with [CHARMOD]).
]]
The reason [why] given is:
[[
Given changes in advice from I18N, we deleted the normalization form
C constraint from RDF URI references definition.
]]
The spec references RFC2396 but not RFC3987 (published later, in Jan
2005). [RFC3987] has this to say about normalization:
[[
a. If the IRI is written on paper, read aloud, or otherwise
represented as a sequence of characters independent of any character
encoding, represent the IRI as a sequence of characters from the UCS
normalized according to Normalization Form C (NFC, [UTR15]).
]]
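One way to see why this matters: when an IRI is mapped to a URI, non-ASCII characters are encoded as UTF-8 and percent-escaped, so the two spellings end up as different URIs unless they are normalized first. The sketch below assumes that mapping and uses Python's urllib.parse.quote for the escaping; the function name to_uri_component and its normalize flag are hypothetical, and this covers only a single path component, not a full IRI.

```python
import unicodedata
from urllib.parse import quote


def to_uri_component(text: str, normalize: bool = True) -> str:
    """Hypothetical helper: percent-encode text as UTF-8, optionally NFC first."""
    if normalize:
        text = unicodedata.normalize("NFC", text)
    # safe="" escapes every reserved and non-ASCII character.
    return quote(text, safe="")


precomposed = "resum\u00e9"   # U+00E9
decomposed = "resume\u0301"   # 'e' + U+0301

# Without normalization the escaped forms differ.
print(to_uri_component(precomposed, normalize=False))  # resum%C3%A9
print(to_uri_component(decomposed, normalize=False))   # resume%CC%81

# With NFC applied first, both map to the same escaped form.
print(to_uri_component(precomposed))                   # resum%C3%A9
print(to_uri_component(decomposed))                    # resum%C3%A9
```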
If RDF data is s'posed to be normalized, then we should do the same
with SPARQL Query. Still researching. Relevant test case:
http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization-01.rq
on
http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization.ttl
[200301] http://www.w3.org/TR/2003/WD-rdf-concepts-20030123/#dfn-URI-reference
[200309] http://www.w3.org/TR/2003/WD-rdf-concepts-20030905/#dfn-URI-reference
[why] http://www.w3.org/TR/2003/WD-rdf-concepts-20030905/#section-substantive-Revisions
[RFC3987] http://www.ietf.org/rfc/rfc3987.txt
--
-eric
Received on Tuesday, 28 June 2005 10:37:23 UTC