W3C home > Mailing lists > Public > w3c-rdfcore-wg@w3.org > March 2002

charmod-uri: Proposed question

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Mon, 4 Mar 2002 11:34:58 -0000
To: <w3c-rdfcore-wg@w3.org>
Message-ID: <CEECKEAMDAJDDEDGJNBEMEBNCAAA.jjc@hpl.hp.com>


At the f2f we identified that there is a one node or two node test case.


Here is the test:


<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:eg="http://example.org/"
         xml:base="http://example.org/italiano">

  <rdf:Description rdf:about="http://example.org/italiano#%C3%A8">
    <eg:first/>
  </rdf:Description>
  <rdf:Description rdf:ID="">
    <eg:second/>
  </rdf:Description>
</rdf:RDF>


This results in a graph with two edges.
The first edge is labelled <eg:first>, the second <eg:second>

Is the subject of the first edge the same node as the subject of the second
edge.

Equivalently, is the above RDF/XML the same RDF graph as:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:eg="http://example.org/"
         xml:base="http://example.org/italiano">

  <rdf:Description rdf:about="http://example.org/italiano#%C3%A8">
    <eg:first/>
    <eg:second/>
  </rdf:Description>
</rdf:RDF>


===

I suggest the following message:

[[[
The RDF Core WG would like feedback from both the RDF IG and I18N IG about
internationalised URIs.

Within RDF/XML it is possible to create character sequences that are treated
as original character sequences of URIs that involve characters outside
US-ASCII.

RFC 2396, (with the assumption that the character encoding is UTF-8)
provides for a mapping of such character sequences to US-ASCII using the %
encoding algorithm.

Should/do RDF processors perform the %-encoding while reading the file?
Or should the RDF processor use the original character sequence?

This choice impacts the RDF graph corresponding to RDF/XML files in which
both the original character sequence and the % encoded form of the same URI
are used.

Two implementators in the WG report that their implementations do %
encoding, and forget the orginal character sequence.
Early feedback has pointed out that this is inconsistent with treatment in
some other specs (which only do % encoding when getting a resource and test
for URI equality on the original character seqeunce). The same feedback
pointed out that this is inconsistent with the universal practice of not
doing any other URI normalization (e.g. http://www.w3.org ,
http://www.W3.org and http://www.w3.org:80 are all treated as distinct URIs
by all? RDF processors despite being provably functionally equivalent).

For clarity we ask for a yes/no response to the following test case.

We particularly seek feedback from anyone actually using internationalized
URIs in RDF.

An analysis of what M&S says is found at:

http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Mar/0012.html

Consider the RDF/XML

]]]

Then the test case with its associated text.



My message with non-NFC examples may also be relevant, and we may wish to
indicate that somehow.
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Mar/0027.html

Jeremy
Received on Monday, 4 March 2002 06:35:21 EST

This archive was generated by hypermail pre-2.1.9 : Wednesday, 3 September 2003 09:46:13 EDT