- From: Dave Reynolds <dave.e.reynolds@gmail.com>
- Date: Mon, 17 Jan 2011 18:07:56 +0000
- To: nathan@webr3.org
- Cc: public-lod@w3.org
On Mon, 2011-01-17 at 16:52 +0000, Nathan wrote:
> Dave Reynolds wrote:
> > On Mon, 2011-01-17 at 16:51 +0100, Martin Hepp wrote:
> >> Dear all:
> >>
> >> RFC 2616 [1, section 3.2.3] says that
> >>
> >> "When comparing two URIs to decide if they match or not, a client
> >> SHOULD use a case-sensitive octet-by-octet comparison of the entire
> >> URIs, with these exceptions:
> >>
> >> - A port that is empty or not given is equivalent to the default
> >> port for that URI-reference;
> >> - Comparisons of host names MUST be case-insensitive;
> >> - Comparisons of scheme names MUST be case-insensitive;
> >> - An empty abs_path is equivalent to an abs_path of "/".
> >>
> >> Characters other than those in the "reserved" and "unsafe" sets (see
> >> RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.
> >>
> >> For example, the following three URIs are equivalent:
> >>
> >> http://abc.com:80/~smith/home.html
> >> http://ABC.com/%7Esmith/home.html
> >> http://ABC.com:/%7esmith/home.html
> >> "
> >>
> >> Does this also hold for identifying RDF resources
> >>
> >> a) in theory and
> >
> > No. RDF Concepts defines equality of RDF URI References [1] as simply
> > character-by-character equality of the %-encoded UTF-8 Unicode strings.
> >
> > Note the final Note in that section:
> >
> > """
> > Note: Because of the risk of confusion between RDF URI references that
> > would be equivalent if derefenced, the use of %-escaped characters in
> > RDF URI references is strongly discouraged.
> > """
> >
> > which explicitly calls out the difference between URI equivalence
> > (dereference to the same resource) and RDF URI Reference equality.
>
> I'd suggest that it's a little more complex than that, and that this may
> be an issue to clear up in the next RDF WG (it's on the charter I believe).

I beg to differ.

The charter does state:

"Clarify the usage of IRI references for RDF resources, e.g., per SPARQL
Query §1.2.4."

However, I was under the impression that was simply removing the small
difference between "RDF URI References" and the IRI spec (that they had
anticipated). Specifically I thought the only substantive issue there
was the treatment of space, and many RDF processors already take the
conservative position on that anyway.

Replacing encoded string equality by dereference-equivalence would be a
pretty big change to RDF and I hadn't realized that was being
considered.

Could one of the nominated chairs or a W3C rep clarify this?

> For example:
>
> When a URI uses components of the generic syntax, the component
> syntax equivalence rules always apply; namely, that the scheme and
> host are case-insensitive and therefore should be normalized to
> lowercase. For example, the URI <HTTP://www.EXAMPLE.com/> is
> equivalent to <http://www.example.com/>.
>
> - http://tools.ietf.org/html/rfc3986#section-6.2.2.1

Sure, but the later RDF-related specs such as GRDDL and RIF clarify the
application of that in RDF.

For example in RIF [1] we said:

"Neither Syntax-Based Normalization nor Scheme-Based Normalization
(described in Sections 6.2.2 and 6.2.3 of RFC-3986) are performed."

A form of words that, I think, we lifted verbatim from GRDDL, which in
turn had chosen them to clarify how the original RDF URI References spec
should be interpreted in the light of the updated URI/IRI RFCs.

Changing RDF to require syntax or scheme based normalization would
require changing at least RIF and GRDDL as well. If that was really on
the cards I would have expected it to have been more broadly publicized.

Dave

[1] http://www.w3.org/TR/2010/PR-rif-dtb-20100511/#Relative_IRIs
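
To make the distinction in this thread concrete, here is a rough Python sketch contrasting the two notions of "same URI". It is only an illustration: the function names `rdf_uri_equal` and `normalize_http_uri` are made up for this example, the normalizer handles only simple http/https URIs (no userinfo, no IPv6 literals), and it uses the RFC 3986 "unreserved" set as an approximation of the characters RFC 2616 treats as equivalent to their %-encoding.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# RFC 3986 section 2.3 "unreserved" characters (assumed stand-in for the
# RFC 2616 wording about characters outside "reserved" and "unsafe").
_UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
_DEFAULT_PORTS = {"http": "80", "https": "443"}


def rdf_uri_equal(a, b):
    """RDF URI Reference equality: plain character-by-character comparison."""
    return a == b


def _normalize_escapes(text):
    """Decode %XX for unreserved characters; uppercase the remaining escapes."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in _UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", fix, text)


def normalize_http_uri(uri):
    """HTTP-style normalization (hypothetical helper, simple URIs only)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    # Split host and port by hand so an empty port ("host:") is tolerated.
    host, _, port = parts.netloc.partition(":")
    host = host.lower()
    if port in ("", _DEFAULT_PORTS.get(scheme)):
        netloc = host
    else:
        netloc = host + ":" + port
    # An empty path is treated as "/".
    path = _normalize_escapes(parts.path) or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


uris = [
    "http://abc.com:80/~smith/home.html",
    "http://ABC.com/%7Esmith/home.html",
    "http://ABC.com:/%7esmith/home.html",
]

# Under HTTP-style comparison the three RFC 2616 examples collapse to one URI.
normalized = {normalize_http_uri(u) for u in uris}

# Under RDF URI Reference equality they remain three distinct names.
distinct_as_rdf = len(set(uris))
```

With this sketch, `normalized` holds a single string while `distinct_as_rdf` is 3, which is the gap between dereference-style equivalence and RDF URI Reference equality discussed above.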
Received on Monday, 17 January 2011 18:08:43 UTC