- From: Nuno Bettencourt <nuno.bett@gmail.com>
- Date: Mon, 17 Jan 2011 17:27:51 -0000
- To: <nathan@webr3.org>, "'Dave Reynolds'" <dave.e.reynolds@gmail.com>, "'Sandro Hawke'" <sandro@w3.org>
- Cc: "'Martin Hepp'" <martin.hepp@ebusiness-unibw.org>, <public-lod@w3.org>
Hi,

Even though I'll be deviating from the point just a bit, since we're
discussing URI comparison in terms of RDF, I would like to request some help.

I have a question about URLs when it comes to RDF URI comparison.

Is there any RFC that establishes whether

http://abc.com:80/~smith/home.html
https://abc.com:80/~smith/home.html
or even
ftp://abc.com:80/~smith/home.html

should or should not be considered the same resource?

Best regards,

Nuno Bettencourt

> -----Original Message-----
> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On
> Behalf Of Nathan
> Sent: Monday, 17 January 2011 16:53
> To: Dave Reynolds; Sandro Hawke
> Cc: Martin Hepp; public-lod@w3.org
> Subject: Re: URI Comparisons: RFC 2616 vs. RDF
>
> Dave Reynolds wrote:
> > On Mon, 2011-01-17 at 16:51 +0100, Martin Hepp wrote:
> >> Dear all:
> >>
> >> RFC 2616 [1, section 3.2.3] says that
> >>
> >> "When comparing two URIs to decide if they match or not, a client
> >> SHOULD use a case-sensitive octet-by-octet comparison of the entire
> >> URIs, with these exceptions:
> >>
> >> - A port that is empty or not given is equivalent to the default
> >> port for that URI-reference;
> >> - Comparisons of host names MUST be case-insensitive;
> >> - Comparisons of scheme names MUST be case-insensitive;
> >> - An empty abs_path is equivalent to an abs_path of "/".
> >>
> >> Characters other than those in the "reserved" and "unsafe" sets (see
> >> RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.
> >>
> >> For example, the following three URIs are equivalent:
> >>
> >> http://abc.com:80/~smith/home.html
> >> http://ABC.com/%7Esmith/home.html
> >> http://ABC.com:/%7esmith/home.html
> >> "
> >>
> >> Does this also hold for identifying RDF resources
> >>
> >> a) in theory and
> >
> > No. RDF Concepts defines equality of RDF URI References [1] as simply
> > character-by-character equality of the %-encoded UTF-8 Unicode strings.
> >
> > Note the final Note in that section:
> >
> > """
> > Note: Because of the risk of confusion between RDF URI references that
> > would be equivalent if derefenced, the use of %-escaped characters in
> > RDF URI references is strongly discouraged.
> > """
> >
> > which explicitly calls out the difference between URI equivalence
> > (dereference to the same resource) and RDF URI Reference equality.
>
> I'd suggest that it's a little more complex than that, and that this may be an
> issue to clear up in the next RDF WG (it's on the charter I believe).
>
> For example:
>
>    When a URI uses components of the generic syntax, the component
>    syntax equivalence rules always apply; namely, that the scheme and
>    host are case-insensitive and therefore should be normalized to
>    lowercase. For example, the URI <HTTP://www.EXAMPLE.com/> is
>    equivalent to <http://www.example.com/>.
>
>    - http://tools.ietf.org/html/rfc3986#section-6.2.2.1
>
> However, that's only for URIs which use the generic syntax (which most URIs
> we ever touch do use).
>
> It would be great if a normalized-IRI with specific normalization rules could be
> drafted up as part of the next WG charter - after all they are a pretty pivotal
> part of the sem web setup, and it would be relatively easy to clear up these
> issues.
>
> Best,
>
> Nathan
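
To make the two notions of equality discussed above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a reference implementation of any of the quoted specifications: the function names (rdf_equal, normalize, syntactically_equivalent) and the default-port table are hypothetical, the authority is assumed to contain no userinfo or IPv6 literal, and the percent-decoding rule follows the RFC 3986 "unreserved" set rather than the older RFC 2396 sets that RFC 2616 cites.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Illustrative default-port table for the schemes mentioned in this thread.
DEFAULT_PORTS = {"http": "80", "https": "443", "ftp": "21"}

# RFC 3986 "unreserved" characters; escapes of these can be decoded safely.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def _normalize_escapes(text):
    """Decode %XX for unreserved characters, uppercase the hex digits otherwise."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


def rdf_equal(a, b):
    """RDF Concepts style equality: plain character-by-character comparison."""
    return a == b


def normalize(uri):
    """Apply the scheme, host, port, path and escape rules quoted above."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    # Assumption: the authority is just host[:port].
    host, _, port = parts.netloc.partition(":")
    host = host.lower()
    # An empty or default port is dropped; any other port is kept.
    if port and port != DEFAULT_PORTS.get(scheme):
        host = host + ":" + port
    path = _normalize_escapes(parts.path) or "/"
    query = _normalize_escapes(parts.query)
    return urlunsplit((scheme, host, path, query, parts.fragment))


def syntactically_equivalent(a, b):
    """Equivalence after normalization, in the spirit of RFC 2616 section 3.2.3."""
    return normalize(a) == normalize(b)


if __name__ == "__main__":
    a = "http://abc.com:80/~smith/home.html"
    b = "http://ABC.com/%7Esmith/home.html"
    print(rdf_equal(a, b))                 # False: the strings differ
    print(syntactically_equivalent(a, b))  # True: same URI after normalization
```

With this sketch, the three RFC 2616 example URIs compare as equivalent after normalization but as different under plain string comparison. The http, https and ftp variants in the question at the top stay distinct under both comparisons, because the scheme is part of the URI; whether they happen to identify the same resource is not something a purely syntactic comparison can decide.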
Received on Monday, 17 January 2011 17:35:57 UTC