Re: URI Comparisons: RFC 2616 vs. RDF

From: Tim Berners-Lee <timbl@w3.org>
Date: Mon, 17 Jan 2011 19:21:16 +0000
Cc: Martin Hepp <martin.hepp@ebusiness-unibw.org>, public-lod@w3.org
Message-Id: <9AB465A0-32CE-45CD-9B53-3B6CFCBA7087@w3.org>
To: Dave Reynolds <dave.e.reynolds@gmail.com>

On 2011-01-17, at 16:37, Dave Reynolds wrote:

> On Mon, 2011-01-17 at 16:51 +0100, Martin Hepp wrote: 
>> Dear all:
>> 
>> RFC 2616 [1, section 3.2.3] says that
>> 
>> "When comparing two URIs to decide if they match or not, a client   
>> SHOULD use a case-sensitive octet-by-octet comparison of the entire
>>    URIs, with these exceptions:
>> 
>>       - A port that is empty or not given is equivalent to the default
>>         port for that URI-reference;
>>       - Comparisons of host names MUST be case-insensitive;
>>       - Comparisons of scheme names MUST be case-insensitive;
>>       - An empty abs_path is equivalent to an abs_path of "/".
>> 
>>    Characters other than those in the "reserved" and "unsafe" sets (see
>>    RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.
>> 
>>    For example, the following three URIs are equivalent:
>> 
>>       http://abc.com:80/~smith/home.html
>>       http://ABC.com/%7Esmith/home.html
>>       http://ABC.com:/%7esmith/home.html
>> "
>> 
>> Does this also hold for identifying RDF resources
>> 
>> a) in theory and

Yes, this does hold for RDF systems.
You can't guarantee that all RDF systems will apply these rules, so
RDF systems should in general exchange canonicalized URIs.
There is a ladder of levels at which smarter and smarter systems
are aware of more and more equivalences.
It is good to make your system smart, so that it does not end up
with widow graphs about http://WWW.w3.org/foo.

cwm for example canonicalizes URIs when it loads them into the store.
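
As a rough illustration of canonicalizing on load, here is a minimal Python sketch of the RFC 2616 rules quoted above. It is not cwm's code. The function name `canonicalize`, the default-port table, and the use of the RFC 3986 "unreserved" set in place of RFC 2616's "not reserved or unsafe" wording are all assumptions made for the example.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumption: RFC 3986 "unreserved" characters stand in for the characters
# RFC 2616 treats as equivalent to their %XX encoding.
_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
_DEFAULT_PORTS = {"http": 80, "https": 443}  # assumed table, HTTP schemes only
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _normalize_escapes(text):
    """Decode %XX for unreserved characters; uppercase the hex digits of the rest."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        return char if char in _UNRESERVED else "%" + match.group(1).upper()
    return _PCT.sub(fix, text)


def canonicalize(uri):
    """Return one canonical spelling for URIs that RFC 2616 calls equivalent."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()              # scheme is case-insensitive
    netloc = ""
    if parts.netloc:
        userinfo, at, _ = parts.netloc.rpartition("@")
        host = parts.hostname or ""            # urlsplit lowercases the host
        if ":" in host:                        # restore IPv6 brackets
            host = "[" + host + "]"
        netloc = userinfo + at + host
        port = parts.port                      # None when empty or not given
        if port is not None and port != _DEFAULT_PORTS.get(scheme):
            netloc += ":" + str(port)
    path = _normalize_escapes(parts.path)
    if parts.netloc and not path:              # empty abs_path equals "/"
        path = "/"
    return urlunsplit((
        scheme,
        netloc,
        path,
        _normalize_escapes(parts.query),
        _normalize_escapes(parts.fragment),
    ))


if __name__ == "__main__":
    for uri in (
        "http://abc.com:80/~smith/home.html",
        "http://ABC.com/%7Esmith/home.html",
        "http://ABC.com:/%7esmith/home.html",
    ):
        print(canonicalize(uri))
```

Under these assumptions the three example URIs from the RFC all come out as the same string, so a store that canonicalizes before comparing can keep using plain string equality internally.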



> 
> No. RDF Concepts defines equality of RDF URI References [1] as simply
> character-by-character equality of the %-encoded UTF-8 Unicode strings.
> 
> Note the final Note in that section:
> 
> """
> Note: Because of the risk of confusion between RDF URI references that
> would be equivalent if derefenced, the use of %-escaped characters in
> RDF URI references is strongly discouraged. 
> """
> 
> which explicitly calls out the difference between URI equivalence
> (dereference to the same resource) and RDF URI Reference equality.
> 
> BTW the more up to date RFC for looking at equivalence (as opposed to
> equality) issues is probably the IRI spec [2] which defines a comparison
> ladder for testing equivalence.

Exactly.
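
To make the ladder concrete, here is a small Python sketch that reports the lowest rung at which two URIs compare equal. The rung names follow RFC 3987 section 5.3; the helper names and the default-port table are hypothetical. It is deliberately partial: the syntax rung only does case normalization of scheme and host (no percent-encoding or dot-segment handling), and the protocol-based rung is left out.

```python
from urllib.parse import urlsplit

_DEFAULT_PORTS = {"http": "80", "https": "443"}  # assumed table for the sketch


def _syntax_key(uri):
    """Partial syntax-based normalization: lowercase the scheme and host only."""
    p = urlsplit(uri)
    userinfo, at, hostport = p.netloc.rpartition("@")
    authority = userinfo + at + hostport.lower()   # userinfo stays case-sensitive
    return (p.scheme.lower(), authority, p.path, p.query, p.fragment)


def _scheme_key(uri):
    """Scheme-based normalization on top of the syntax rung."""
    scheme, authority, path, query, fragment = _syntax_key(uri)
    host, colon, port = authority.rpartition(":")
    # Drop an empty port or the scheme's default port.
    if colon and (port == "" or port == _DEFAULT_PORTS.get(scheme)):
        authority = host
    # With an authority present, an empty path is equivalent to "/".
    if authority and not path:
        path = "/"
    return (scheme, authority, path, query, fragment)


def first_matching_rung(a, b):
    """Name the lowest ladder rung at which a and b are equal, else None."""
    if a == b:
        return "simple string comparison"   # the equality RDF Concepts defines
    if _syntax_key(a) == _syntax_key(b):
        return "syntax-based normalization"
    if _scheme_key(a) == _scheme_key(b):
        return "scheme-based normalization"
    return None


if __name__ == "__main__":
    base = "http://www.w3.org/foo"
    for other in (base, "http://WWW.w3.org/foo", "http://www.w3.org:80/foo"):
        print(other, "->", first_matching_rung(other, base))
```

A system that stops at the first rung sees http://WWW.w3.org/foo and http://www.w3.org/foo as two different nodes; each higher rung it implements merges more of them.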

> 
> Dave
> 
> [1]
> http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Graph-URIref
> 
> [2] http://www.ietf.org/rfc/rfc3987.txt
> 
> 
> 
Received on Monday, 17 January 2011 19:21:27 UTC