
Re: URI Comparisons: RFC 2616 vs. RDF

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Mon, 17 Jan 2011 13:26:16 -0500
Message-ID: <4D3489C8.1090501@openlinksw.com>
To: Nuno Bettencourt <nuno.bett@gmail.com>
CC: nathan@webr3.org, 'Dave Reynolds' <dave.e.reynolds@gmail.com>, 'Sandro Hawke' <sandro@w3.org>, 'Martin Hepp' <martin.hepp@ebusiness-unibw.org>, public-lod@w3.org
On 1/17/11 12:27 PM, Nuno Bettencourt wrote:
> Hi,
> Even though I'll be deviating from the point just a bit, since we're discussing URI comparison in terms of RDF, I would like to request some help.
> I have a question about URLs when it comes to RDF URI comparison. Is there any RFC that establishes whether
> http://abc.com:80/~smith/home.html
> https://abc.com:80/~smith/home.html
> or even
> ftp://abc.com:80/~smith/home.html
> should or should not be considered the same resource?

All of the above are Addresses (based on what I can infer via my visual 
senses). The URI abstraction enables multiple scheme data access. "ftp:" 
and "http:" are schemes. None of them isA resource. They simply provide 
access to data which may be serialized in a variety of formats to a user 
agent that de-references any of these Addresses. Basically, network 
aware pointers with data representation dexterity courtesy of URI 
abstraction and HTTP's content negotiation.
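
To make the point about schemes concrete, here is a minimal Python sketch
(not part of the original exchange) that splits the three Addresses from
the question using the standard library's urllib.parse. The variable names
are illustrative assumptions. It only shows that the scheme is the one
component that differs; it does not settle what each Address denotes.

```python
from urllib.parse import urlsplit

# The three Addresses from the question above.
addresses = [
    "http://abc.com:80/~smith/home.html",
    "https://abc.com:80/~smith/home.html",
    "ftp://abc.com:80/~smith/home.html",
]

parts = [urlsplit(a) for a in addresses]

# The scheme is the only component that varies.
schemes = [p.scheme for p in parts]

# Host, port and path are identical across all three.
remainders = {(p.hostname, p.port, p.path) for p in parts}

# Compared as plain character strings, all three are distinct.
distinct_as_strings = len(set(addresses))

print(schemes, remainders, distinct_as_strings)
```

Since the scheme differs, the three strings are unequal under a
character-by-character comparison, and the scheme-insensitive-to-case rule
quoted below from RFC 2616 does not make "http" and "ftp" match either.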


> Best regards,
> Nuno Bettencourt
>> -----Original Message-----
>> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On
>> Behalf Of Nathan
>> Sent: segunda-feira, 17 de Janeiro de 2011 16:53
>> To: Dave Reynolds; Sandro Hawke
>> Cc: Martin Hepp; public-lod@w3.org
>> Subject: Re: URI Comparisons: RFC 2616 vs. RDF
>> Dave Reynolds wrote:
>>> On Mon, 2011-01-17 at 16:51 +0100, Martin Hepp wrote:
>>>> Dear all:
>>>> RFC 2616 [1, section 3.2.3] says that
>>>> "When comparing two URIs to decide if they match or not, a client
>>>> SHOULD use a case-sensitive octet-by-octet comparison of the entire
>>>>      URIs, with these exceptions:
>>>>         - A port that is empty or not given is equivalent to the default
>>>>           port for that URI-reference;
>>>>         - Comparisons of host names MUST be case-insensitive;
>>>>         - Comparisons of scheme names MUST be case-insensitive;
>>>>         - An empty abs_path is equivalent to an abs_path of "/".
>>>>      Characters other than those in the "reserved" and "unsafe" sets (see
>>>>      RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.
>>>>      For example, the following three URIs are equivalent:
>>>>         http://abc.com:80/~smith/home.html
>>>>         http://ABC.com/%7Esmith/home.html
>>>>         http://ABC.com:/%7esmith/home.html
>>>> "
>>>> Does this also hold for identifying RDF resources
>>>> a) in theory and
>>> No. RDF Concepts defines equality of RDF URI References [1] as simply
>>> character-by-character equality of the %-encoded UTF-8 Unicode strings.
>>> Note the final Note in that section:
>>> """
>>> Note: Because of the risk of confusion between RDF URI references that
>>> would be equivalent if derefenced, the use of %-escaped characters in
>>> RDF URI references is strongly discouraged.
>>> """
>>> which explicitly calls out the difference between URI equivalence
>>> (dereference to the same resource) and RDF URI Reference equality.
>> I'd suggest that it's a little more complex than that, and that this may be an
>> issue to clear up in the next RDF WG (it's on the charter I believe).
>> For example:
>>      When a URI uses components of the generic syntax, the component
>>      syntax equivalence rules always apply; namely, that the scheme and
>>      host are case-insensitive and therefore should be normalized to
>>      lowercase.  For example, the URI <HTTP://www.EXAMPLE.com/> is
>>      equivalent to <http://www.example.com/>.
>> - http://tools.ietf.org/html/rfc3986#section-
>> However, that's only for URIs which use the generic syntax (which most URIs
>> we ever touch do use).
>> It would be great if a normalized-IRI with specific normalization rules could be
>> drafted up as part of the next WG charter - after all they are a pretty pivotal
>> part of the sem web setup, and it would be relatively easy to clear up these
>> issues.
>> Best,
>> Nathan
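
The two notions of equality discussed in the quoted thread can be
contrasted with a short Python sketch. This is an illustration under
stated assumptions, not a reference implementation: the helper names
(http_key, rdf_equal) are hypothetical, the default-port table and the set
of characters treated as safe to decode are my assumptions, and userinfo
and IPv6 literals in the authority are ignored.

```python
import re
from urllib.parse import urlsplit

# Assumption: default ports for the schemes of interest.
DEFAULT_PORTS = {"http": 80, "https": 443}

# Assumption: only these characters are decoded from %HH form.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def _decode_unreserved(text):
    """Decode %HH escapes of unreserved characters; uppercase the rest."""
    def repl(match):
        ch = chr(int(match.group(1), 16))
        return ch if ch in UNRESERVED else "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


def http_key(uri):
    """Comparison key following the RFC 2616 section 3.2.3 exceptions."""
    p = urlsplit(uri)
    scheme = p.scheme.lower()                      # scheme: case-insensitive
    host, _, port_text = p.netloc.partition(":")   # no userinfo/IPv6 support
    port = int(port_text) if port_text else DEFAULT_PORTS.get(scheme)
    path = _decode_unreserved(p.path) or "/"       # empty abs_path == "/"
    return (scheme, host.lower(), port, path, _decode_unreserved(p.query))


def rdf_equal(a, b):
    """Character-by-character equality, as described for RDF URI References."""
    return a == b


a = "http://abc.com:80/~smith/home.html"
b = "http://ABC.com/%7Esmith/home.html"
c = "http://ABC.com:/%7esmith/home.html"

print(http_key(a) == http_key(b) == http_key(c))  # HTTP-style comparison
print(rdf_equal(a, b), rdf_equal(b, c))           # RDF-style comparison
```

Under these assumptions the three example URIs from RFC 2616 share one
HTTP comparison key, while no two of them are equal as character strings.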



Kingsley Idehen	
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Monday, 17 January 2011 18:26:47 UTC
