Re: IRIs from Jeremy Carroll on 2007-04-17 (public-rif-wg@w3.org from April 2007)

From: Jeremy Carroll <jjc@hpl.hp.com>
Date: Tue, 17 Apr 2007 12:15:24 +0100
To: Michael Kifer <kifer@cs.sunysb.edu>
CC: Dave Reynolds <der@hplb.hpl.hp.com>, Sandro Hawke <sandro@w3.org>, Christian de Sainte Marie <csma@ilog.fr>, RIF WG <public-rif-wg@w3.org>
Message-ID: <4624AC4C.3020909@hpl.hp.com>

Michael Kifer wrote:
>> Michael Kifer wrote:
>>> Thanks. I think this answers my question.
>>> My concern was that there might be an IRI, x, such that its encoding as a URI,
>>> f(x), is not equivalent to x *as an IRI*.
>>> You seem to be saying that this is not possible.
>> Sandro's Kanji example illustrates that this is possible. If an IRI i 
>> isn't itself a URI then the URI encoding of it must be different. Unless 
>> you specify some normalization, f(i) and i are different.
> 
> Of course they are different. I was talking of them being *equivalent*.
> 
> We may or may not want to introduce an equality relation on such equivalent
> IRIs (note: equality != identity), but regardless of that if
> a ~ b as URIs and not as IRIs then using IRIs as an extension of URIs would
> be problematic. If this doesn't happen then I see no problem.
> 


The usual operation to use is 'character by character comparison'.

This has the following effects, which are well suited to using IRIs or 
URIs as identifiers, and under which URIs are a true subset of IRIs.

a) If X and Y compare as equivalent as URIs, then they compare as 
equivalent as IRIs.

b) If X and Y compare as equivalent as IRIs, and X is a URI, then so is 
Y, and they compare as equivalent as URIs.
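
As a rough illustration of (a) and (b), here is a minimal Python sketch. It assumes, as a simplification, that a URI is just an IRI made only of ASCII characters (the real grammars are stricter). The helper names is_uri, same_identifier and to_uri are hypothetical, the example.org address is made up, and to_uri is only a stand-in for the mapping f discussed above.

```python
from urllib.parse import quote

def is_uri(s: str) -> bool:
    # Simplifying assumption: treat any all-ASCII string as a URI.
    return s.isascii()

def same_identifier(x: str, y: str) -> bool:
    # Character-by-character comparison, with no normalization.
    return x == y

def to_uri(iri: str) -> str:
    # Rough stand-in for f: percent-encode the UTF-8 bytes of non-ASCII
    # characters and leave the listed ASCII punctuation untouched.
    return quote(iri, safe="/:?#[]@!$&'()*+,;=%-._~")

iri = "http://example.org/\u65e5\u672c\u8a9e"  # made-up IRI with a Kanji path
uri = to_uri(iri)

# i and f(i) are different identifiers under this comparison.
assert not is_uri(iri) and is_uri(uri)
assert not same_identifier(iri, uri)

# (a) holds trivially: the same comparison is used for URIs and IRIs.
# (b) a string that compares equal to a URI is all-ASCII, hence also a URI.
same = "".join(list(uri))
assert same_identifier(uri, same) and is_uri(same)
```

Since one comparison serves for both, nothing that is equivalent as a URI can fail to be equivalent as an IRI.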

Character by character comparison is done after having dealt with low 
level details like character encoding, which are input/output issues 
dealt with by XML (for example).
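
To make the last point concrete, a small Python sketch: the same IRI arrives as two different byte sequences, and the comparison only happens once both have been decoded to characters. The function name decode_then_compare and the example.org address are assumptions for illustration; in practice the decoding step would be done by the XML parser rather than by hand.

```python
def decode_then_compare(raw_x: bytes, enc_x: str,
                        raw_y: bytes, enc_y: str) -> bool:
    # Character encoding is an input/output detail: decode first,
    # then compare the resulting characters one by one.
    return raw_x.decode(enc_x) == raw_y.decode(enc_y)

iri = "http://example.org/caf\u00e9"   # made-up IRI with one non-ASCII character
as_utf8 = iri.encode("utf-8")
as_utf16 = iri.encode("utf-16")

# The bytes on the wire differ, but the identifiers are the same.
assert as_utf8 != as_utf16
assert decode_then_compare(as_utf8, "utf-8", as_utf16, "utf-16")
```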

Jeremy

-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Received on Tuesday, 17 April 2007 11:15:48 UTC