Re: IRIs from Michael Kifer on 2007-04-16 (public-rif-wg@w3.org from April 2007)

From: Michael Kifer <kifer@cs.sunysb.edu>
Date: Mon, 16 Apr 2007 15:25:09 -0400
To: Sandro Hawke <sandro@w3.org>
Cc: Jeremy Carroll <jjc@hpl.hp.com>, Dave Reynolds <der@hplb.hpl.hp.com>, Christian de Sainte Marie <csma@ilog.fr>, RIF WG <public-rif-wg@w3.org>
Message-ID: <15537.1176751509@cs.sunysb.edu>

Thanks. I think this answers my question.
My concern was that there might be an IRI, x, such that its encoding as a URI,
f(x), is not equivalent to x *as an IRI*.
You seem to be saying that this is not possible.

In this case I indeed see no reason why we shouldn't be using IRIs.
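
To make the mapping f concrete, here is a minimal sketch in Python, assuming f is the IRI-to-URI step of RFC 3987 (encode as UTF-8, then percent-encode every non-ASCII octet). The name `iri_to_uri` is hypothetical, and the sketch percent-encodes the host as well rather than applying IDNA, which is a simplification.

```python
from urllib.parse import quote

# Characters left untouched: RFC 3986 reserved and unreserved characters,
# plus "%" so that existing percent-encodings are not encoded twice.
_SAFE = "/:?#[]@!$&'()*+,;=%~-._"


def iri_to_uri(iri: str) -> str:
    """Percent-encode the non-ASCII characters of an IRI as UTF-8 octets.

    Hypothetical helper for illustration only; it does not validate the
    IRI and does not convert internationalized host names with IDNA.
    """
    return quote(iri, safe=_SAFE, encoding="utf-8")


if __name__ == "__main__":
    # ASCII-only IRIs, including ones containing "~", map to themselves.
    print(iri_to_uri("http://www.cs.sunysb.edu/~kifer/"))
    # Non-ASCII characters become percent-encoded UTF-8.
    print(iri_to_uri("http://example.org/r\u00e9sum\u00e9"))
```

The example.org address is made up for illustration.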


	--michael  


> > The following might be a naive question due to my inadequate familiarity
> > with RFCs.
> > 
> > Are symbols like ~ allowed in IRIs? My understanding is that only
> > a-z, A-Z, 0-9, ., -, *, and _ are allowed as is and the rest are encoded.
> > So, since ~ is supposed to be encoded, something like
> > 
> >     http://www.cs.sunysb.edu/~kifer/
> 
> Tilde (~) is allowed in URIs.   
> 
>       unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
> 
> [ http://www.ietf.org/rfc/rfc3986.txt ] 
> 
> I'm not quite sure what you're getting at.   If I don't address it
> below, maybe try an example other than "~".
> 
> > Or, am I wrong and the %-encodings mean the same in IRIs as they do in URIs?
> 
> Percent-encodings mean the same things in IRIs and URIs.
> 
> Percent-encodings are one of several things that can potentially
> complicate using URIs as identifiers.  Another is case.  Are these two
> URIs the same?
> 
>      http://www.w3.org
>      http://WWW.W3.ORG 
> 
> The domain name system is defined to be case-insensitive, so in some
> sense those two URIs have to mean the same thing.  But if all Semantic
> Web software was supposed to know all the rules like that, it would be
> crazy.
> 
> RFC 3986 (URIs) says:
> 
> | 6.  Normalization and Comparison
> | 
> |    One of the most common operations on URIs is simple comparison:
> |    determining whether two URIs are equivalent without using the URIs to
> |    access their respective resource(s).  A comparison is performed every
> |    time a response cache is accessed, a browser checks its history to
> |    color a link, or an XML parser processes tags within a namespace.
> |    Extensive normalization prior to comparison of URIs is often used by
> |    spiders and indexing engines to prune a search space or to reduce
> |    duplication of request actions and response storage.
> | 
> |    URI comparison is performed for some particular purpose.  Protocols
> |    or implementations that compare URIs for different purposes will
> |    often be subject to differing design trade-offs in regards to how
> |    much effort should be spent in reducing aliased identifiers.  This
> |    section describes various methods that may be used to compare URIs,
> |    the trade-offs between them, and the types of applications that might
> |    use them.
> 
> 
> It then talks about a "Comparison Ladder" from simple string comparison
> on to more and more sophisticated ways one might be able to tell two
> URIs are equivalent.  In RDF and related specifications, the choice has
> been to stay on the bottom rung and just treat the identifiers as opaque
> strings.
> 
>      -- Sandro
>
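
A minimal sketch of the two lowest rungs of that comparison ladder, in Python. The helper names `simple_equal` and `case_normalized` are hypothetical, and the normalization assumes a URI with no userinfo component, since it lowercases the whole authority.

```python
from urllib.parse import urlsplit, urlunsplit


def simple_equal(a: str, b: str) -> bool:
    """Bottom rung: treat URIs as opaque strings and compare them as-is."""
    return a == b


def case_normalized(uri: str) -> str:
    """Next rung up: lowercase the scheme and authority only.

    Assumption: the URI has no userinfo part (user:password@host), because
    userinfo is case-sensitive and this sketch lowercases all of netloc.
    Path, query, and fragment are left untouched.
    """
    parts = urlsplit(uri)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(),
         parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    a = "http://www.w3.org"
    b = "http://WWW.W3.ORG"
    print(simple_equal(a, b))
    print(simple_equal(case_normalized(a), case_normalized(b)))
```

With these helpers, the two w3.org URIs quoted above differ under simple string comparison and match only after case normalization.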

Received on Monday, 16 April 2007 19:36:23 UTC