- From: Michael Kifer <kifer@cs.sunysb.edu>
- Date: Mon, 16 Apr 2007 15:25:09 -0400
- To: Sandro Hawke <sandro@w3.org>
- Cc: Jeremy Carroll <jjc@hpl.hp.com>, Dave Reynolds <der@hplb.hpl.hp.com>, Christian de Sainte Marie <csma@ilog.fr>, RIF WG <public-rif-wg@w3.org>
Thanks. I think this answers my question. My concern was that there might
be an IRI, x, such that its encoding as a URI, f(x), is not equivalent to
x *as an IRI*. You seem to be saying that this is not possible. In this
case I indeed see no reason why we shouldn't be using IRIs.

	--michael

> > The following might be a naive question due to my inadequate familiarity
> > with RFCs.
> >
> > Are symbols like ~ allowed in IRIs? My understanding is that only
> > a-z, A-Z, 0-9, ., -, *, and _ are allowed as is and the rest are encoded.
> > So, since ~ is supposed to be encoded, something like
> >
> > http://www.cs.sunysb.edu/~kifer/
>
> Tilde (~) is allowed in URIs.
>
>     unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
>     [ http://www.ietf.org/rfc/rfc3986.txt ]
>
> I'm not quite sure what you're getting at. If I don't address it
> below, maybe try an example other than "~".
>
> > Or, am I wrong and the %-encodings mean the same in IRIs as they do in
> > URIs?
>
> Percent-encodings mean the same things in IRIs and URIs.
>
> Percent-encodings are one of several things that can potentially
> complicate using URIs as identifiers. Another is case. Are these two
> URIs the same?
>
>     http://www.w3.org
>     http://WWW.W3.ORG
>
> The domain name system is defined to be case-insensitive, so in some
> sense those two URIs have to mean the same thing. But if all Semantic
> Web software was supposed to know all the rules like that, it would be
> crazy.
>
> RFC 3986 (URIs) says:
>
> | 6. Normalization and Comparison
> |
> | One of the most common operations on URIs is simple comparison:
> | determining whether two URIs are equivalent without using the URIs to
> | access their respective resource(s). A comparison is performed every
> | time a response cache is accessed, a browser checks its history to
> | color a link, or an XML parser processes tags within a namespace.
> | Extensive normalization prior to comparison of URIs is often used by
> | spiders and indexing engines to prune a search space or to reduce
> | duplication of request actions and response storage.
> |
> | URI comparison is performed for some particular purpose. Protocols
> | or implementations that compare URIs for different purposes will
> | often be subject to differing design trade-offs in regards to how
> | much effort should be spent in reducing aliased identifiers. This
> | section describes various methods that may be used to compare URIs,
> | the trade-offs between them, and the types of applications that might
> | use them.
>
> It then talks about a "Comparison Ladder" from simple string comparison
> on to more and more sophisticated ways one might be able to tell two
> URIs are equivalent. In RDF and related specifications, the choice has
> been to stay on the bottom rung and just treat the identifiers as opaque
> strings.
>
> -- Sandro
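
For illustration only, here is a rough Python sketch of the two points quoted above: the RFC 3986 unreserved set (which includes "~" but not "*"), and the difference between the bottom rung of the comparison ladder (opaque string comparison) and one step up (lowercasing scheme and host). The function names are made up for this sketch and do not come from any RIF or RDF specification, and the normalization shown is only one small part of what RFC 3986 section 6 describes.

```python
from urllib.parse import urlsplit, urlunsplit

# Unreserved characters per RFC 3986:
#   unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def is_unreserved(ch: str) -> bool:
    """True if ch may appear in a URI without percent-encoding."""
    return ch in UNRESERVED


def same_as_opaque_strings(a: str, b: str) -> bool:
    """Bottom rung of the ladder: plain string comparison."""
    return a == b


def same_after_case_normalization(a: str, b: str) -> bool:
    """One rung up: lowercase the scheme and the authority first.

    Simplifying assumption: the authority carries no userinfo, which
    would be case-sensitive and must not be lowercased.
    """
    def normalize(uri: str) -> str:
        parts = urlsplit(uri)
        return urlunsplit((
            parts.scheme.lower(),
            parts.netloc.lower(),
            parts.path,
            parts.query,
            parts.fragment,
        ))
    return normalize(a) == normalize(b)


if __name__ == "__main__":
    a, b = "http://www.w3.org", "http://WWW.W3.ORG"
    print(is_unreserved("~"), is_unreserved("*"))
    print(same_as_opaque_strings(a, b))
    print(same_after_case_normalization(a, b))
```

Under the opaque-string choice described above, the two w3.org spellings are simply different identifiers; only software that opts into a higher rung would treat them as the same.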
Received on Monday, 16 April 2007 19:36:23 UTC