- From: Sandro Hawke <sandro@w3.org>
- Date: Mon, 16 Apr 2007 15:10:03 -0400
- To: kifer@cs.sunysb.edu (Michael Kifer)
- Cc: Jeremy Carroll <jjc@hpl.hp.com>, Dave Reynolds <der@hplb.hpl.hp.com>, Christian de Sainte Marie <csma@ilog.fr>, RIF WG <public-rif-wg@w3.org>
> The following might be a naive question due to my inadequate familiarity
> with RFCs.
>
> Are symbols like ~ allowed in IRIs? My understanding is that only
> a-z, A-Z, 0-9, ., -, *, and _ are allowed as is and the rest are encoded.
> So, since ~ is supposed to be encoded, something like
>
> http://www.cs.sunysb.edu/~kifer/

Tilde (~) is allowed in URIs.

    unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    [ http://www.ietf.org/rfc/rfc3986.txt ]

I'm not quite sure what you're getting at. If I don't address it below,
maybe try an example other than "~".

> Or, am I wrong and the %-encodings mean the same as in IRIs as they do in
> URIs?

Percent-encodings mean the same things in IRIs and URIs.

Percent-encodings are one of several things that can potentially
complicate using URIs as identifiers. Another is case. Are these two
URIs the same?

    http://www.w3.org
    http://WWW.W3.ORG

The domain name system is defined to be case-insensitive, so in some
sense those two URIs have to mean the same thing. But if all Semantic
Web software were expected to know all the rules like that, it would be
crazy.

RFC 3986 (URIs) says:

| 6. Normalization and Comparison
|
| One of the most common operations on URIs is simple comparison:
| determining whether two URIs are equivalent without using the URIs to
| access their respective resource(s). A comparison is performed every
| time a response cache is accessed, a browser checks its history to
| color a link, or an XML parser processes tags within a namespace.
| Extensive normalization prior to comparison of URIs is often used by
| spiders and indexing engines to prune a search space or to reduce
| duplication of request actions and response storage.
|
| URI comparison is performed for some particular purpose. Protocols
| or implementations that compare URIs for different purposes will
| often be subject to differing design trade-offs in regards to how
| much effort should be spent in reducing aliased identifiers. This
| section describes various methods that may be used to compare URIs,
| the trade-offs between them, and the types of applications that might
| use them.

It then talks about a "Comparison Ladder", which runs from simple string
comparison up through more and more sophisticated ways of telling
whether two URIs are equivalent. In RDF and related specifications, the
choice has been to stay on the bottom rung and just treat the
identifiers as opaque strings.

    -- Sandro
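To make the bottom rung of the ladder concrete, here is a minimal sketch contrasting plain string comparison (the RDF choice described above) with one step higher up, where the scheme and host are case-normalized before comparing. The function names `same_by_string` and `normalize_case` are made up for illustration; they are not from RFC 3986 or from any RDF library, and the sketch does not attempt percent-encoding normalization or the other rungs.

```python
from urllib.parse import urlsplit, urlunsplit


def same_by_string(a: str, b: str) -> bool:
    """Bottom rung: treat the identifiers as opaque strings."""
    return a == b


def normalize_case(uri: str) -> str:
    """One rung up (illustrative only): lowercase the scheme and host.

    The path, query, and fragment are left untouched because they may be
    case-sensitive. Any userinfo before '@' is also left untouched.
    """
    parts = urlsplit(uri)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


a = "http://www.w3.org"
b = "http://WWW.W3.ORG"

# Opaque-string comparison says these are different identifiers.
print(same_by_string(a, b))
# After case normalization of the host they compare equal.
print(same_by_string(normalize_case(a), normalize_case(b)))
```

Under the opaque-string rule, the two spellings of the W3C address are simply two different identifiers, and the same goes for `~kifer` versus `%7Ekifer`; software that wants them to match has to climb the ladder itself.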
Received on Monday, 16 April 2007 19:10:33 UTC