- From: Sandro Hawke <sandro@w3.org>
- Date: Mon, 16 Apr 2007 13:12:57 -0400
- To: Jeremy Carroll <jjc@hpl.hp.com>
- Cc: Michael Kifer <kifer@cs.sunysb.edu>, Dave Reynolds <der@hplb.hpl.hp.com>, Christian de Sainte Marie <csma@ilog.fr>, RIF WG <public-rif-wg@w3.org>
> Yes: IRIs are a superset of URIs. ... > The set of letters used for URIs is a subset of that used for IRIs (and > a small subset!) Agreed. RFC 3987 states simply, "Every URI is by definition an IRI". It's a little confusing, though, because some URIs are the result of mapping a non-URI IRI into a URI, and some are not. Let me give an example. Here's an IRI: (a) http://www.w3.org/International/articles/idn-and-iri/JPǼƦ/°ú¤³ä¤êǼƦ.html If our mailers are all working, it should look like a URI which has some Kanji in it. It's from a test page if you want to check how it appears [1]. (I tested my mailer, and this should at least look correct in our web archives.) Now here is that IRI mapped into a URI, following the process defined by RFC 3987, section 3.1 ("Mapping of IRIs to URIs"): (b) http://www.w3.org/International/articles/idn-and-iri/JP%E7%B4%8D%E8%B1%86/%E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A%E7%B4%8D%E8%B1%86.html Both (a) and (b) are IRIs, but only (b) is a URI. Note that if you apply the mapping algorithm to (b), you get (b) again, and that there is an inverse mapping algorithm defined to get from (b) to (a). Here's a third URI: (c) http://www.w3.org This URI is, of course, also an IRI. But unlike (b), it wont be changed by applying the inverse mapping. We can think of (c) is a "natural URI" and (b) is a "carrier URI", a URI which exists only to carry an IRI. Users should only be presented with natural URIs and IRIs -- they should never be presented with carrier URIs. So, while carrier URIs are *technically* IRIs already, we talk about converting them into IRIs, which means converting them into their "natural" state. A "natural IRI" then is any IRI which is not a carrier. So, in this sense, lots of (carrier) URIs are not (natural) IRIs. Right? In common usage we don't think of (b) as an IRI; we specifically contrast it with IRIs. Hopefully my natural/carrier terminology makes this clear: - Technically, every URI is an IRI. - But only some URIs (the natural ones) are natural IRIs. All that said: Because RIF is not intended for human consumption, I think we *could* limit it to handling only URIs, knowing that translators will convert to/from IRIs as necessary. However, since RIF will be an XML format, I think it's reasonable to expect and allow for some human consumption. Since XML is already safe for IRIs, it's no additional work. I think RIF should just use IRIs. On the naming question -- do we call them IRIs or say "URI" even though we really mean IRI? -- I note that the SPARQL Last Call draft calls them IRIs [2], but SWEO (the Semantic Web Education and Outreach Interest Group) still seems to call them URIs. I've suggested to its chair that SWEO talk about it with the relevant WGs (including us) If they're willing to switch to IRI in their documents, that should clear the path for us. -- Sandro [1] http://www.w3.org/International/tests/sec-iri-3 [2] http://www.w3.org/TR/2007/WD-rdf-sparql-query-20070326/#QSynIRI
Received on Monday, 16 April 2007 17:13:26 UTC