- From: Michael Kifer <kifer@cs.sunysb.edu>
- Date: Mon, 16 Apr 2007 14:18:19 -0400
- To: Sandro Hawke <sandro@w3.org>
- Cc: Jeremy Carroll <jjc@hpl.hp.com>, Dave Reynolds <der@hplb.hpl.hp.com>, Christian de Sainte Marie <csma@ilog.fr>, RIF WG <public-rif-wg@w3.org>
The following might be a naive question due to my inadequate familiarity
with RFCs.
Are symbols like ~ allowed in IRIs? My understanding is that only
a-z, A-Z, 0-9, ., -, *, and _ are allowed as is and the rest are encoded.
So, since ~ is supposed to be encoded, something like
http://www.cs.sunysb.edu/~kifer/
is not a URI; it has to be encoded as
http://www.cs.sunysb.edu/%7Ekifer/
If the above is correct (i.e., if http://www.cs.sunysb.edu/~kifer/~kifer is
an IRI but not a URI), then I see a problem.
If I write http://www.cs.sunysb.edu/%7Ekifer/ then as a URI it really
represents http://www.cs.sunysb.edu/~kifer/, but as an IRI it is different from
http://www.cs.sunysb.edu/~kifer/.
Or, am I wrong and the %-encodings mean the same in IRIs as they do in URIs?
--michael
> > Yes: IRIs are a superset of URIs.
> ...
> > The set of letters used for URIs is a subset of that used for IRIs (and
> > a small subset!)
>
> Agreed. RFC 3987 states simply, "Every URI is by definition an IRI".
>
> It's a little confusing, though, because some URIs are the result of
> mapping a non-URI IRI into a URI, and some are not.
>
> Let me give an example. Here's an IRI:
>
> (a) http://www.w3.org/International/articles/idn-and-iri/JPǼƦ/°ú¤³ä¤êǼƦ.html
>
> If our mailers are all working, it should look like a URI which has some
> Kanji in it. It's from a test page if you want to check how it appears
> [1]. (I tested my mailer, and this should at least look correct in our
> web archives.)
>
> Now here is that IRI mapped into a URI, following the process defined
> by RFC 3987, section 3.1 ("Mapping of IRIs to URIs"):
>
> (b) http://www.w3.org/International/articles/idn-and-iri/JP%E7%B4%8D%E8%B1%86/%E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A%E7%B4%8D%E8%B1%86.html
>
> Both (a) and (b) are IRIs, but only (b) is a URI. Note that if you
> apply the mapping algorithm to (b), you get (b) again, and that there is
> an inverse mapping algorithm defined to get from (b) to (a).
>
> Here's a third URI:
>
> (c) http://www.w3.org
>
> This URI is, of course, also an IRI. But unlike (b), it won't be changed
> by applying the inverse mapping. We can think of (c) as a "natural URI"
> and (b) as a "carrier URI", a URI which exists only to carry an IRI.
> Users should only be presented with natural URIs and IRIs -- they should
> never be presented with carrier URIs. So, while carrier URIs are
> *technically* IRIs already, we talk about converting them into IRIs,
> which means converting them into their "natural" state. A "natural IRI"
> then is any IRI which is not a carrier.
>
> So, in this sense, lots of (carrier) URIs are not (natural) IRIs.
> Right? In common usage we don't think of (b) as an IRI; we specifically
> contrast it with IRIs. Hopefully my natural/carrier terminology makes
> this clear:
>
> - Technically, every URI is an IRI.
> - But only some URIs (the natural ones) are natural IRIs.
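
The natural/carrier distinction can be phrased as a test: a URI is a
carrier if the inverse mapping changes it. Below is a rough Python sketch
of that test. The names uri_to_iri and is_carrier are hypothetical, and the
inverse mapping is deliberately narrower than RFC 3987 section 3.2: it
only decodes runs of %XX escapes for bytes 0x80 and above that form valid
UTF-8, leaves every ASCII escape alone, and does not filter out characters
that are unsuitable for IRIs.

```python
import re

# Runs of escapes whose byte value is >= 0x80 (first hex digit 8-F).
_NON_ASCII_ESCAPES = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def _decode_run(match: re.Match) -> str:
    """Decode one run of escapes if it is valid UTF-8, else keep it."""
    run = match.group(0)
    try:
        return bytes.fromhex(run.replace("%", "")).decode("utf-8")
    except UnicodeDecodeError:
        return run  # not UTF-8: leave the escapes exactly as they were


def uri_to_iri(uri: str) -> str:
    """Simplified inverse mapping (an assumption, see the note above)."""
    return _NON_ASCII_ESCAPES.sub(_decode_run, uri)


def is_carrier(uri: str) -> bool:
    """A URI is a 'carrier' if the inverse mapping would change it."""
    return uri_to_iri(uri) != uri


uri_b = ("http://www.w3.org/International/articles/idn-and-iri/"
         "JP%E7%B4%8D%E8%B1%86/"
         "%E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A%E7%B4%8D%E8%B1%86.html")
uri_c = "http://www.w3.org"

print(is_carrier(uri_b))  # (b) exists only to carry (a)
print(is_carrier(uri_c))  # (c) is unchanged by the inverse mapping
print(uri_to_iri(uri_b))  # recovers the natural IRI (a)
```

Under these assumptions (b) is classified as a carrier and (c) as natural,
which matches the two bullet points above.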
>
> All that said:
>
> Because RIF is not intended for human consumption, I think we *could*
> limit it to handling only URIs, knowing that translators will convert
> to/from IRIs as necessary. However, since RIF will be an XML format, I
> think it's reasonable to expect and allow for some human consumption.
> Since XML is already safe for IRIs, it's no additional work. I think
> RIF should just use IRIs.
>
> On the naming question -- do we call them IRIs or say "URI" even though
> we really mean IRI? -- I note that the SPARQL Last Call draft calls them
> IRIs [2], but SWEO (the Semantic Web Education and Outreach Interest
> Group) still seems to call them URIs. I've suggested to its chair that
> SWEO talk about it with the relevant WGs (including us). If they're
> willing to switch to IRI in their documents, that should clear the path
> for us.
>
> -- Sandro
>
> [1] http://www.w3.org/International/tests/sec-iri-3
> [2] http://www.w3.org/TR/2007/WD-rdf-sparql-query-20070326/#QSynIRI
>
>
Received on Monday, 16 April 2007 18:30:55 UTC