Re: [Admin] Agenda for RIF telecon 17 April

Chris Welty wrote:

> We talked about the issue in the CG today.  The general sense was that 
> going with IRIs is the W3C way to go, however no one present would admit 
> to understanding what the impact is.
> 
> A few notes:
> 
> Dave Reynolds wrote:
>> Is there some specific objection to specifying IRIs?
> 
> Nothing technical at all.  The objection was not to IRIs per se, but to 
> making a decision without understanding the consequences.  I think the 
> only thing we're talking about wrt consequences here are related to 
> perception and adoption (of RIF).  If we insist on IRIs, and no one in 
> our perceived customer set actually uses them, we may put people off.
> 
>> My reasons in favour of them are:
>>
>> 1. They are a superset of URIs and specifying the superset seems like 
>> the safe default course. If someone especially wanted a dialect with 
>> syntactic restriction to URIs then they could add that restriction in 
>> the dialect.
> 
> IRIs are not a superset of URIs - this would imply that all URIs are 
> IRIs, which is simply not the case. 

Is it not?

Both URIs and IRIs are defined over characters, not octet sequences, and 
RFC3987 section 2.1 says:

[[[
   IRIs are defined similarly to URIs in [RFC3986], but the class of
   unreserved characters is extended by adding the characters of the UCS
   (Universal Character Set, [ISO10646]) beyond U+007F, subject to the
   limitations given in the syntax rules below and in section 6.1.

   Otherwise, the syntax and use of components and reserved characters
   is the same as that in [RFC3986].
]]]

The last sentence being the key one.

It is true that RFC3987 section 3.2 includes an algorithm for converting 
a URI to an "equivalent" IRI but all that does is replace the % 
encodings by the corresponding UCS characters where possible. This gives 
a way to invert an IRI->URI mapping (where possible). However, those 
original % encodings are still legal in IRIs. I had thought from looking 
at the specs that the unmapped URI was still a syntactically legal IRI. 
Is that not the case?

I would agree that one SHOULD use that mapping in order to interoperate 
reliably but that's a slightly different issue.
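
To make the two mappings concrete, here is a rough Python sketch of how I 
read them. It is a simplification, not a conforming implementation: the 
function names are my own, it ignores internationalised host names, and it 
skips the RFC3987 rules about characters that must stay percent-encoded 
(bidi controls and the like). It also treats each run of percent-encoded 
high octets as a unit, which is cruder than the RFC text.

```python
import re

# Runs of percent-encoded octets with the high bit set (0x80-0xFF).
# ASCII escapes such as %20 or %2F are deliberately left untouched.
_HIGH_OCTET_RUN = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def iri_to_uri(iri: str) -> str:
    """Simplified IRI -> URI mapping (in the spirit of RFC3987 section 3.1).

    Each non-ASCII character is replaced by the percent-encoding of its
    UTF-8 octets. ASCII characters, including existing % escapes, pass
    through unchanged. Host-name (IDN) handling is omitted.
    """
    return "".join(
        ch if ord(ch) < 0x80
        else "".join(f"%{octet:02X}" for octet in ch.encode("utf-8"))
        for ch in iri
    )


def uri_to_iri(uri: str) -> str:
    """Simplified URI -> IRI conversion (in the spirit of RFC3987 section 3.2).

    Percent-encoded runs that decode as valid UTF-8 become the
    corresponding characters. Anything else is left as it was.
    """
    def decode_run(match: re.Match) -> str:
        run = match.group(0)
        octets = bytes.fromhex(run.replace("%", ""))
        try:
            return octets.decode("utf-8")
        except UnicodeDecodeError:
            return run  # not valid UTF-8: keep the original escapes

    return _HIGH_OCTET_RUN.sub(decode_run, uri)
```

With this sketch, iri_to_uri("http://example.org/r\u00e9sum\u00e9") gives 
"http://example.org/r%C3%A9sum%C3%A9", and uri_to_iri takes that string 
back again. A plain ASCII URI such as "http://example.org/a%20b" passes 
through both functions unchanged, which is the behaviour I had in mind 
when asking whether the unmapped URI is itself a legal IRI.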

>> 2. For any translator working with an ascii only language they can use 
>> the algorithm in rfc3987 section 3.1 to map to a URI form internally.
> 
>> 3. This is more compatible with the existing semantic web stack. The 
>> RDF Spec used the term "RDF URI Reference" because it pre-dated 
>> RFC3987 but the definition is basically an IRI [*]. SPARQL specifies 
>> IRIs (and is where I lifted bits of the draft text from) and they have 
>> had no adverse feedback on this choice, it was not seen as in any way 
>> contentious.
> 
> On the contrary!  It is an IMMENSE source of confusion! 

You seem to be reacting to something other than what I said in that quote.

My sentence was specifically referring to the experience with the SPARQL 
spec, not in general. I asked Andy, the SPARQL editor, whether the 
choice of IRIs was contentious within the SPARQL working group or 
whether they had had any adverse comment on that choice in the public 
feedback on the working drafts or last call and he said not. That seemed 
a relevant data point so I was reporting that.

I accept your assertion that they are a source of confusion or concern 
in other quarters.

> As we are 
> seeing here.  Now that we have an IETF RFC for IRIs, I do think the 
> confusion will start to dwindle, but we are not there yet.
> 
>> 4. This is the I18N friendly choice. Apart from any general arguments 
>> as to why I18N might be relevant, the W3C process requires that our 
>> spec be reviewed by the I18N WG and I for one wouldn't want to try to 
>> explain to them why we deliberately avoided IRIs unless we had a good 
>> reason.
> 
> We have not proposed avoiding IRIs, we are considering whether the spec 
> should "require" them, in a sense, or make them an alternative. I see 
> the choices as having "URI or IRI" in the spec vs. just "IRI".

Having both really would be confusing IMHO.

Dave
-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Received on Saturday, 14 April 2007 17:32:43 UTC