Re: IRI text, addendum

Michael Kifer wrote:

>> Not quite. The "no implied equalities" applies to the normalized IRI 
>> string, the thing that ends up in the abstract syntax. We've just spent 
>> some time discussing the many normalization levels that the specs 
>> provide for and confirming that we should stick to the basic level (i.e. 
>> just Normal Form C). It seemed worth spelling that out.
> 
> We didn't talk about normalized IRIs, but IRIs as sequences of characters.

We talked explicitly about the normalization step which generates that 
sequence of characters to be compared from an (absolute or resolved) 
original IRI character sequence. In particular we mentioned and rejected 
the use of percent-encoding normalization (one of the syntactic 
normalization options).
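
To make the distinction concrete, here is a rough sketch in Python of the 
comparison I have in mind. It is illustrative only: the names iri_key and 
same_iri and the example.org IRIs are my own inventions, not anything from 
the RFCs or our drafts, and it assumes the input IRI is already absolute 
(or resolved). The only step applied before comparison is Normal Form C; 
there is no percent-decoding and no case folding.

```python
import unicodedata


def iri_key(iri: str) -> str:
    """Hypothetical helper: the character sequence used for comparison.

    Applies Unicode Normal Form C only. Percent-encoding normalization and
    case normalization are deliberately NOT applied.
    """
    return unicodedata.normalize("NFC", iri)


def same_iri(a: str, b: str) -> bool:
    """Two IRIs are equal only if their NFC forms are identical strings."""
    return iri_key(a) == iri_key(b)


# Precomposed vs. decomposed "e with acute": identical after NFC.
composed = "http://example.org/caf\u00e9"
decomposed = "http://example.org/cafe\u0301"
assert same_iri(composed, decomposed)

# Percent-encoded vs. literal unreserved character: distinct, because
# percent-encoding normalization is not part of the comparison.
assert not same_iri("http://example.org/%7Euser", "http://example.org/~user")

# Scheme case differences are likewise not folded.
assert not same_iri("HTTP://example.org/a", "http://example.org/a")
```

So "no implied equalities" holds over the output of iri_key, not over the 
raw input strings.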

>>> Why not simply say that two URIs are distinct if they are not identical?
>>> Instead, the paragraph invokes the normalization stuff, the unreserved
>>> characters crap, and 3 external references!
>> A week ago the proposed text said basically that for IRIs. There was 
>> some unhappiness about what the implication of selecting IRIs meant and 
>> how they interoperate with URIs. Those implications are totally bound up 
>> in the choice of normalization/mapping steps. We've just spent lots of 
>> emails pinning down the answer to that, in part by careful reference to 
>> the RFCs. Yes our end choice is the trivial one, but that wasn't obvious 
>> to everyone a week ago. It seemed to me useful to capture that in actual 
>> text in the spec so that implementers are as clear as we are now and not 
>> as potentially confused as we were a week ago.
> 
> This is precisely my point. The documents are written in a convoluted
> manner, which purports to be precise, but really isn't. It takes several
> people and a lengthy discussion to more or less agree and understand what
> the authors of the document might have meant.

Disagree. The issue was not lack of clarity in the RFC text but the fact 
that the spec provides *options*; much of our discussion was implicitly 
to do with justifying our choice amongst those options. In fact the text 
which describes those options and tells us which one is appropriate for 
our usage is very clear, once you look at the relevant section. 
Telling people we have picked the simple option and explicitly not 
picked the other alternatives, and which is the relevant section of the 
RFC to go look in to understand that, would help rather than hinder.

However, to repeat, I'll drop this for now and return to it once we've 
joined the dots to a concrete syntax.

Dave
-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Received on Thursday, 19 April 2007 10:27:47 UTC