Re: IRI text, addendum from Michael Kifer on 2007-04-19 (public-rif-wg@w3.org from April 2007)

From: Michael Kifer <kifer@cs.sunysb.edu>
Date: Thu, 19 Apr 2007 07:04:35 -0400
To: Dave Reynolds <der@hplb.hpl.hp.com>
Cc: RIF <public-rif-wg@w3.org>
Message-ID: <25108.1176980675@cs.sunysb.edu>

> Michael Kifer wrote:
> 
> >> Not quite. The "no implied equalities" applies to the normalized IRI 
> >> string, the thing that ends up in the abstract syntax. We've just spent 
> >> some time discussing the many normalization levels that the specs 
> >> provide for and confirming that we should stick to the basic level (i.e. 
> >> just Normal Form C). It seemed worth spelling that out.
> > 
> > We didn't talk about normalized IRIs, but IRIs as sequences of characters.
> 
> We talked explicitly about the normalization step which generates that 
> sequence of characters to be compared from an (absolute or resolved) 
> original IRI character sequence. In particular we mentioned and rejected 
> the use of percent-encoding normalization (one of the syntactic 
> normalization options).

Sorry for the lack of clarity here. By "we didn't talk about ..." I meant
"we" as in "we, in the RIF WD1."
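
To make the choice discussed above concrete, here is a minimal sketch. It
assumes that comparison means plain character-by-character equality after
Unicode Normal Form C, with no case, percent-encoding, or scheme-based
normalization. The names iri_key and same_iri and the example.org IRIs are
hypothetical and used for illustration only; none of this is text from the
RIF drafts or the RFCs.

```python
import unicodedata


def iri_key(iri: str) -> str:
    """Return the character sequence used for comparison.

    Assumption: only Unicode Normal Form C is applied. No case folding,
    no percent-encoding normalization, no scheme-specific rules.
    """
    return unicodedata.normalize("NFC", iri)


def same_iri(a: str, b: str) -> bool:
    """Two IRIs are treated as the same only if their NFC forms are identical."""
    return iri_key(a) == iri_key(b)


if __name__ == "__main__":
    # Precomposed vs. decomposed "e with acute": identical after NFC.
    print(same_iri("http://example.org/caf\u00e9", "http://example.org/cafe\u0301"))
    # Percent-encoded vs. literal tilde: distinct, since percent-encoding
    # normalization is deliberately not applied.
    print(same_iri("http://example.org/%7Euser", "http://example.org/~user"))
    # Host case differs: distinct under plain string comparison.
    print(same_iri("http://example.org/a", "http://EXAMPLE.org/a"))
```

Under these assumptions, "distinct if not identical" holds for everything
except differences that NFC itself removes.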

> >>> Why not simply say that two URIs are distinct if they are not identical?
> >>> Instead, the paragraph invokes the normalization stuff, the unreserved
> >>> characters crap, and 3 external references!
> >> A week ago the proposed text said basically that for IRIs. There was 
> >> some unhappiness about what the implication of selecting IRIs meant and 
> >> how they interoperate with URIs. Those implications are totally bound up 
> >> in the choice of normalization/mapping steps. We've just spent lots of 
> >> emails pinning down the answer to that, in part by careful reference to 
> >> the RFCs. Yes, our end choice is the trivial one, but that wasn't obvious 
> >> to everyone a week ago. It seemed to me useful to capture that in actual 
> >> text in the spec so that implementers are as clear as we are now and not 
> >> as potentially confused as we were a week ago.
> > 
> > This is precisely my point. The documents are written in a convoluted
> > manner, which purports to be precise, but really isn't. It takes several
> > people and a lengthy discussion to more or less agree and understand what
> > the authors of the document might have meant.
> 
> Disagree. The issue was not lack of clarity in the RFC text but the fact 
> that the spec provides *options*; much of our discussion was implicitly 
> to do with justifying our choice amongst those options. In fact the text 
> which describes those options and tells us which one is appropriate for 
> our usage is very clear, once you look at the relevant section. 
> Telling people we have picked the simple option and explicitly not 
> picked the other alternatives, and which is the relevant section of the 
> RFC to go look in to understand that, would help rather than hinder.

I think those documents could have been much simpler and clearer.
For instance, it wasn't clear to a number of us whether URIs are really a
subset of IRIs, nor how the equivalence question, which I brought up, is
settled. It is all somewhere in those docs, but the excessive verbiage
stands in the way.

I don't object to having pointers to the relevant sections (as you
suggested) somewhere in the endnotes to explain our choices. I do object to
having convoluted text inside the main body of the document, which would
send the reader on lengthy detours just to discover that we are talking
about something very trivial.

> However, to repeat, I'll drop this for now and return to it once we've 
> joined the dots to a concrete syntax.

ok.


	--michael
Received on Thursday, 19 April 2007 11:04:37 UTC