Re: draft-duerst-iri-07.txt: 2 week mailing list last call (IRIsyntax-28) from Martin Duerst on 2004-05-13 (public-iri@w3.org from May 2004)

From: Martin Duerst <duerst@w3.org>
Date: Thu, 13 May 2004 14:20:21 +0900
To: Graham Klyne <gk@ninebynine.org>, public-iri@w3.org
Cc: uri@w3.org
Message-Id: <4.2.0.58.J.20040513141806.05ed2200@localhost>
Hello Graham,

I have added the text that you propose below, and appended the
following:
The grammar is split into two parts, rules that differ from
[RFC2396bis] because of the above-mentioned expansion, and rules
that are the same as in [RFC2396bis]. For rules that are different
than in [RFC2396bis], the names of the non-terminals have been
changed as follows: If the non-terminal contains 'URI', this has
been changed to 'IRI'. Otherwise, an 'i' has been prefixed.

I have tentatively closed this comment.

Regards,    Martin.

At 10:37 04/05/12 +0100, Graham Klyne wrote:

>Martin,
>
>Thanks for your response.  Where you say:
>[[
>I think what you mean is to only change some terminal productions,
>but leave the rest unchanged. There are several reasons why this
>hasn't been done:
>]]
>This was roughly what I had in mind, but I can see why you prefer not to 
>do this.  This is primarily a matter of preference and perspective, and I 
>am content to accept your decision.  Your point about URI vs IRI is well-made.
>
>Simply adding a brief descriptive paragraph capturing the thrust of this:
>[[
>- In the most general sense, it would have been okay to just
>   add ucschar to unreserved. But there is the issue of allowing
>   private characters in query parts, but not elsewhere.
>]]
>in the introduction to the section about the grammar would have helped me 
>to see quickly where the changes were; e.g.
>[[
>The following grammar closely follows the URI grammar in [RFC2396bis], 
>except that the range of unreserved characters is expanded to include UCS 
>characters, with the restriction that private UCS characters can occur 
>only in query parts and not elsewhere.
>]]
>
>#g
>--
>
>At 10:02 12/05/04 +0900, Martin Duerst wrote:
>>Hello Graham,
>>
>>Many thanks for your comments. I'm responding in pieces,
>>as I split things up into issues. This is issue
>>http://www.w3.org/International/iri-edit/#IRIsyntax-28
>>
>>At 12:02 04/05/10 +0100, Graham Klyne wrote:
>>
>>>Martin,
>>>
>>>These comments are based on a quick skim rather than a detailed reading.
>>>
>>>Looking at this from an implementer's perspective, I feel it would be 
>>>helpful if the relationship between the IRI and URI *grammars* were more 
>>>clearly delineated;  e.g. a presentation of IRI syntax that is based on 
>>>the RFC2396bis grammar,
>>
>>It definitely is. I have very carefully followed all the changes
>>in the RFC2396bis grammar. I very much hope I got it right, but
>>any additional crosschecks would be greatly appreciated.
>>
>>
>>>replacing a minimum number of productions.
>>
>>I think that's also the case.
>>
>>I think what you mean is to only change some terminal productions,
>>but leave the rest unchanged. There are several reasons why this
>>hasn't been done:
>>
>>- In the most general sense, it would have been okay to just
>>   add ucschar to unreserved. But there is the issue of allowing
>>   private characters in query parts, but not elsewhere.
>>- Looking top down, what you seem to want may actually imply
>>   to use 'URI' rather than 'IRI' at the start of the grammar.
>>   I think that would have been more confusing than helpful.
>>   And once you start making these distinctions, it would
>>   then again be confusing to e.g. use 'path' rather than
>>   'ipath'.
>>- The current solution allows to very easily create a combined
>>   grammar of URIs and IRIs, which some people might want to do.
>>   If the two grammars would use the same symbols for things that
>>   are on  purpose the same, but in actual syntax are different,
>>   that wouldn't work.
>>- The grammar allows somebody who wants to only implement IRIs
>>   to do that directly. It may lead to less implementation
>>   divergence than a verbal description of differences against
>>   another spec.
>>
>>
>>>On this basis, it would be easier to see what needs to be changed in a 
>>>URI parser to yield an IRI parser.
>>
>>I hope it's already easy enough to see. I have used very
>>straightforward naming conventions to correlate the corresponding
>>nonterminals in both grammars. If there is 'URI' in the original
>>nonterminal, I have changed that to 'IRI'. Otherwise, I have
>>prefixed an 'i'. If you think this is helpful, I can add a
>>sentence to that effect.
>>
>>
>>>Also, I note that the RFC2396bis grammar has been through several 
>>>revisions as subtle issues are exposed by review and implementation 
>>>experience;  by replicating the entire grammar (rather than saying that 
>>>an IRI is like a URI with designated changes), can you be confident that 
>>>such issues have not been re-introduced?
>>
>>Of course, there is never any absolute confidence, but as far as
>>I remember, these issues were all related to details around
>>special cases such as e.g. empty paths. They are therefore
>>orthogonal to the issue of extending the repertoire of
>>'reserved' characters. One might be able to immagine some
>>interactions if e.g. authorities and paths were extended
>>in a different way, but the difference is for query parts,
>>which are very clearly delineated, and haven't been at
>>issue in the bugs you mention above.
>>
>>I hope you are satisfied with this answer. If not, I would
>>appreciate a more detailled proposal.
>>
>>
>>Regards,   Martin.
>
>------------
>Graham Klyne
>For email:
>http://www.ninebynine.org/#Contact
Received on Thursday, 13 May 2004 11:22:59 UTC