- From: Martin Duerst <duerst@w3.org>
- Date: Thu, 13 May 2004 14:20:21 +0900
- To: Graham Klyne <gk@ninebynine.org>, public-iri@w3.org
- Cc: uri@w3.org
Hello Graham, I have added the text that you propose below, and appended the following: The grammar is split into two parts, rules that differ from [RFC2396bis] because of the above-mentioned expansion, and rules that are the same as in [RFC2396bis]. For rules that are different than in [RFC2396bis], the names of the non-terminals have been changed as follows: If the non-terminal contains 'URI', this has been changed to 'IRI'. Otherwise, an 'i' has been prefixed. I have tentatively closed this comment. Regards, Martin. At 10:37 04/05/12 +0100, Graham Klyne wrote: >Martin, > >Thanks for your response. Where you say: >[[ >I think what you mean is to only change some terminal productions, >but leave the rest unchanged. There are several reasons why this >hasn't been done: >]] >This was roughly what I had in mind, but I can see why you prefer not to >do this. This is primarily a matter of preference and perspective, and I >am content to accept your decision. Your point about URI vs IRI is well-made. > >Simply adding a brief descriptive paragraph capturing the thrust of this: >[[ >- In the most general sense, it would have been okay to just > add ucschar to unreserved. But there is the issue of allowing > private characters in query parts, but not elsewhere. >]] >in the introduction to the section about the grammar would have helped me >to see quickly where the changes were; e.g. >[[ >The following grammar closely follows the URI grammar in [RFC2396bis], >except that the range of unreserved characters is expanded to include UCS >characters, with the restriction that private UCS characters can occur >only in query parts and not elsewhere. >]] > >#g >-- > >At 10:02 12/05/04 +0900, Martin Duerst wrote: >>Hello Graham, >> >>Many thanks for your comments. I'm responding in pieces, >>as I split things up into issues. This is issue >>http://www.w3.org/International/iri-edit/#IRIsyntax-28 >> >>At 12:02 04/05/10 +0100, Graham Klyne wrote: >> >>>Martin, >>> >>>These comments are based on a quick skim rather than a detailed reading. >>> >>>Looking at this from an implementer's perspective, I feel it would be >>>helpful if the relationship between the IRI and URI *grammars* were more >>>clearly delineated; e.g. a presentation of IRI syntax that is based on >>>the RFC2396bis grammar, >> >>It definitely is. I have very carefully followed all the changes >>in the RFC2396bis grammar. I very much hope I got it right, but >>any additional crosschecks would be greatly appreciated. >> >> >>>replacing a minimum number of productions. >> >>I think that's also the case. >> >>I think what you mean is to only change some terminal productions, >>but leave the rest unchanged. There are several reasons why this >>hasn't been done: >> >>- In the most general sense, it would have been okay to just >> add ucschar to unreserved. But there is the issue of allowing >> private characters in query parts, but not elsewhere. >>- Looking top down, what you seem to want may actually imply >> to use 'URI' rather than 'IRI' at the start of the grammar. >> I think that would have been more confusing than helpful. >> And once you start making these distinctions, it would >> then again be confusing to e.g. use 'path' rather than >> 'ipath'. >>- The current solution allows to very easily create a combined >> grammar of URIs and IRIs, which some people might want to do. >> If the two grammars would use the same symbols for things that >> are on purpose the same, but in actual syntax are different, >> that wouldn't work. >>- The grammar allows somebody who wants to only implement IRIs >> to do that directly. It may lead to less implementation >> divergence than a verbal description of differences against >> another spec. >> >> >>>On this basis, it would be easier to see what needs to be changed in a >>>URI parser to yield an IRI parser. >> >>I hope it's already easy enough to see. I have used very >>straightforward naming conventions to correlate the corresponding >>nonterminals in both grammars. If there is 'URI' in the original >>nonterminal, I have changed that to 'IRI'. Otherwise, I have >>prefixed an 'i'. If you think this is helpful, I can add a >>sentence to that effect. >> >> >>>Also, I note that the RFC2396bis grammar has been through several >>>revisions as subtle issues are exposed by review and implementation >>>experience; by replicating the entire grammar (rather than saying that >>>an IRI is like a URI with designated changes), can you be confident that >>>such issues have not been re-introduced? >> >>Of course, there is never any absolute confidence, but as far as >>I remember, these issues were all related to details around >>special cases such as e.g. empty paths. They are therefore >>orthogonal to the issue of extending the repertoire of >>'reserved' characters. One might be able to immagine some >>interactions if e.g. authorities and paths were extended >>in a different way, but the difference is for query parts, >>which are very clearly delineated, and haven't been at >>issue in the bugs you mention above. >> >>I hope you are satisfied with this answer. If not, I would >>appreciate a more detailled proposal. >> >> >>Regards, Martin. > >------------ >Graham Klyne >For email: >http://www.ninebynine.org/#Contact
Received on Thursday, 13 May 2004 11:22:59 UTC