- From: Martin Duerst <duerst@w3.org>
- Date: Wed, 26 May 2004 18:02:58 +0900
- To: public-iri@w3.org
Dear IRI specialists,

As part of the two-week mailing list last call, I have done one more
reading through the spec. I'm listing this as the single issue
editCleanup-35 and tentatively closing it. In my view, all of the
items below are editorial. In case you think that any of these items
need further discussion, please say so very soon.

As a result, I have made the following edits, which I think should
all be uncontroversial:

- Moved the stuff in the Editorial Note just after the Abstract to
  the end of 1.1 (in part) and to the Acknowledgement section
  (in part)

- Added some text at the end of 1.1 to provide a somewhat better
  overview of the document

- Followed I-D/RFC Editor guidelines for abbreviations (expansion
  first, abbreviation in (), on first occurrence)

- For point a) in applicability, changed:

     The protocol or format element used should be explicitly
     designated to carry IRIs. That is, the intent is not to
     introduce IRIs into contexts that are not defined to accept
     them. For example, XML schema [XMLSchema] has an explicit type
     "anyURI" that designates the use of IRIs.

  to:

     The protocol or format element where IRIs are used should be
     explicitly designated to be able to carry IRIs. That is, the
     intent is not to introduce IRIs into contexts that are not
     defined to accept them. For example, XML schema [XMLSchema] has
     an explicit type "anyURI" that includes IRIs and IRI references.
     Therefore, IRIs and IRI references can be in attributes and
     elements of type "anyURI". On the other hand, in the HTTP
     protocol [RFC2616], the Request URI is defined as an URI, which
     means that direct use of IRIs is not allowed in HTTP requests.

  I realized that this more explicit wording could have avoided some
  confusion in the discussion with Chris Haynes, and I hope it will
  reduce confusion for future readers.

- Created an IANA consideration section saying
  "This document has no actions for IANA."
  (as per http://www.ietf.org/ID-Checklist.html)

- Upper-cased one instance of 'internationalized resource identifier'
  for consistency.

- Changed Step 1) of Section 3.1 from:

     This step generates a UCS-based character encoding from the
     original IRI format.

  to:

     This step generates a UCS character sequence from the original
     IRI format.

  This is to align with Graham's comment on variant C) of that step
  at http://www.w3.org/International/iri-edit#3.1BC-norm-29

- In section 3.1, Step 2.2), changed

     Note: This is identical...

  to

     Note that this is identical...

  to avoid the impression that there might be some formatting
  problem.

- In "Infrastructure accepting IRIs MAY convert the ireg-name
  component of an IRI as follows (before Step 2.2 above) for schemes
  that are known to use domain names in ireg-name, but where the
  scheme definition does not allow percent-encoding for ireg-name:",
  changed 'Step 2.2' to 'Step 2', because Step 2.2 is about single
  characters, which obviously is wrong (I think this mistake was
  introduced when I changed the step labeling from a simple 2) to
  the clearer 2.2)).

- Same for "The uniform treatment of the whole IRI in Step 2.2 above
  is important to not make processing dependent on URI scheme."

- Fixed some (non-)escaping problems with two instances of Viet Nam.

- In section 3.2, changed from

     c) The conversion may result in a character that is not
        appropriate in an IRI. See Section 6.1 for further details.

  to:

     c) The conversion may result in a character that is not
        appropriate in an IRI. See Section 2.2, Section 4.1, and
        Section 6.1 for further details.

  Rationale: syntax restrictions and bidi restrictions of course
  apply. Also, changed:

     4) Re-percent-encode all octets produced in Step 3 that in
        UTF-8 represent characters that are not appropriate
        according to Section 4.1 and Section 6.1.

  to:

     4) Re-percent-encode all octets produced in Step 3 that in
        UTF-8 represent characters that are not appropriate
        according to Section 2.2, Section 4.1, and Section 6.1.

- Removed "The notation <hh> is used to denote octets outside those
  that can be represented in this document." because this is covered
  in Section 1.4 (Notation).

- In Section 4.1, changed from "higher-order protocol" to
  "higher-level protocol", because that's the term used in the
  Unicode Bidi algorithm as well as in some other instance in the
  draft.

- In section 5.2, changed

     "making sure that the case of the hexadecimal characters in the
     percent-encode is always the same"

  to

     "making sure that the case of the hexadecimal characters in the
     percent-encodeING is always the same"

  [uppercase only here]

- In Section 6.1, changed

     "This section discusses limitations on characters and character
     sequences usable for IRIs."

  to

     "This section discusses limitations on characters and character
     sequences usable for IRIs beyond those given in Section 2.2 and
     Section 4.1."

  to make sure the reader does not forget the more basic syntax and
  bidi limitations.

- At the end of the first paragraph of Section 6.4 (Use of UTF-8),
  added the sentence:

     For background information on encoding characters into URIs,
     see also Section 2.5 of [RFCYYYY].

  This section is a very helpful addition to RFC 2396bis.

- In section 7.2, changed from:

     For IRI input, the input method editor should be set so that it
     produces half-width Latin letters, and full-width Katakana.

  to:

     For IRI input, the input method editor should be set so that it
     produces half-width Latin letters AND PUNCTUATION, and
     full-width Katakana.

  [uppercase only here]
  This is rather important because all the reserved characters are
  punctuation characters.

- In Section 7.8, changed from:

     Display software should be upgraded only after upgraded entry
     software has been widely deployed to the population that will
     see the displayed result.

  to:

     Software converting from URIs to IRIs for display should be
     upgraded only after upgraded entry software has been widely
     deployed to the population that will see the displayed result.

  Rationale: The previous wording also applied to display of IRIs as
  such, where it would in many cases have needed a software downgrade
  rather than a software upgrade. This wording was put in here quite
  early on, where the implicit assumption seems to have made sense.

- In the security section, simplified the sentence:

     Protocols and servers that allow the creation of resources with
     unnormalized names, and resources with names that are not
     normalized, are particularly vulnerable to such attacks.

  to:

     Protocols and servers that allow the creation of resources with
     names that are not normalized are particularly vulnerable to
     such attacks.

  to avoid a duplication.

- Removed the URIs from references to RFCs. [wouldn't it be great if
  the IETF and the RFC editor would commit to more stable URIs so
  that we could make use of them, for the benefit of everybody?]

- Changed the Note to RFC Editor for [RFCYYYY] so that it appears in
  the .txt version.

- Updated several references. By upgrading the reference to XML from
  the second to the third edition, I was able to get rid of the
  Erratum pointer. Fixed the URI for XML Namespaces.

- Fixed a double mention of the same person in the Acknowledgements

Regards,    Martin.
Received on Wednesday, 26 May 2004 05:23:35 UTC