- From: Martin Duerst <duerst@w3.org>
- Date: Wed, 26 May 2004 18:02:58 +0900
- To: public-iri@w3.org
Dear IRI specialists,
As part of the two-week mailing list last call, I have done one more
reading through the spec. I'm listing this as the single issue
editCleanup-35 and tentatively closing it. In my view, all of the
items below are editorial. In case you think that any of these items
need further discussion, please say so very soon.
As a result, I have made the following edits, which I think should
all be uncontroversial:
- Moved the stuff in the Editorial Note just after the Abstract to
the end of 1.1 (in part) and to the Acknowledgement section (in part)
- Added some text at the end of 1.1 to provide a somewhat better
overview of the document
- Followed I-D/RFC Editor guidelines for abbreviations (expansion
first, abbreviation in (), on first occurrence)
- For point a) in applicability, changed:
The protocol or format element used should be explicitly designated
to carry IRIs. That is, the intent is not to introduce IRIs into
contexts that are not defined to accept them. For example, XML schema
[XMLSchema] has an explicit type "anyURI" that designates the use of IRIs.
to:
The protocol or format element where IRIs are used should be explicitly
designated to be able to carry IRIs. That is, the intent is not to
introduce IRIs into contexts that are not defined to accept them.
For example, XML schema [XMLSchema] has an explicit type "anyURI"
that includes IRIs and IRI references. Therefore, IRIs and IRI references
can be in attributes and elements of type "anyURI".
On the other hand, in the HTTP protocol [RFC2616], the Request URI is
defined as an URI, which means that direct use of IRIs is not allowed
in HTTP requests.
I realized that this more explicit wording could have avoided some
confusion in the discussion with Chris Haynes, and I hope it will
reduce confusion for future readers.
- Created an IANA consideration section saying
"This document has no actions for IANA."
(as per http://www.ietf.org/ID-Checklist.html)
- Upper-cased one instance of 'internationalized resource identifier'
for consistency.
- Changed Step 1) of Section 3.1 from:
This step generates a UCS-based character encoding from
the original IRI format.
to:
This step generates a UCS character sequence from
the original IRI format.
This is to align with Graham's comment on variant C) of that step
at http://www.w3.org/International/iri-edit#3.1BC-norm-29
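For illustration, the IRI-to-URI mapping that Step 1 belongs to can be sketched as follows. This is a minimal sketch and not the draft's normative algorithm. It assumes the input arrives as octets in a known legacy encoding, that Step 1 then yields an NFC-normalized UCS character sequence, and that Step 2 percent-encodes each non-ASCII character via UTF-8 (a simplification of the draft's 'ucschar'/'iprivate' classes). The function name iri_to_uri_sketch is hypothetical.

```python
import unicodedata


def iri_to_uri_sketch(raw: bytes, encoding: str) -> str:
    """Map an IRI given as octets in a known encoding to a URI string."""
    # Step 1 (assumed variant for legacy-encoded input): produce a UCS
    # character sequence, normalized to NFC.
    chars = unicodedata.normalize("NFC", raw.decode(encoding))

    # Step 2 (simplified assumption): replace every non-ASCII character by
    # the percent-encoded form of its UTF-8 octets; ASCII is left untouched.
    out = []
    for ch in chars:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    legacy = "http://example.org/r\u00e9sum\u00e9".encode("latin-1")
    print(iri_to_uri_sketch(legacy, "latin-1"))
```

The point of the wording change is visible in the first line of the function body: the result of Step 1 is a sequence of characters, not octets in some UCS-based encoding.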
- In section 3.1, Step 2.2), changed
Note: This is identical...
to
Note that this is identical...
to avoid the impression that there might be some formatting problem.
- In "Infrastructure accepting IRIs MAY convert the ireg-name
component of an IRI as follows (before Step 2.2 above) for schemes
that are known to use domain names in ireg-name, but where the
scheme definition does not allow percent-encoding for ireg-name:",
changed 'Step 2.2' to 'Step 2', because Step 2.2 is about single
characters, so referring to it here was obviously wrong (I think this
mistake was introduced when I changed the step labeling from a simple
2) to the clearer 2.2)).
- Same for "The uniform treatment of the whole IRI in Step 2.2 above
is important to not make processing dependent on URI scheme."
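As a rough illustration of the optional ireg-name conversion mentioned in the two items above: the sketch below assumes that the conversion applies the IDNA ToASCII operation to each dot-separated label, and it uses Python's built-in "idna" codec (IDNA 2003) as a stand-in. The codec does not expose the UseSTD3ASCIIRules and AllowUnassigned flags, so this is an approximation only, and the helper name ireg_name_to_ascii is hypothetical.

```python
def ireg_name_to_ascii(ireg_name: str) -> str:
    """Convert a domain-name ireg-name to its ASCII (Punycode) form.

    Assumption: the scheme is known to use domain names in ireg-name.
    This would run before the generic percent-encoding of Step 2, so the
    host never ends up percent-encoded.
    """
    # The "idna" codec handles each dot-separated label separately.
    return ireg_name.encode("idna").decode("ascii")


if __name__ == "__main__":
    print(ireg_name_to_ascii("b\u00fccher.example"))
```

When this conversion is not applied, the whole IRI goes through Step 2 uniformly, which is what keeps processing independent of the URI scheme.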
- Fixed some (non-)escaping problems with two instances of Viet Nam.
- In section 3.2, changed from
c) The conversion may result in a character that is not appropriate in an
IRI. See Section 6.1 for further details.
to:
c) The conversion may result in a character that is not appropriate in an
IRI. See Section 2.2, Section 4.1, and Section 6.1 for further details.
Rationale: syntax restrictions and bidi restrictions of course apply.
Also, changed:
4) Re-percent-encode all octets produced in Step 3 that in UTF-8
represent characters that are not appropriate according to
Section 4.1 and Section 6.1.
to:
4) Re-percent-encode all octets produced in Step 3 that in UTF-8
represent characters that are not appropriate according to
Section 2.2, Section 4.1, and Section 6.1.
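The re-percent-encoding idea in Step 4 can be sketched roughly as below. This is a simplified illustration under several assumptions: only the bidi formatting characters (taken here as what Section 4.1 excludes) are treated as "not appropriate", everything below U+00A0 is kept percent-encoded, and a run of percent-encoded octets that is not strictly legal UTF-8 is left untouched as a whole. The further restrictions of Section 2.2 and Section 6.1 are not modeled, and all names are hypothetical.

```python
import re

# Assumption: LRM, RLM, LRE, RLE, PDF, LRO, RLO are the bidi formatting
# characters that must not appear directly in an IRI.
BIDI_FORMATTING = {"\u200e", "\u200f", "\u202a", "\u202b",
                   "\u202c", "\u202d", "\u202e"}

# A maximal run of percent-encoded octets, e.g. "%C3%A9".
_PCT_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def _percent_encode(ch: str) -> str:
    return "".join(f"%{octet:02X}" for octet in ch.encode("utf-8"))


def uri_to_iri_sketch(uri: str) -> str:
    def convert(match: re.Match) -> str:
        octets = bytes.fromhex(match.group(0).replace("%", ""))
        try:
            text = octets.decode("utf-8")  # strict: rejects illegal UTF-8
        except UnicodeDecodeError:
            return match.group(0)  # leave the run percent-encoded
        # Keep a decoded character only if it is at or above U+00A0 and not
        # a bidi formatting character; otherwise re-percent-encode it
        # (this also rewrites its hex digits in uppercase).
        return "".join(
            ch if ord(ch) >= 0xA0 and ch not in BIDI_FORMATTING
            else _percent_encode(ch)
            for ch in text
        )

    return _PCT_RUN.sub(convert, uri)


if __name__ == "__main__":
    print(uri_to_iri_sketch("http://example.org/r%C3%A9sum%C3%A9"))
    print(uri_to_iri_sketch("http://example.org/a%E2%80%8Fb"))
```

The second call shows the case the rationale is about: the octets decode to U+200F (RIGHT-TO-LEFT MARK), so they stay percent-encoded.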
- Removed "The notation <hh> is used to denote octets outside those
that can be represented in this document." because this is covered
in Section 1.4 (Notation).
- In Section 4.1, changed from "higher-order protocol" to "higher-
level protocol", because that's the term used in the Unicode Bidi
algorithm as well as in some other instance in the draft.
- In section 5.2, changed "making sure that the case of the hexadecimal
characters in the percent-encode is always the same" to
"making sure that the case of the hexadecimal
characters in the percent-encodeING is always the same"
[uppercase only here]
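A minimal sketch of what "making sure that the case of the hexadecimal characters in the percent-encoding is always the same" could look like in practice. The choice of uppercase is an assumption made here for illustration (the quoted sentence only asks for consistency), and the function name normalize_pct_case is hypothetical.

```python
import re

# Matches one percent-encoded octet such as "%e3" or "%C3".
_PCT_OCTET = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_pct_case(uri: str) -> str:
    """Uppercase the hex digits of every percent-encoded octet."""
    return _PCT_OCTET.sub(lambda m: m.group(0).upper(), uri)


if __name__ == "__main__":
    print(normalize_pct_case("http://example.org/%e3%81%82?q=%C3%a9"))
```

Only the two hex digits after each "%" are touched; the rest of the URI, including its letter case, is left as it is.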
- In Section 6.1, changed "This section discusses limitations on characters
and character sequences usable for IRIs." to "This section discusses
limitations on characters and character sequences usable for IRIs
beyond those given in Section 2.2 and Section 4.1."
to make sure the reader does not forget the more basic syntax and bidi
limitations.
- At the end of the first paragraph of Section 6.4 (Use of UTF-8),
added the sentence:
For background information on encoding characters into URIs, see
also Section 2.5 of [RFCYYYY].
This section is a very helpful addition to RFC 2396bis.
- In section 7.2, changed from:
For IRI input, the input method editor should be set so that it produces
half-width Latin letters, and full-width Katakana.
to:
For IRI input, the input method editor should be set so that it produces
half-width Latin letters AND PUNCTUATION, and full-width Katakana.
[uppercase only here]
This is rather important because all the reserved characters are
punctuation characters.
- In Section 7.8, changed from:
Display software should be upgraded only after upgraded entry software
has been widely deployed to the population that will see the displayed
result.
to:
Software converting from URIs to IRIs for display should be upgraded
only after upgraded entry software has been widely deployed to the
population that will see the displayed result.
Rationale: The previous wording also applied to display of IRIs as such,
where it would in many cases have needed a software downgrade rather
than a software upgrade. This wording was put in here quite early
on, when the implicit assumption seems to have made sense.
- In the security section, simplified the sentence:
Protocols and servers that allow the creation of resources with
unnormalized names, and resources with names that are not normalized,
are particularly vulnerable to such attacks.
to:
Protocols and servers that allow the creation of resources with
names that are not normalized are particularly vulnerable to such
attacks.
to avoid a duplication.
- Removed the URIs from references to RFCs.
[wouldn't it be great if the IETF and the RFC editor would commit
to more stable URIs so that we could make use of them, for the
benefit of everybody?]
- Changed the Note to RFC Editor for [RFCYYYY] so that it appears
in the .txt version.
- Updated several references. By upgrading the reference to XML from
the second to the third edition, I was able to get rid of the
Erratum pointer. Fixed the URI for XML Namespaces.
- Fixed a double mention of the same person in the Acknowledgements.
Regards, Martin.
Received on Wednesday, 26 May 2004 05:23:35 UTC