- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Sun, 12 Dec 2004 14:53:49 -0800
- To: uri <uri@w3.org>
I prepared an nroff version of the URI specification for publication
as an RFC (once the RFC editor assigns a number) and in the process
made all of the AUTH48 changes that were requested.
Below is a diff of all the changes aside from pagination/index/toc
since draft 07. Let me know if you don't think this reflects the
group and IESG consensus.
Cheers,
Roy T. Fielding <http://roy.gbiv.com/>
Chief Scientist, Day Software <http://www.day.com/>
=====================
--- d07.txt Sun Dec 12 14:40:38 2004
+++ dRFC.txt Sun Dec 12 14:42:48 2004
@@ -12,11 +12,6 @@
not define a generative grammar for URIs; that task is performed by
the individual specifications of each URI scheme.
-Editorial Note
-
- Discussion of this draft and comments to the editors should be sent
- to the uri@w3.org mailing list. An issues list and version history
- is available at <http://gbiv.com/protocols/uri/rev-2002/issues.html>.
1. Introduction
@@ -127,7 +122,7 @@
that identify in relation to the end-user's local context should only
be used when the context itself is a defining aspect of the resource,
such as when an on-line help manual refers to a file on the
- end-user's filesystem (e.g., "file:///etc/hosts").
+ end-user's file system (e.g., "file:///etc/hosts").
1.1.1 Generic Syntax
@@ -464,10 +459,12 @@
characters in the reserved set allowed within that component.
URI producing applications should percent-encode data octets that
- correspond to characters in the reserved set. However, if a reserved
- character is found in a URI component and no delimiting role is known
- for that character, then it should be interpreted as representing the
- data octet corresponding to that character's encoding in US-ASCII.
+ correspond to characters in the reserved set unless said characters
+ are specifically allowed by the URI scheme to represent data in that
+ component. If a reserved character is found in a URI component and
+ no delimiting role is known for that character, then it must be
+ interpreted as representing the data octet corresponding to that
+ character's encoding in US-ASCII.
2.3 Unreserved Characters
@@ -555,9 +552,9 @@
than simply percent-encoding the original octets.
For example, consider an information service that provides data,
- stored locally using an EBCDIC-based filesystem, to clients on the
+ stored locally using an EBCDIC-based file system, to clients on the
Internet through an HTTP server. When an author creates a file on
- that filesystem with the name "Laguna Beach", their expectation is
+ that file system with the name "Laguna Beach", their expectation is
that the "http" URI corresponding to that resource would also contain
the meaningful string "Laguna%20Beach". If, however, that server
produces URIs using an overly-simplistic raw octet mapping, then the
@@ -578,7 +575,7 @@
the component and producing the URI.
When a new URI scheme defines a component that represents textual
- data consisting of characters from the Unicode character set [UCS],
+ data consisting of characters from the Universal Character Set [UCS],
the data should be encoded first as octets according to the UTF-8
character encoding [STD63], and then only those octets that do not
correspond to characters in the unreserved set should be
@@ -750,26 +747,29 @@
The version flag does not indicate the IP version; rather, it
indicates future versions of the literal format. As such,
- implementations must not provide the version flag for existing IPv4
- and IPv6 literal addresses. If a URI containing an IP-literal that
- starts with "v" (case-insensitive), indicating that the version flag
- is present, is dereferenced by an application that does not know the
- meaning of that version flag, then the application should return an
- appropriate error for "address mechanism not supported".
+ implementations must not provide the version flag for the existing
+ IPv4 and IPv6 literal address forms described below. If a URI
+ containing an IP-literal that starts with "v" (case-insensitive),
+ indicating that the version flag is present, is dereferenced by an
+ application that does not know the meaning of that version flag, then
+ the application should return an appropriate error for "address
+ mechanism not supported".
A host identified by an IPv6 literal address is represented inside
the square brackets without a preceding version flag. The ABNF
provided here is a translation of the text definition of an IPv6
- literal address provided in [RFC3513]. A 128-bit IPv6 address is
- divided into eight 16-bit pieces. Each piece is represented
- numerically in case-insensitive hexadecimal, using one to four
- hexadecimal digits (leading zeroes are permitted). The eight encoded
- pieces are given most-significant first, separated by colon
- characters. Optionally, the least-significant two pieces may instead
- be represented in IPv4 address textual format. A sequence of one or
- more consecutive zero-valued 16-bit pieces within the address may be
- elided, omitting all their digits and leaving exactly two consecutive
- colons in their place to mark the elision.
+ literal address provided in [RFC3513]. This syntax does not support
+ IPv6 scoped addressing zone identifiers.
+
+ A 128-bit IPv6 address is divided into eight 16-bit pieces. Each
+ piece is represented numerically in case-insensitive hexadecimal,
+ using one to four hexadecimal digits (leading zeroes are permitted).
+ The eight encoded pieces are given most-significant first, separated
+ by colon characters. Optionally, the least-significant two pieces
+ may instead be represented in IPv4 address textual format. A
+ sequence of one or more consecutive zero-valued 16-bit pieces within
+ the address may be elided, omitting all their digits and leaving
+ exactly two consecutive colons in their place to mark the elision.
IPv6address = 6( h16 ":" ) ls32
/ "::" 5( h16 ":" ) ls32
@@ -1576,25 +1576,26 @@
spiders and indexing engines to prune a search space or reduce
duplication of request actions and response storage.
- URI comparison is performed in respect to some particular purpose,
- and implementations with differing purposes will often be subject to
- differing design trade-offs in regards to how much effort should be
- spent in reducing aliased identifiers. This section describes a
- variety of methods that may be used to compare URIs, the trade-offs
- between them, and the types of applications that might use them.
+ URI comparison is performed in respect to some particular purpose.
+ Protocols or implementations that compare URIs for different purposes
+ will often be subject to differing design trade-offs in regards to
+ how much effort should be spent in reducing aliased identifiers.
+ This section describes a variety of methods that may be used to
+ compare URIs, the trade-offs between them, and the types of
+ applications that might use them.
6.1 Equivalence
Since URIs exist to identify resources, presumably they should be
considered equivalent when they identify the same resource. However,
such a definition of equivalence is not of much practical use, since
- there is no way for an implementation to compare two resources that
- are not under its own control. For this reason, determination of
- equivalence or difference of URIs is based on string comparison,
- perhaps augmented by reference to additional rules provided by URI
- scheme definitions. We use the terms "different" and "equivalent" to
- describe the possible outcomes of such comparisons, but there are
- many application-dependent versions of equivalence.
+ there is no way for an implementation to compare two resources unless
+ it has full knowledge or control of them. For this reason,
+ determination of equivalence or difference of URIs is based on string
+ comparison, perhaps augmented by reference to additional rules
+ provided by URI scheme definitions. We use the terms "different" and
+ "equivalent" to describe the possible outcomes of such comparisons,
+ but there are many application-dependent versions of equivalence.
Even though it is possible to determine that two URIs are equivalent,
URI comparison is not sufficient to determine if two URIs identify
@@ -1872,8 +1873,8 @@
application is not expecting to receive raw data within a component.
Special care should be taken when the URI path interpretation process
- involves the use of a back-end filesystem or related system
- functions. Filesystems typically assign an operational meaning to
+ involves the use of a back-end file system or related system
+ functions. File systems typically assign an operational meaning to
special characters, such as the "/", "\", ":", "[", and "]"
characters, and special device names like ".", "..", "...", "aux",
"lpt", etc. In some cases, merely testing for the existence of such
@@ -1986,8 +1987,8 @@
[RFC2234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", RFC 2234, November 1997.
- [STD63] Yergeau, F., "UTF-8, a transformation format of ISO
- 10646", STD 63, RFC 3629, November 2003.
+ [STD63] Yergeau, F., "UTF-8, a transformation format of
+ ISO 10646", STD 63, RFC 3629, November 2003.
[UCS] International Organization for Standardization,
"Information Technology - Universal Multiple-Octet Coded
@@ -2011,8 +2012,8 @@
and Support", STD 3, RFC 1123, October 1989.
[RFC1535] Gavron, E., "A Security Problem and Proposed Correction
- With Widely Deployed DNS Software", RFC 1535, October
- 1993.
+ With Widely Deployed DNS Software", RFC 1535,
+ October 1993.
[RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A
Unifying Syntax for the Expression of Names and Addresses
@@ -2028,8 +2029,8 @@
[RFC1738] Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform
Resource Locators (URL)", RFC 1738, December 1994.
- [RFC1808] Fielding, R., "Relative Uniform Resource Locators", RFC
- 1808, June 1995.
+ [RFC1808] Fielding, R., "Relative Uniform Resource Locators",
+ RFC 1808, June 1995.
[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046,
@@ -2055,8 +2056,8 @@
[RFC2732] Hinden, R., Carpenter, B. and L. Masinter, "Format for
Literal IPv6 Addresses in URL's", RFC 2732, December
1999.
- [RFC3305] Mealling, M. and R. Denenberg, "Report from the Joint W3C/
- IETF URI Planning Interest Group: Uniform Resource
+ [RFC3305] Mealling, M. and R. Denenberg, "Report from the Joint
+ W3C/IETF URI Planning Interest Group: Uniform Resource
Identifiers (URIs), URLs, and Uniform Resource Names
(URNs): Clarifications and Recommendations", RFC 3305,
August 2002.
@@ -2457,12 +2458,3 @@
normalization of URIs in practice. This change only impacts the
parsing of abnormal references and same-scheme references wherein
the base URI has a non-hierarchical path.
-
-Appendix E. Instructions to RFC Editor
-
- Prior to publication as an RFC, please remove this section and the
- "Editorial Note" that appears after the Abstract. If [BCP35] or any
- of the normative references are updated prior to publication, the
- associated reference in this document can be safely updated as well.
- This document has been produced using the xml2rfc tool set; the XML
- version can be obtained via the URI listed in the editorial note.
=====================
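As an illustration only (not part of the specification or of the diff
above), the encoding rule touched by the hunk at line 578 can be
sketched in a few lines of Python: text is first converted to UTF-8
octets, and then every octet that does not correspond to an unreserved
character is percent-encoded. The function name
percent_encode_component and the constant UNRESERVED are hypothetical
names chosen for this sketch. It assumes the scheme allows no reserved
characters as data in the component, which is the conservative case.

```python
import string

# Octets of the unreserved set: ALPHA / DIGIT / "-" / "." / "_" / "~".
UNRESERVED = frozenset(
    (string.ascii_letters + string.digits + "-._~").encode("ascii")
)


def percent_encode_component(text: str) -> str:
    """Encode text as UTF-8, then percent-encode non-unreserved octets.

    Hypothetical helper for illustration; it treats every reserved
    character as data, so delimiters such as "/" are encoded too.
    """
    pieces = []
    for octet in text.encode("utf-8"):
        if octet in UNRESERVED:
            pieces.append(chr(octet))
        else:
            # Uppercase hex digits for the percent-encoding triplet.
            pieces.append(f"%{octet:02X}")
    return "".join(pieces)


if __name__ == "__main__":
    print(percent_encode_component("Laguna Beach"))
```

With the file name from the example in the diff, "Laguna Beach", this
yields "Laguna%20Beach". A scheme that permits certain reserved
characters as data in a component would add those characters to the
allowed set rather than encode them.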
Received on Sunday, 12 December 2004 22:54:24 UTC