- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Sun, 12 Dec 2004 14:53:49 -0800
- To: uri <uri@w3.org>

I prepared an nroff version of the URI specification for publication
as an RFC (once the RFC editor assigns a number) and in the process
made all of the AUTH48 changes that were requested.  Below is a diff
of all the changes aside from pagination/index/toc since draft 07.
Let me know if you don't think this reflects the group and IESG
consensus.

Cheers,

Roy T. Fielding                            <http://roy.gbiv.com/>
Chief Scientist, Day Software              <http://www.day.com/>

=====================
--- d07.txt  Sun Dec 12 14:40:38 2004
+++ dRFC.txt  Sun Dec 12 14:42:48 2004
@@ -12,11 +12,6 @@
    not define a generative grammar for URIs; that task is performed by
    the individual specifications of each URI scheme.
 
-Editorial Note
-
-   Discussion of this draft and comments to the editors should be sent
-   to the uri@w3.org mailing list.  An issues list and version history
-   is available at <http://gbiv.com/protocols/uri/rev-2002/issues.html>.
 
 1.  Introduction
 
@@ -127,7 +122,7 @@
    that identify in relation to the end-user's local context should only
    be used when the context itself is a defining aspect of the resource,
    such as when an on-line help manual refers to a file on the
-   end-user's filesystem (e.g., "file:///etc/hosts").
+   end-user's file system (e.g., "file:///etc/hosts").
 
 1.1.1  Generic Syntax
 
@@ -464,10 +459,12 @@
    characters in the reserved set allowed within that component.
 
    URI producing applications should percent-encode data octets that
-   correspond to characters in the reserved set.  However, if a reserved
-   character is found in a URI component and no delimiting role is known
-   for that character, then it should be interpreted as representing the
-   data octet corresponding to that character's encoding in US-ASCII.
+   correspond to characters in the reserved set unless said characters
+   are specifically allowed by the URI scheme to represent data in that
+   component.  If a reserved character is found in a URI component and
+   no delimiting role is known for that character, then it must be
+   interpreted as representing the data octet corresponding to that
+   character's encoding in US-ASCII.
 
 2.3  Unreserved Characters
 
@@ -555,9 +552,9 @@
    than simply percent-encoding the original octets.
 
    For example, consider an information service that provides data,
-   stored locally using an EBCDIC-based filesystem, to clients on the
+   stored locally using an EBCDIC-based file system, to clients on the
    Internet through an HTTP server.  When an author creates a file on
-   that filesystem with the name "Laguna Beach", their expectation is
+   that file system with the name "Laguna Beach", their expectation is
    that the "http" URI corresponding to that resource would also contain
    the meaningful string "Laguna%20Beach".  If, however, that server
    produces URIs using an overly-simplistic raw octet mapping, then the
@@ -578,7 +575,7 @@
    the component and producing the URI.
 
    When a new URI scheme defines a component that represents textual
-   data consisting of characters from the Unicode character set [UCS],
+   data consisting of characters from the Universal Character Set [UCS],
    the data should be encoded first as octets according to the UTF-8
    character encoding [STD63], and then only those octets that do not
    correspond to characters in the unreserved set should be
@@ -750,26 +747,29 @@
 
    The version flag does not indicate the IP version; rather, it
    indicates future versions of the literal format.  As such,
-   implementations must not provide the version flag for existing IPv4
-   and IPv6 literal addresses.  If a URI containing an IP-literal that
-   starts with "v" (case-insensitive), indicating that the version flag
-   is present, is dereferenced by an application that does not know the
-   meaning of that version flag, then the application should return an
-   appropriate error for "address mechanism not supported".
+   implementations must not provide the version flag for the existing
+   IPv4 and IPv6 literal address forms described below.  If a URI
+   containing an IP-literal that starts with "v" (case-insensitive),
+   indicating that the version flag is present, is dereferenced by an
+   application that does not know the meaning of that version flag, then
+   the application should return an appropriate error for "address
+   mechanism not supported".
 
    A host identified by an IPv6 literal address is represented inside
    the square brackets without a preceding version flag.  The ABNF
    provided here is a translation of the text definition of an IPv6
-   literal address provided in [RFC3513].  A 128-bit IPv6 address is
-   divided into eight 16-bit pieces.  Each piece is represented
-   numerically in case-insensitive hexadecimal, using one to four
-   hexadecimal digits (leading zeroes are permitted).  The eight encoded
-   pieces are given most-significant first, separated by colon
-   characters.  Optionally, the least-significant two pieces may instead
-   be represented in IPv4 address textual format.  A sequence of one or
-   more consecutive zero-valued 16-bit pieces within the address may be
-   elided, omitting all their digits and leaving exactly two consecutive
-   colons in their place to mark the elision.
+   literal address provided in [RFC3513].  This syntax does not support
+   IPv6 scoped addressing zone identifiers.
+
+   A 128-bit IPv6 address is divided into eight 16-bit pieces.  Each
+   piece is represented numerically in case-insensitive hexadecimal,
+   using one to four hexadecimal digits (leading zeroes are permitted).
+   The eight encoded pieces are given most-significant first, separated
+   by colon characters.  Optionally, the least-significant two pieces
+   may instead be represented in IPv4 address textual format.  A
+   sequence of one or more consecutive zero-valued 16-bit pieces within
+   the address may be elided, omitting all their digits and leaving
+   exactly two consecutive colons in their place to mark the elision.
 
       IPv6address =                            6( h16 ":" ) ls32
                   /                       "::" 5( h16 ":" ) ls32
@@ -1576,25 +1576,26 @@
    spiders and indexing engines to prune a search space or reduce
    duplication of request actions and response storage.
 
-   URI comparison is performed in respect to some particular purpose,
-   and implementations with differing purposes will often be subject to
-   differing design trade-offs in regards to how much effort should be
-   spent in reducing aliased identifiers.  This section describes a
-   variety of methods that may be used to compare URIs, the trade-offs
-   between them, and the types of applications that might use them.
+   URI comparison is performed in respect to some particular purpose.
+   Protocols or implementations that compare URIs for different purposes
+   will often be subject to differing design trade-offs in regards to
+   how much effort should be spent in reducing aliased identifiers.
+   This section describes a variety of methods that may be used to
+   compare URIs, the trade-offs between them, and the types of
+   applications that might use them.
 
 6.1  Equivalence
 
    Since URIs exist to identify resources, presumably they should be
    considered equivalent when they identify the same resource.  However,
    such a definition of equivalence is not of much practical use, since
-   there is no way for an implementation to compare two resources that
-   are not under its own control.  For this reason, determination of
-   equivalence or difference of URIs is based on string comparison,
-   perhaps augmented by reference to additional rules provided by URI
-   scheme definitions.  We use the terms "different" and "equivalent" to
-   describe the possible outcomes of such comparisons, but there are
-   many application-dependent versions of equivalence.
+   there is no way for an implementation to compare two resources unless
+   it has full knowledge or control of them.  For this reason,
+   determination of equivalence or difference of URIs is based on string
+   comparison, perhaps augmented by reference to additional rules
+   provided by URI scheme definitions.  We use the terms "different" and
+   "equivalent" to describe the possible outcomes of such comparisons,
+   but there are many application-dependent versions of equivalence.
 
    Even though it is possible to determine that two URIs are equivalent,
    URI comparison is not sufficient to determine if two URIs identify
@@ -1872,8 +1873,8 @@
    application is not expecting to receive raw data within a component.
 
    Special care should be taken when the URI path interpretation process
-   involves the use of a back-end filesystem or related system
-   functions.  Filesystems typically assign an operational meaning to
+   involves the use of a back-end file system or related system
+   functions.  File systems typically assign an operational meaning to
    special characters, such as the "/", "\", ":", "[", and "]"
    characters, and special device names like ".", "..", "...", "aux",
    "lpt", etc.  In some cases, merely testing for the existence of such
@@ -1986,8 +1987,8 @@
    [RFC2234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
               Specifications: ABNF", RFC 2234, November 1997.
 
-   [STD63]    Yergeau, F., "UTF-8, a transformation format of ISO
-              10646", STD 63, RFC 3629, November 2003.
+   [STD63]    Yergeau, F., "UTF-8, a transformation format of
+              ISO 10646", STD 63, RFC 3629, November 2003.
 
    [UCS]      International Organization for Standardization,
               "Information Technology - Universal Multiple-Octet Coded
@@ -2011,8 +2012,8 @@
               and Support", STD 3, RFC 1123, October 1989.
 
    [RFC1535]  Gavron, E., "A Security Problem and Proposed Correction
-              With Widely Deployed DNS Software", RFC 1535, October
-              1993.
+              With Widely Deployed DNS Software", RFC 1535,
+              October 1993.
 
    [RFC1630]  Berners-Lee, T., "Universal Resource Identifiers in WWW: A
               Unifying Syntax for the Expression of Names and Addresses
@@ -2028,8 +2029,8 @@
    [RFC1738]  Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform
               Resource Locators (URL)", RFC 1738, December 1994.
 
-   [RFC1808]  Fielding, R., "Relative Uniform Resource Locators", RFC
-              1808, June 1995.
+   [RFC1808]  Fielding, R., "Relative Uniform Resource Locators",
+              RFC 1808, June 1995.
 
    [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
               Extensions (MIME) Part Two: Media Types", RFC 2046,
@@ -2055,8 +2056,8 @@
    [RFC2732]  Hinden, R., Carpenter, B. and L. Masinter, "Format for
               Literal IPv6 Addresses in URL's", RFC 2732, December 1999.
 
-   [RFC3305]  Mealling, M. and R. Denenberg, "Report from the Joint W3C/
-              IETF URI Planning Interest Group: Uniform Resource
+   [RFC3305]  Mealling, M. and R. Denenberg, "Report from the Joint
+              W3C/IETF URI Planning Interest Group: Uniform Resource
               Identifiers (URIs), URLs, and Uniform Resource Names
               (URNs): Clarifications and Recommendations", RFC 3305,
               August 2002.
@@ -2457,12 +2458,3 @@
    normalization of URIs in practice.  This change only impacts the
    parsing of abnormal references and same-scheme references wherein the
    base URI has a non-hierarchical path.
-
-Appendix E.  Instructions to RFC Editor
-
-   Prior to publication as an RFC, please remove this section and the
-   "Editorial Note" that appears after the Abstract.  If [BCP35] or any
-   of the normative references are updated prior to publication, the
-   associated reference in this document can be safely updated as well.
-   This document has been produced using the xml2rfc tool set; the XML
-   version can be obtained via the URI listed in the editorial note.
=====================
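
For readers who want to see the rule in the "Universal Character Set" hunk in action (encode textual data as UTF-8 octets first, then percent-encode only the octets that do not correspond to unreserved characters), here is a minimal Python sketch. It is illustrative only and not part of the specification or of this diff. The helper name percent_encode_text is hypothetical, and the unreserved set shown (letters, digits, "-", ".", "_", "~") is my reading of the draft's "Unreserved Characters" section, which the diff above does not quote.

```python
# Illustrative sketch only. percent_encode_text is a hypothetical helper,
# not an API defined by the specification or by any library.

# Assumption: the unreserved set is ALPHA / DIGIT / "-" / "." / "_" / "~".
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def percent_encode_text(text: str) -> str:
    """Encode text as UTF-8 octets, then percent-encode every octet that
    does not correspond to a character in the unreserved set."""
    pieces = []
    for octet in text.encode("utf-8"):
        ch = chr(octet)
        if ch in UNRESERVED:
            pieces.append(ch)                # unreserved: emit as-is
        else:
            pieces.append(f"%{octet:02X}")   # everything else: %HH, uppercase hex
    return "".join(pieces)


if __name__ == "__main__":
    # The file name from the EBCDIC example in the diff above.
    print(percent_encode_text("Laguna Beach"))  # Laguna%20Beach
```

Note that this sketch encodes reserved characters unconditionally. A real producer would, per the revised text in the @@ -464 hunk, leave a reserved character unencoded where the URI scheme specifically allows it to represent data in that component.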
Received on Sunday, 12 December 2004 22:54:24 UTC