- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Tue, 4 Jan 2005 17:10:22 -0800
- To: uri <uri@w3.org>
Here are my comments on draft-hansen-2717bis-2718bis-uri-guidelines-02:

The abstract should simply state what the document is and should
spell out the acronym, as in

   This document provides guidelines, recommendations, and a mechanism
   for the definition and registration of Uniform Resource Identifier
   (URI) schemes.

and the second sentence ["The registration requirements have been
simplified by providing for provisional registrations that need no
technical review and may share names with existing scheme names."]
should be deleted because it makes no sense outside current discussion.

1. Introduction

   A Uniform Resource Identifier (URI) is a compact string
   representation for identifying resources. RFC XXXX [6] defines the
   general syntax of URIs.

Whoa, we don't want to spend another three years arguing over the
precise wording of the definition of URIs, do we? So, don't try to
rephrase the definition (incorrectly) here. Start with

   The Uniform Resource Identifier (URI) protocol element and generic
   syntax is defined by RFC XXXX [6]. Each URI begins with a scheme
   name, as defined in Section 3.1 of RFC XXXX, that refers to a
   specification for assigning identifiers within that scheme. As
   such, the URI syntax is a federated and extensible naming system
   wherein each scheme's specification may further restrict the syntax
   and semantics of identifiers using that scheme. This document
   provides guidelines for the definition of new URI schemes, for
   consideration by those who are defining, registering, or evaluating
   those definitions, as well as a process and mechanism for
   registering URI schemes within the IANA URI scheme registry [ref].
   This document obsoletes both RFCs 2717 [2] and 2718 [3].

======

   The original terminology for the URI protocol element attempted to
   ...

Er, that terminology was added two years after the original URI.
Try

   RFCs 2717 and 2718 draw a distinction between 'locators' --
   identifiers used for accessing resources available on the Internet,
   and 'names' -- identifiers used for naming possibly abstract
   resources, independent of any mechanism for accessing them. The
   intent was to use the designation "URL" (Uniform Resource Locator)
   for those identifiers that were locators, and "URN" (Uniform
   Resource Name) for those identifiers that were names. In practice,
   the line between 'locator' and 'name' has been difficult to draw:
   locators can be used as names, and names can be used as locators.

====

   As a result, recent documents have used the term "URI" for all
   resource identifiers, avoiding the term "URL", and reserving the
   term "URN" explicitly for those URIs using the "urn" scheme name
   (RFC 2141 [1]). URNs remain a distinct class of URIs because of the
   requirements set out in RFC 3406 [8]; this document's procedures do
   not update or supersede the procedures set out in RFC 3406.

Stick to the facts, please -- the only thing distinct about URNs is
the scheme name and that scheme will be in this registry. How about

   As a result, recent documents have used the term "URI" for all
   resource identifiers, avoiding the term "URL", and reserving the
   term "URN" explicitly for those URIs using the "urn" scheme name
   (RFC 2141 [1]). URN "namespaces" (RFC 3406 [8]) are specific to the
   "urn" scheme and outside the scope of this document.

====

   RFC 2717 defined a set of registration trees in which URI schemes
   could be registered, one of which was called the IETF Tree, to be
   managed by IANA. RFC 2717 proposed that additional registration
   trees might be approved by the IESG, however, no such registration
   trees have been approved.

should be "... the IESG. However, ...".

   This document eliminates RFC 2717's distinction between different
   'trees' for URI schemes; instead there is a single namespace for
   registered values.
   Within that namespace, there are values that are approved as
   meeting a set of criteria for URI schemes. Other scheme names may
   also be registered provisionally or without necessarily passing any
   review process or criteria.

should be "... provisionally, without necessarily ..."

====

2.2 Syntactic compatibility

   RFC XXXX [6] defines a generic syntax for URI schemes with
   hierarchical components and a naming authority. New URI schemes
   should follow this syntax.

Not strong enough. It should say

   RFC XXXX [6] defines the generic syntax for all URI schemes, along
   with the syntax of common URI components that are used by many URI
   schemes to define hierarchical identifiers. All URI scheme
   specifications must define their own syntax such that all strings
   matching their scheme-specific syntax will also match the
   <absolute-URI> grammar described in Section 4.3 of RFC XXXX. New
   URI schemes should reuse the common URI components of RFC XXXX for
   the definition of hierarchical naming schemes.

   However, if there is a strong reason for a URI scheme to not use
   the hierarchical syntax, then the new scheme definition should at
   least follow the syntax of previously registered schemes, if
   possible.

   URI schemes that are not intended for use with relative URIs should
   avoid use of the forward slash "/" character, which is used for
   hierarchical delimiters, and the complete path segments "." and
   ".." (dot-segments).

   Avoid improper use of "//". The use of double slashes in the first
   part of a URI is not an artistic indicator that what follows is a
   URI: Double slashes are used ONLY when the syntax of the URI's
   <scheme-specific-part> contains a hierarchical structure as
   described in RFC XXXX. In URIs from such schemes, the use of double
   slashes indicates that what follows is the top hierarchical element
   for a naming authority. (See section ???? of RFC XXXX for more
   details.)
   URI schemes that do not contain a conformant hierarchical structure
   in their <scheme-specific-part> should not use double slashes
   following the "<scheme>:" string.

====

2.3 Well-Defined ...

   In many cases, new URI schemes are defined as ways to translate
   other protocols and name spaces into the general framework of URIs.
   For example, the "ftp" URI scheme translates from the FTP protocol,
   while the "mid" URI scheme translates from the Message-ID field of
   messages. For such schemes, the description of the mapping must be
   complete, must describe how characters get encoded or not in URIs,
   must describe exactly how all legal values of the base standard can
   be represented using the URI scheme, and exactly which modifiers,
   alternate forms and other artifacts from the base standards are
   included or not included. These requirements are elaborated below.

While that description is appealing, it is also wrong. In fact, the
"ftp" URI scheme does not "translate" from the FTP protocol -- what it
does is map identifiers to the specific interface of an FTP/TCP/IP
server. FTP (the base standard) is far more capable than the limited
set of resources that can be identified via the "ftp" URI. In fact,
the paragraph following it is more accurate for all locator schemes:

   In some cases, URI schemes do not have particular network protocols
   associated with them, because their use as a locator is limited to
   contexts where the access method is understood. This is the case,
   for example, with the "cid" and "mid" URI schemes. For these URI
   schemes, the specification should describe the notation of the
   scheme, the contexts of use, and a complete mapping of the locator
   from its source.

In other words, the mapping is always from locator to source.

====

2.4 Definition of operations

   In addition to the definition of how a URI identifies a resource, a
   URI scheme definition should also define, if applicable, the set of
   operations that may be performed on a resource using the URI as its
   identifier.
   The basis for this model is HTTP; a HTTP resource can be operated
   on by GET, POST, PUT and a number of other operations available
   through the HTTP protocol. The URI scheme definition should
   describe all well-defined operations on the URI identifier, and
   what they are supposed to do.

I think the middle sentence should be a "For example, ..." type --
there is no need to frame this as the HTTP model. It is true of all
IR protocols.

   Some URI schemes (for example, "telnet") provide location
   information for hooking onto bi-directional data streams, and don't
   fit the "infoaccess" paradigm of most URIs very well; this should
   be documented.

There is way too much context hidden here. It should just provide
an alternative example, specifically that of telnet, in which the
only operation defined is to initiate the connection and login.
Likewise, I suggest providing an example of a scheme that has no
defined operations.

=====

2.5 Character encoding

   When describing URI schemes in which (some of) the elements of the
   URI are actually representations of sequences of characters, care
   ...

should be "actually representations of human-readable text, care ..."

   should be taken not to introduce unnecessary variety in the ways in
   which characters are encoded into octets and then into URI
   characters. Unless there is some compelling reason for a particular
   scheme to do otherwise, translating character sequences into UTF-8
   (RFC 2279 [4]) and then subsequently using the %HH encoding for
   unsafe octets is recommended.

"unsafe octets" is a leftover -- I suggest referring to section 2.5 of
RFC XXXX instead.

====

2.6 Clear security considerations

Add

   o  Carefully read and understand the security considerations
      described in Section 7 of RFC XXXX and note any that apply to
      the new scheme.

====

2.7 Scheme Name considerations

Shouldn't this quote the ABNF definition in RFC XXXX and specifically
note that schemes must be registered as lowercase?

====

3. URI Scheme Registration Procedure

3.1 General ...
   Provisional status is useful for registering legacy URI schemes
   that have already been widely deployed without registration, and
   for which review at this time would be inappropriate. Provisional
   status may also be useful for private or experimental use.

   Permanent status is intended for use by IETF standards-track
   protocols. The status requires a substantive review and approval
   process.

I would reverse the order of these two paragraphs -- standards-track
should always go first. I would think that permanent status would
apply to any specification approved by the IESG, not just
standards-track, and indeed ...

   ...

   Permanent registration of a URI scheme requires IETF review and
   IESG approval. In many cases, permanent registration involves the
   promotion of an existing provisional registration. In general, the
   creation of a new permanent URI scheme requires a Standards Track
   RFC. In some cases, a URI scheme registration in an Informational
   RFC may be approved by the IESG for 'permanent' URI registration.

This is way too vague! IANA provides in RFC 2434 the list of all
levels of review that might be applied. All we need to do is list
the ones we need, using the same terminology as provided by IANA,
along with the policies for allocation. The entire registration
policy (aside from the templates) can be defined in one paragraph.

In fact, skipping through the rest of the spec indicates that
RFC 2434 is completely absent (it should be a normative reference)
and the remaining document needs to incorporate its terminology
for policies -- that should cut the length and make it much easier
for IANA to review and apply.

Cheers,

Roy T. Fielding <http://roy.gbiv.com/>
Chief Scientist, Day Software <http://www.day.com/>
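
For illustration only, the points raised above on sections 2.5 and 2.7
can be sketched in Python. This is a minimal sketch, not text from the
draft: the function names is_registrable_scheme and encode_text are
hypothetical, and the pattern assumes the generic-syntax scheme
production (a letter followed by letters, digits, "+", "-", or ".").

```python
import re
from urllib.parse import quote

# Assumed scheme production from the generic URI syntax:
#   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_registrable_scheme(name: str) -> bool:
    """Hypothetical check: the name matches the scheme grammar and is
    already lowercase, as the comment on section 2.7 asks."""
    return SCHEME_RE.fullmatch(name) is not None and name == name.lower()


def encode_text(text: str) -> str:
    """Hypothetical helper: convert text to UTF-8 octets, then %HH-encode
    every octet that is not an unreserved character."""
    return quote(text, safe="", encoding="utf-8")


print(is_registrable_scheme("svn+ssh"))  # lowercase, grammar-conformant
print(is_registrable_scheme("HTTP"))     # conformant, but not lowercase
print(encode_text("caf\u00e9/menu"))     # non-ASCII and "/" both escaped
```

Passing safe="" makes the helper escape "/" as well, which suits a
scheme that does not use hierarchical paths; a hierarchical scheme
would instead encode each path segment separately.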
Received on Wednesday, 5 January 2005 01:10:29 UTC