Re: non-hierarchical URIs and square brackets from Roy T. Fielding on 2004-02-14 (uri@w3.org from February 2004)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Fri, 13 Feb 2004 18:06:07 -0800
To: Graham Klyne <gk@ninebynine.org>
Cc: uri@w3.org
Message-Id: <59D43034-5E92-11D8-8468-000393753936@gbiv.com>

On Thursday, June 19, 2003, at 02:45  AM, Graham Klyne wrote:
> (1) hierarchical and non-hierarchical URIs
>
> I notice that in -03 the opaque-part syntax distinction has been 
> dropped.  My concern is that it may now not be clear when 
> relative-to-absolute URI conversion should take part of hierarchical 
> '/' characters in the URIs.  Previously, my understanding was that an 
> algorithm would look at the path component of the base URI and, if it 
> starts with a '/', assume the base and relative URIs are hierarchical 
> and apply the path-merging logic;  otherwise, the opaque-part in the 
> base URI is used in all-or-nothing fashion depending on what is 
> present in the "relative" URI.

I've never used the base URI's path as an indicator.  The problem with
that theory is folks wanted to eliminate dot-segments from paths
regardless of reference type.

> Section 3 says "a non-hierarchical path will be treated as opaque data 
> by a generic URI parser", but it's not clear at this point what 
> constitutes a "non-hierarchical path".

fixed.

> Section 3.3 says " A path is always defined for a URI, though the 
> defined path may be empty (zero length) or opaque (not containing any 
> "/" delimiters)", which suggests that an opaque path may not contain 
> *any* un-escaped '/' characters.  This seems like an onerous 
> restriction, and in conflict with existing URI scheme usage; e.g. 
> news: according to IANA is currently specified by RFC 1738

fixed.

> (2) square brackets
>
> Is it necessary for square brackets to be reserved outside the 
> net-path component?  I personally use them quite often in fragment 
> identifiers for references.  My correspondent had another use for them 
> in the path component of a URI scheme.  I think there are several 
> instances of them occurring (unescaped) in message-IDs, and the mid: 
> spec doesn't require them to be escaped.  I think this also applies to 
> the ldap: scheme.
> -- http://www.faqs.org/rfcs/rfc2392.html

Says they must be percent-encoded (section 2).

> -- http://www.faqs.org/rfcs/rfc2255.html

    Note that any URL-illegal characters (e.g., spaces), URL special
    characters (as defined in section 2.2 of RFC 1738) and the reserved
    character '?' (ASCII 63) occurring inside a dn, filter, or other
    element of an LDAP URL MUST be escaped using the % method described
    in RFC 1738 [5]. If a comma character ',' occurs inside an extension
    value, the character MUST also be escaped using the % method.

Nope.  Square brackets have never been allowed in a URI before, so
allowing them in the path may lead to errors in implementations that
assumed they could use them internally.

....Roy

Received on Friday, 13 February 2004 21:05:16 UTC