- From: Graham Klyne <gk@ninebynine.org>
- Date: Thu, 19 Jun 2003 10:45:40 +0100
- To: uri@w3.org
This message is prompted by some questions/comments I received from someone
recently. I've not yet read the -03 draft in detail, so I may have
overlooked some material.
...
(1) hierarchical and non-hierarchical URIs
I notice that in -03 the opaque-part syntax distinction has been
dropped. My concern is that it may now not be clear when
relative-to-absolute URI conversion should take part of hierarchical '/'
characters in the URIs. Previously, my understanding was that an algorithm
would look at the path component of the base URI and, if it starts with a
'/', assume the base and relative URIs are hierarchical and apply the
path-merging logic; otherwise, the opaque-part in the base URI is used in
all-or-nothing fashion depending on what is present in the "relative" URI.
Section 3 says "a non-hierarchical path will be treated as opaque data by a
generic URI parser", but it's not clear at this point what constitutes a
"non-hierarchical path".
Section 3.3 says " A path is always defined for a URI, though the defined
path may be empty (zero length) or opaque (not containing any "/"
delimiters)", which suggests that an opaque path may not contain *any*
un-escaped '/' characters. This seems like an onerous restriction, and in
conflict with existing URI scheme usage; e.g. news: according to IANA is
currently specified by RFC 1738, and has:
[[
A <newsgroup-name> is a period-delimited hierarchical name, such as
"comp.infosystems.www.misc". A <message-id> corresponds to the
Message-ID of section 2.1.5 of RFC 1036, without the enclosing "<"
and ">"; it takes the form <unique>@<full_domain_name>. A message
identifier may be distinguished from a news group name by the
presence of the commercial at "@" character. No additional characters
are reserved within the components of a news URL.
]]
-- http://www.faqs.org/rfcs/rfc1738.html
or mid:, defined by RFC 2392, which is clearly non-hierarchical, but:
[[
mid-url = "mid" ":" message-id [ "/" content-id ]
]]
-- http://www.faqs.org/rfcs/rfc2392.html
Other non-hier URI schemes using '/' are:
service: http://www.faqs.org/rfcs/rfc2609.html
My suggestion would be to add a brief comment in section 3 clarifying the
intent (as to what constitutes a hierarchical URI). My preference is that
an absolute URI with net-path or abs-path form is hierarchical, otherwise not.
...
(2) square brackets
Is it necessary for square brackets to be reserved outside the net-path
component? I personally use them quite often in fragment identifiers for
references. My correspondent had another use for them in the path
component of a URI scheme. I think there are several instances of them
occurring (unescaped) in message-IDs, and the mid: spec doesn't require
them to be escaped. I think this also applies to the ldap: scheme.
-- http://www.faqs.org/rfcs/rfc2392.html
-- http://www.faqs.org/rfcs/rfc2255.html
...
#g
-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E
Received on Thursday, 19 June 2003 07:58:31 UTC