
More newly reserved characters per draft-fielding-uri-rfc2396bis-07

From: Bruce Lilly <blilly@erols.com>
Date: Wed, 17 Nov 2004 08:15:49 -0500
To: uri@w3.org
Message-Id: <200411170815.50396.blilly@erols.com>

The following unreserved characters per RFC 2396 are now reserved per the subject draft:
! exclamation point
' single quote
( left parenthesis
) right parenthesis
* asterisk
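
The list above can be reproduced mechanically. The following is only a
sketch: the RFC 2396 set is taken from its section 2.3, while the newer
sets are written as they appear in RFC 3986 (the published successor of
the draft), on the assumption that draft -07 is identical on this point.
The variable names are illustrative only.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)

# RFC 2396 section 2.3: unreserved = alphanum | mark
RFC2396_MARK = set("-_.!~*'()")
RFC2396_UNRESERVED = ALNUM | RFC2396_MARK

# RFC 3986 sections 2.2 and 2.3 (assumed to match draft -07)
NEW_UNRESERVED = ALNUM | set("-._~")
NEW_SUB_DELIMS = set("!$&'()*+,;=")

# Characters that were unreserved before and are not unreserved now
newly_reserved = RFC2396_UNRESERVED - NEW_UNRESERVED

print(sorted(newly_reserved))             # ['!', "'", '(', ')', '*']
print(newly_reserved <= NEW_SUB_DELIMS)   # True
```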

These characters were not only not reserved in earlier URI syntax, they were
explicitly unreserved by RFC 2396 section 2.3 (also RFC 1738 section 2.2).
These characters may be used unencoded per RFCs 1738 and 2396 in URIs
such as the following:


Both URIs are fully functional; the http scheme URI has worked with
every browser with which I've tried it, and has been used for
several years (with some slight variation in file location).

Moreover, RFCs 1738 and 2396 guarantee that URIs such as the above
examples which contain those (unreserved) characters are semantically
equivalent to versions in which one or more of those characters are
encoded. And RFC 1738 went even farther, forbidding reservation of
these characters in any scheme.  Because the current draft forbids
decoding any encoded representation of reserved characters during
normalization, formerly-equivalent URIs using percent-encoding of any
of these characters would lose their equivalence characteristic under
the draft rules.
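
The loss of equivalence can be illustrated with a small sketch. It
assumes that normalization decodes a percent-encoded octet only when the
octet corresponds to an unreserved character, and it uses the RFC 3986
unreserved set as a stand-in for the draft's. The function name
`normalize` and the example.com URIs are hypothetical, chosen only for
illustration.

```python
import re
import string

ALNUM = set(string.ascii_letters + string.digits)
UNRESERVED_2396 = ALNUM | set("-_.!~*'()")   # RFC 2396 section 2.3
UNRESERVED_NEW = ALNUM | set("-._~")         # assumed to match the draft

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")

def normalize(uri, unreserved):
    """Decode only percent-encoded octets that map to `unreserved`.

    Any other percent-encoding is kept, with its hex digits uppercased.
    """
    def repl(match):
        ch = chr(int(match.group(1), 16))
        return ch if ch in unreserved else "%" + match.group(1).upper()
    return _PCT.sub(repl, uri)

plain = "http://example.com/a(b)"        # hypothetical URI
encoded = "http://example.com/a%28b%29"  # same URI with "(" and ")" encoded

# Under RFC 2396 both forms normalize to the same string.
print(normalize(plain, UNRESERVED_2396) == normalize(encoded, UNRESERVED_2396))  # True
# Under the newer rules the encoded parentheses are left alone.
print(normalize(plain, UNRESERVED_NEW) == normalize(encoded, UNRESERVED_NEW))    # False
```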

These newly-reserved characters are now in the sub-delims category,
but there is no description of why they have been moved to the
reserved category, where they are expected to be used, or for what
purpose.  Appendix D, however, says:

   Section 2 on characters has been rewritten to explain what characters
   are reserved, when they are reserved, and why they are reserved even
   when not used as delimiters by the generic syntax.  The mark
   characters that are typically unsafe to decode, including the
   exclamation mark ("!"), asterisk ("*"), single-quote ("'"), and open
   and close parentheses ("(" and ")"), have been moved to the reserved
   set in order to clarify the distinction between reserved and
   unreserved and hopefully answer the most common question of scheme

Although these characters are specifically mentioned, there is in fact no
explanation of when or why they are reserved, and changing the status of
characters which have always been unreserved muddles, rather than
clarifies, the distinction between reserved and unreserved characters.

Received on Wednesday, 17 November 2004 17:30:26 UTC
