
STD 66 questions (problems ?)

From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Mon, 25 Jul 2005 06:12:29 +0200
To: uri@w3.org
Message-ID: <42E466AD.13DA@xyzzy.claranet.de>

Hi, I've found two or three potential problems in RFC 3986 appendix D.2,
or I simply don't get the idea.  (I'm still hunting the non-uric characters
known as "unsafe" in RFC 1738, or as <delims> and <unwise> in RFC 2396.)

1 - uric, with appendix D.2 I get:

    1738 XCHAR: ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _
    2396 URIC : ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _ ~

    2396 URIC   : ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _ ~
    3986 URIC_D2: ALNUM   $ % &         + , - . / : ; = ? @ _ ~

Unlike the 1738 and 2396 definitions, 3986 D.2 doesn't include all of
<reserved> in <uric>.  If I try to fix it by adding the 3986 <reserved>
set, the result is more plausible:

    2396 URIC : ALNUM !   $ % & ' ( ) * + , - . / : ; = ? @     _ ~
    3986 URIC3: ALNUM ! # $ % & ' ( ) * + , - . / : ; = ? @ [ ] _ ~
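
The comparison above can be reproduced mechanically.  What follows is only
a sketch: the Python names are made up for illustration, the 2396 and 3986
sets are typed in from the RFCs' ABNF, and URIC_D2 encodes this message's
reading of the D.2 table (3986 <unreserved>, "%" for pct-encoded, plus the
old 2396 <reserved> characters).

```python
import string

ALNUM = set(string.ascii_letters + string.digits)

# RFC 2396: uric = reserved | unreserved | escaped ("%" stands in for escaped)
MARK_2396 = set("-_.!~*'()")
RESERVED_2396 = set(";/?:@&=+$,")
URIC_2396 = RESERVED_2396 | ALNUM | MARK_2396 | {"%"}

# RFC 3986: reserved = gen-delims / sub-delims
UNRESERVED_3986 = ALNUM | set("-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
RESERVED_3986 = GEN_DELIMS | SUB_DELIMS

# Assumption: <uric> as this message reads the D.2 table.
URIC_D2 = UNRESERVED_3986 | {"%"} | RESERVED_2396
# The "fixed" variant: the full 3986 <reserved> set instead.
URIC3 = RESERVED_3986 | UNRESERVED_3986 | {"%"}


def show(chars):
    """List a set the way the tables here do: ALNUM first, then punctuation."""
    punct = " ".join(sorted(chars - ALNUM))
    return "ALNUM " + punct if ALNUM <= chars else punct


print(show(URIC_D2))                          # ALNUM $ % & + , - . / : ; = ? @ _ ~
print(show(URIC3))                            # ALNUM ! # $ % & ' ( ) * + , - . / : ; = ? @ [ ] _ ~
print(" ".join(sorted(URIC_2396 - URIC_D2)))  # ! ' ( ) *
print(" ".join(sorted(URIC3 - URIC_2396)))    # # [ ]
```

The last two lines show what D.2 loses relative to 2396, and what the
fixed set gains.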

2 - mark, with appendix D.2 I get:

    1738 UNRESERVED1: ALNUM ! $ ' ( ) * + , - . _
    2396 UNRESERVED2: ALNUM !   ' ( ) *     - . _ ~
    1738 SAFE_EXTRA: ! $ ' ( ) * + , - . _
    2396 MARK      : !   ' ( ) *     - . _ ~

In other words <mark> is the same as <unreserved> excluding <alphanum>.

    2396 UNRESERVED2: ALNUM ! ' ( ) * - . _ ~
    3986 UNRESERVED3: ALNUM           - . _ ~
    2396 MARK : ! ' ( ) * - . _ ~
    3986 MARK3: ! ' ( ) * - . _ ~

In 3986 D.2 it's the same old <mark>, which is no longer a subset of
<unreserved>.  IMHO it should be only "-", ".", "_", "~".
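
The <mark> point can be checked the same way.  Again only a sketch with
illustrative Python names; the sets are copied from the 2396 and 3986 ABNF,
and proposed_mark is merely the suggestion made here, not anything the
RFC defines.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)

MARK_2396 = set("-_.!~*'()")             # RFC 2396 <mark>
UNRESERVED_3986 = ALNUM | set("-._~")    # RFC 3986 <unreserved>

# Is the old <mark> still contained in the new <unreserved>?
is_subset = MARK_2396 <= UNRESERVED_3986

# Characters of the old <mark> that 3986 moved out of <unreserved>.
leftover = sorted(MARK_2396 - UNRESERVED_3986)

# Suggested replacement: <unreserved> excluding the alphanumerics.
proposed_mark = sorted(UNRESERVED_3986 - ALNUM)

print(is_subset)       # False
print(leftover)        # ['!', "'", '(', ')', '*']
print(proposed_mark)   # ['-', '.', '_', '~']
```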

3 - nouric, determined indirectly as all VCHAR excl. the (fixed) <uric>:

    1738 UNSAFE      : " # % < > [ \ ] ^ ` { | } ~
    2396 DELIM_UNWISE: " # % < > [ \ ] ^ ` { | }
    3986 NOURIC3     : "     < >   \   ^ ` { | }

Is that correct?  Is it an omission in appendix D.2?  Something like:

   | delims         | <"> / "<" / ">"                                  |
   | unwise         | "\" / "^" / "`" / "{" / "|" / "}"                |

Apparently (?) the complete set of excluded US-ASCII characters would be:
ugly = CTL / SP / DQUOTE / "<" / ">" / "\" / "^" / "`" / "{" / "|" / "}"
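
A sketch of the "determined indirectly" step, taking the complement of the
fixed <uric> in VCHAR and in all of US-ASCII.  URIC3 is this message's
hypothetical fixed set (3986 <reserved>, <unreserved>, and "%"), and the
other Python names are illustrative only.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)
UNRESERVED = ALNUM | set("-._~")                  # RFC 3986 <unreserved>
RESERVED = set(":/?#[]@") | set("!$&'()*+,;=")    # gen-delims / sub-delims
URIC3 = RESERVED | UNRESERVED | {"%"}             # assumed "fixed" <uric>

VCHAR = {chr(c) for c in range(0x21, 0x7F)}       # ABNF VCHAR = %x21-7E
ASCII = {chr(c) for c in range(0x80)}             # all of US-ASCII

# Visible characters that are not <uric>.
nouric3 = sorted(VCHAR - URIC3)

# Everything excluded, including CTL (%x00-1F / %x7F) and SP (%x20).
ugly = ASCII - URIC3
ctl_sp = {c for c in ugly if ord(c) <= 0x20 or ord(c) == 0x7F}

print(" ".join(nouric3))        # " < > \ ^ ` { | }
print(len(ugly), len(ctl_sp))   # 43 34
```

That is 33 CTL characters, SP, and the nine visible characters listed
above.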

                           Bye, Frank
Received on Monday, 25 July 2005 04:15:42 UTC
