Re: STD 66 questions (problems ?)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Mon, 25 Jul 2005 03:25:11 -0700
Message-Id: <1d86c8c30f3ff14d7ad94e8863ab75e6@gbiv.com>
Cc: uri@w3.org
To: Frank Ellermann <nobody@xyzzy.claranet.de>

On Jul 24, 2005, at 9:12 PM, Frank Ellermann wrote:

> Hi, I've found two or three potential problems in RfC 3986 appendix D.2,
> or I simply don't get the idea (still hunting those non-uric characters
> known as "unsafe" in RfC 1738, or as <delims> and <unwise> in RfC 2396).

The "idea" is that, if you have an old specification that depends
on those no-longer-used terms within their own grammar, then a reader
should be able to use the terms in D.2 in combination with the
others in the standard part of the specification to figure out
what the grammar in the old spec means within the constraints of
the current URI syntax.

The idea is definitely not to retain the exact same values as
found in prior specs, since many of those values have been found
to be unsafe in practice in spite of the prior RFCs.

> 1 - uric, with appendix D.2 I get:
>
>     1738 XCHAR: ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _
>     2396 URIC : ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _ ~
>
>     2396 URIC   : ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _ ~
>     3986 URIC_D2: ALNUM   $ % &         + , - . / : ; = ? @ _ ~
>
> 3986 D.2 doesn't add <reserved> like 1738 and 2396.  If I try to fix it
> by adding <reserved> also in 3986 the result is more plausible:
>
>     2396 URIC : ALNUM !   $ % & ' ( ) * + , - . / : ; = ? @     _ ~
>     3986 URIC3: ALNUM ! # $ % & ' ( ) * + , - . / : ; = ? @ [ ] _ ~

That would make prior grammars incorrect -- we would need to exclude
"#[]" to make it work.  The characters "!'()*" could be added to D.2.
Note that this only affects the interpretation of old grammars, not
the current syntax.
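
The set arithmetic behind this exchange can be checked with a short
Python sketch. It assumes the character lists are transcribed exactly
from the quoted tables above; the names URIC_2396, URIC_D2 and URIC3
mirror the quoted labels, "%" stands in for the lead character of a
pct-encoded triplet, and diff() is a hypothetical helper.

```python
import string

# ALNUM as used in the quoted tables: ASCII letters and digits.
ALNUM = set(string.ascii_letters + string.digits)

# Character lists copied from the quoted comparison tables (an assumption:
# they are taken as given, not re-derived from the RFC grammars).
URIC_2396 = ALNUM | set("!$%&'()*+,-./:;=?@_~")
URIC_D2 = ALNUM | set("$%&+,-./:;=?@_~")
URIC3 = ALNUM | set("!#$%&'()*+,-./:;=?@[]_~")


def diff(a, b):
    """Return the characters in a but not in b, as a sorted string."""
    return "".join(sorted(a - b))


# Characters old grammars allowed in <uric> that D.2 as published omits.
missing_from_d2 = diff(URIC_2396, URIC_D2)

# Characters that adding all of <reserved> would newly admit.
added_by_reserved = diff(URIC3, URIC_2396)

print(missing_from_d2)    # !'()*
print(added_by_reserved)  # #[]
```

The first difference is the set that could be added to D.2; the second
is the set that would have to be excluded for prior grammars to stay
correct.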

> 2 - mark, with appendix D.2 I get:
>
>     1738 UNRESERVED1: ALNUM ! $ ' ( ) * + , - . _
>     2396 UNRESERVED2: ALNUM !   ' ( ) *     - . _ ~
>     1738 SAFE_EXTRA: ! $ ' ( ) * + , - . _
>     2396 MARK      : !   ' ( ) *     - . _ ~
>
> In other words <mark> is the same as <unreserved> excluding <alphanum>.
>
>     2396 UNRESERVED2: ALNUM ! ' ( ) * - . _ ~
>     3986 UNRESERVED3: ALNUM           - . _ ~
>     2396 MARK : ! ' ( ) * - . _ ~
>     3986 MARK3: ! ' ( ) * - . _ ~
>
> In 3986 D.2 it's the same old <mark>, no proper subset of <unreserved>.
> IMHO it should be only "-", ".", "_", "~".

I would have to see the effect on grammars within specs dependent
on 2396 and 1738.
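
The subset question raised in the quoted text can be sketched the same
way. This is only an illustration under the assumption that MARK3 and
UNRESERVED3 are exactly the sets listed in the quoted tables; it says
nothing about the effect on dependent grammars.

```python
import string

# ALNUM as used in the quoted tables: ASCII letters and digits.
ALNUM = set(string.ascii_letters + string.digits)

# Sets copied from the quoted tables (labels are the poster's, not RFC terms).
UNRESERVED3 = ALNUM | set("-._~")
MARK3 = set("!'()*-._~")

# In RFC 2396, <mark> was <unreserved> minus <alphanum>; check whether the
# D.2 <mark> still fits inside the RFC 3986 <unreserved>.
mark_is_subset = MARK3 <= UNRESERVED3
outside_unreserved = "".join(sorted(MARK3 - UNRESERVED3))

# The alternative suggested in the quoted text: <unreserved> minus ALNUM.
proposed_mark = "".join(sorted(UNRESERVED3 - ALNUM))

print(mark_is_subset)      # False
print(outside_unreserved)  # !'()*
print(proposed_mark)       # -._~
```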

> 3 - nouric, determined indirectly as all VCHAR excl. the (fixed) <uric>:
>
>     1738 UNSAFE      : " # % < > [ \ ] ^ ` { | } ~
>     2396 DELIM_UNWISE: " # % < > [ \ ] ^ ` { | }
>     3986 NOURIC3     : "     < >   \   ^ ` { | }
>
> Is that correct ?  Is it an omission in appendix D.2 ?  Something like:
>
>    | delims         | <"> / "<" / ">"                                  |
>    | unwise         | "\" / "^" / "`" / "{" / "|" / "}"                |
>
> Apparently (?) the complete set of excluded ASCII characters would be:
> ugly = CTL / SP / DQUOTE / "<" / ">" / "\" / "^" / "`" / "{" / "|" / "}"

Those terms are not used by dependent specifications and therefore
do not need equivalents in D.2.
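
For readers who want to reproduce the NOURIC3 row in the quoted table,
a minimal sketch follows. It assumes VCHAR is the visible ASCII range
%x21-7E and that URIC3 is the "fixed" set from the quoted text; DELIMS
and UNWISE are the hypothetical D.2 rows proposed there, not terms
defined in RFC 3986.

```python
import string

# ALNUM as used in the quoted tables: ASCII letters and digits.
ALNUM = set(string.ascii_letters + string.digits)

# VCHAR: visible (printing) ASCII characters, %x21-7E.
VCHAR = {chr(code) for code in range(0x21, 0x7F)}

# The "fixed" <uric> from the quoted text (D.2 <uric> plus all of <reserved>).
URIC3 = ALNUM | set("!#$%&'()*+,-./:;=?@[]_~")

# Everything visible that the fixed <uric> does not cover.
nouric3 = "".join(sorted(VCHAR - URIC3))

# The two rows proposed in the quoted text.
DELIMS = set('"<>')
UNWISE = set("\\^`{|}")

print(nouric3)  # "<>\^`{|}
```

Under these assumptions the complement is exactly the union of the
proposed <delims> and <unwise> rows.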

....Roy
Received on Monday, 25 July 2005 10:25:14 UTC
