- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Mon, 25 Jul 2005 03:25:11 -0700
- To: Frank Ellermann <nobody@xyzzy.claranet.de>
- Cc: uri@w3.org
On Jul 24, 2005, at 9:12 PM, Frank Ellermann wrote:

> Hi, I've found two or three potential problems in RfC 3986 appendix
> D.2, or I simply don't get the idea (still hunting those non-uric
> characters known as "unsafe" in RfC 1738, or as <delims> and <unwise>
> in RfC 2396).

The "idea" is that, if you have an old specification that depends
on those no-longer-used terms within its own grammar, then a reader
should be able to use the terms in D.2 in combination with the others
in the standard part of the specification to figure out what the
grammar in the old spec means within the constraints of the current
URI syntax.

The idea is definitely not to retain the exact same values as found
in prior specs, since many of those values have been found to be
unsafe in practice in spite of the prior RFCs.

> 1 - uric, with appendix D.2 I get:
>
> 1738 XCHAR:   ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _
> 2396 URIC :   ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _ ~
>
> 2396 URIC :   ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _ ~
> 3986 URIC_D2: ALNUM $ % & + , - . / : ; = ? @ _ ~
>
> 3986 D.2 doesn't add <reserved> like 1738 and 2396. If I try to fix it
> by adding <reserved> also in 3986 the result is more plausible:
>
> 2396 URIC : ALNUM ! $ % & ' ( ) * + , - . / : ; = ? @ _ ~
> 3986 URIC3: ALNUM ! # $ % & ' ( ) * + , - . / : ; = ? @ [ ] _ ~

That would make prior grammars incorrect -- we would need to exclude
"#[]" to make it work. The characters "!'()*" could be added to D.2.
Note that this only affects the interpretation of old grammars, not
the current syntax.

> 2 - mark, with appendix D.2 I get:
>
> 1738 UNRESERVED1: ALNUM ! $ ' ( ) * + , - . _
> 2396 UNRESERVED2: ALNUM ! ' ( ) * - . _ ~
> 1738 SAFE_EXTRA:  ! $ ' ( ) * + , - . _
> 2396 MARK :       ! ' ( ) * - . _ ~
>
> In other words <mark> is the same as <unreserved> excluding <alphanum>.
>
> 2396 UNRESERVED2: ALNUM ! ' ( ) * - . _ ~
> 3986 UNRESERVED3: ALNUM - . _ ~
> 2396 MARK :       ! ' ( ) * - . _ ~
> 3986 MARK3:       ! ' ( ) * - . _ ~
>
> In 3986 D.2 it's the same old <mark>, no proper subset of <unreserved>.
> IMHO it should be only "-", ".", "_", "~".

I would have to see the effect on grammars within specs dependent
on 2396 and 1738.

> 3 - nouric, determined indirectly as all VCHAR excl. the (fixed)
> <uric>:
>
> 1738 UNSAFE :       " # % < > [ \ ] ^ ` { | } ~
> 2396 DELIM_UNWISE:  " # % < > [ \ ] ^ ` { | }
> 3986 NOURIC3 :      " < > \ ^ ` { | }
>
> Is that correct ? Is it an omission in appendix D.2 ? Something like:
>
> | delims | <"> / "<" / ">"                       |
> | unwise | "\" / "^" / "`" / "{" / "|" / "}"     |
>
> Apparently (?) the complete set of excluded ASCII characters would be:
> ugly = CTL / SP / DQUOTE / "<" / ">" / "\" / "^" / "`" / "{" / "|" /
>        "}"

Those terms are not used by dependent specifications and therefore
do not need equivalents in D.2.

....Roy
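
The set arithmetic discussed in this thread can be checked mechanically. The following is only a minimal sketch, assuming the character classes as listed in the message above and in RFC 2396 / RFC 3986; the names (URIC_2396, URIC_D2, URIC3, show) are illustrative and do not come from any specification, and "%" stands in for <escaped> / pct-encoded, as in the quoted tables.

```python
import string

# Letters and digits, shown as ALNUM in the tables above.
ALNUM = set(string.ascii_letters + string.digits)

# RFC 2396: uric = reserved | unreserved | escaped ("%" marks escaped).
RESERVED_2396 = set(";/?:@&=+$,")
MARK_2396 = set("-_.!~*'()")
URIC_2396 = ALNUM | RESERVED_2396 | MARK_2396 | {"%"}

# RFC 3986 character classes.
UNRESERVED_3986 = ALNUM | set("-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")

# Appendix D.2 uric, matching the URIC_D2 row quoted above.
URIC_D2 = UNRESERVED_3986 | {"%"} | set(";?:@&=+$,/")

# The "fixed" variant from the message: unreserved plus all of 3986 <reserved>.
URIC3 = UNRESERVED_3986 | {"%"} | GEN_DELIMS | SUB_DELIMS


def show(chars):
    """Render the non-alphanumeric members of a set in ASCII order."""
    return " ".join(sorted(chars - ALNUM))


# Characters that old grammars allowed in uric but D.2 does not list.
print(show(URIC_2396 - URIC_D2))   # ! ' ( ) *

# Characters URIC3 would newly admit compared to RFC 2396 uric.
print(show(URIC3 - URIC_2396))     # # [ ]

# Visible ASCII not covered by URIC3 (the NOURIC3 row above).
VCHAR = {chr(c) for c in range(0x21, 0x7F)}
print(show(VCHAR - URIC3))         # " < > \ ^ ` { | }
```

Under these assumptions, adding "!'()*" to the D.2 uric, or removing "#[]" from URIC3, yields the same set as the RFC 2396 uric.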
Received on Monday, 25 July 2005 10:25:14 UTC