- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Mon, 31 Oct 2011 17:29:46 +0900
- To: "Mykyta Yevstifeyev (М. Євстіфеєв)" <evnikita2@gmail.com>
- CC: public-iri@w3.org
Hello Mykyta,
On 2011/10/30 13:52, "Mykyta Yevstifeyev (М. Євстіфеєв)" wrote:
> 29.10.2011 21:57, iri issue tracker wrote:
>> #104: Characters are still excluded from URIs
>>
>> From Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
>>
>> * In section 6 please replace one string of weasel words:
>> "These characters originally have been excluded from
>> URIs". These characters still are excluded from URIs,
>> i.e., STD 66 does not list them anywhere as permitted
>
> STD 66 does not list them as disallowed in <gen-delims> or <sub-delims>;
> please compare:
>
>> gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
>> sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
>> / "*" / "+" / "," / ";" / "="
>
> and
>
>> Unwise characters "\" (U+005C), "^" (U+005E), "`" (U+0060), "{"
>> (U+007B), "|" (U+007C), and "}" (U+007D)
>
> But RFC 3986 doesn't list them as allowed in <unreserved> and
> <sub-delims> (the latter is allowed in some other productions):
>
>> unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
>> sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
>> / "*" / "+" / "," / ";" / "="
>
> So an interesting situation: the character isn't allowed and isn't
> disallowed.
Sorry, but that's a complete non sequitur. URIs are defined by the ABNF
in RFC 3986 (STD 66), and there's no way to read that ABNF so that it
produces any of the "unwise" characters. This means that they are plainly
and simply disallowed.
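To make the point concrete, here is a minimal sketch in Python that collects the terminal characters of the RFC 3986 ABNF and tests the "unwise" characters against them. The names (URI_CHARS, UNWISE, allowed_in_uri) are made up for this illustration, and the check is per character only; it does not validate where a character may appear ("[" and "]", for example, are only produced inside an IP-literal host).

```python
import string

# Character classes copied from the RFC 3986 ABNF.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")

# "%" is included because it introduces a pct-encoded triplet.
URI_CHARS = UNRESERVED | GEN_DELIMS | SUB_DELIMS | {"%"}

# The characters called "unwise" in the quoted text.
UNWISE = "\\^`{|}"


def allowed_in_uri(ch: str) -> bool:
    """Return True if some terminal of the URI grammar yields this character."""
    return ch in URI_CHARS


if __name__ == "__main__":
    for ch in UNWISE:
        print(f"U+{ord(ch):04X} {ch!r}: allowed={allowed_in_uri(ch)}")
```

Every character the grammar can produce is in one of these sets, so a character that is in none of them cannot occur in a URI; there is no third "neither allowed nor disallowed" state.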
RFC 3987 has a "MAY" for dealing with them when converting from IRI to
URI, but this doesn't mean that they are allowed in IRIs. Looking back
now, I see the paragraph containing this "MAY" as a predecessor of our
HTML(5) handling guidelines.
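As a rough illustration of what that "MAY" permits, the following Python sketch percent-encodes such characters during conversion. The function name escape_disallowed_ascii is hypothetical, and the character list is my reading of RFC 3987, section 3.1 (space, "<", ">", double quote, and the "unwise" characters), so please check it against the RFC before relying on it.

```python
# Printable US-ASCII characters that the URI grammar does not produce and
# that a converter may choose to percent-encode (assumed list, see above).
DISALLOWED_ASCII = set('<>" {}|\\^`')


def escape_disallowed_ascii(iri: str) -> str:
    """Percent-encode the characters above; leave everything else untouched.

    This covers only the optional extra handling, not the full IRI-to-URI
    mapping (non-ASCII characters and IDNA handling are out of scope here).
    """
    return "".join(
        f"%{ord(ch):02X}" if ch in DISALLOWED_ASCII else ch
        for ch in iri
    )


if __name__ == "__main__":
    print(escape_disallowed_ascii("http://example.com/a|b{c}"))
```

The output of such a step is a string that the URI ABNF can produce, which is exactly why the input itself was not a valid URI (or IRI) to begin with.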
Regards, Martin.
> I suppose not being disallowed means being allowed, and not
> being allowed and not being disallowed means nothing. So here the
> sentence in 3987bis is right.
>
> The authors of RFC 3987 probably referred to RFC 1738 <national>
> production:
>
>> national = "{" | "}" | "|" | "\" | "^" | "~" | "[" | "]" | "`"
>
> Now these chars are allowed in URIs, as I've explained above.
>
> Mykyta Yevstifeyev
>
>
>
Received on Monday, 31 October 2011 08:30:27 UTC