Re: scheme-specific length limits (issue 48)

On Sun, Apr 3, 2011 at 5:48 AM, Larry Masinter <masinter@adobe.com> wrote:
> A scheme registration defines the syntax for URIs (IRIs) that are valid for the scheme.  A syntax definition can include limits -- that some strings are valid for the scheme and other strings are not. Those limits can be complicated, limit the repertoire of characters, be expressed in BNF, and can include length limits.
>
> Syntactic restrictions should be justified, usually by the limits of the resolution mechanism or protocol associated with a string. And we should disallow any limits (or any other syntactic restrictions) that treat %-hex encoded UTF8 characters differently than their unicode character equivalents.

That doesn't seem correct.  For example, the http scheme treats some
%-hex encoded UTF-8 characters differently than their Unicode character
equivalents.  A literal "?" ends the path and starts the query, while
%3F is just data inside the path.  Consider:

http://example.com/foo?bar
http://example.com/foo%3Fbar

> document.body.innerHTML = "<a href='http://example.com/foo%3Fbar'>boo</a>"
> document.body.firstChild.pathname
"/foo%3Fbar"

> document.body.innerHTML = "<a href='http://example.com/foo?bar'>boo</a>"
> document.body.firstChild.pathname
"/foo"
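
The same difference can be reproduced outside the DOM. Below is a minimal sketch, assuming an environment that provides the WHATWG URL API as a global (current browsers and Node.js do); splitUrl is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper: parse an absolute URL and return the two
// components that differ between the literal and the encoded form.
function splitUrl(href) {
  const url = new URL(href);
  return { pathname: url.pathname, search: url.search };
}

// Literal "?" acts as the delimiter between path and query.
const literal = splitUrl("http://example.com/foo?bar");

// "%3F" is left percent-encoded and stays inside the path.
const encoded = splitUrl("http://example.com/foo%3Fbar");

console.log(literal); // pathname "/foo", search "?bar"
console.log(encoded); // pathname "/foo%3Fbar", search ""
```

The parser does not decode %3F before splitting the URL into components, so the two strings cannot be treated as equivalent.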

Kind regards,
Adam

Received on Sunday, 3 April 2011 18:07:14 UTC