- From: Adam Barth <ietf@adambarth.com>
- Date: Sun, 3 Apr 2011 13:27:43 -0700
- To: Julian Reschke <julian.reschke@gmx.de>
- Cc: Larry Masinter <masinter@adobe.com>, Noah Mendelsohn <nrm@arcanedomain.com>, Martin J. Dürst <duerst@it.aoyama.ac.jp>, Ted Hardie <ted.ietf@gmail.com>, Tony Hansen <tony@att.com>, "public-iri@w3.org" <public-iri@w3.org>
On Sun, Apr 3, 2011 at 1:05 PM, Julian Reschke <julian.reschke@gmx.de> wrote:
> On 03.04.2011 20:06, Adam Barth wrote:
>> On Sun, Apr 3, 2011 at 5:48 AM, Larry Masinter<masinter@adobe.com> wrote:
>>> A scheme registration defines the syntax for URIs (IRIs) that are valid
>>> for the scheme. A syntax definition can include limits -- that some strings
>>> are valid for the scheme and other strings are not. Those limits can be
>>> complicated, limit the repertoire of characters, be expressed in BNF, and
>>> can include length limits.
>>>
>>> Syntactic restrictions should be justified, usually by the limits of the
>>> resolution mechanism or protocol associated with a string. And we should
>>> disallow any limits (or any other syntactic restrictions) that treat %-hex
>>> encoded UTF8 characters differently than their unicode character
>>> equivalents.
>>
>> That doesn't seem correct. For example, the http scheme treats %-hex
>> encoded UTF8 characters differently than their unicode character
>> equivalents in some cases. Consider:
>>
>> http://example.com/foo?bar
>> http://example.com/foo%3Fbar
>>
>>> document.body.innerHTML = "<a
>>> href='http://example.com/foo%3Fbar'>boo</a>"
>>> document.body.firstChild.pathname
>>
>> "/foo%3Fbar"
>>
>>> document.body.innerHTML = "<a href='http://example.com/foo?bar'>boo</a>"
>>> document.body.firstChild.pathname
>>
>> "/foo"
>> ...
>
> No news. "?" is special in URI parsing, thus it needs to be escaped when
> it's not meant to start a query component.

Yeah, I'm not saying that behavior is surprising. I'm saying that
Larry's requirement is violated even for very commonly used schemes.

Adam
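
The comparison quoted above can also be reproduced without a DOM. What follows is only a minimal sketch, assuming a runtime that provides the WHATWG `URL` class (current browsers and Node.js do); the helper name `pathAndQuery` is made up for illustration and does not appear in the thread.

```javascript
// Hypothetical helper (not from the thread): split an absolute URL into
// its path and query components using the standard URL parser.
function pathAndQuery(href) {
  const url = new URL(href);
  return { pathname: url.pathname, search: url.search };
}

// A literal "?" ends the path and starts the query component.
const literal = pathAndQuery("http://example.com/foo?bar");

// A percent-encoded "%3F" is ordinary path data, so no query is produced.
const encoded = pathAndQuery("http://example.com/foo%3Fbar");

console.log(literal.pathname, literal.search); // path "/foo", query "?bar"
console.log(encoded.pathname, encoded.search); // path "/foo%3Fbar", empty query
```

The two inputs differ only in whether the question mark is percent-encoded, yet they parse into different components, which is the distinction the quoted console session shows.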
Received on Sunday, 3 April 2011 20:28:52 UTC