- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Tue, 28 Apr 2009 15:15:49 -0700
- To: Ian Hickson <ian@hixie.ch>
- CC: HTML WG <public-html@w3.org>
Ian Hickson wrote:
>>> http://example.com/%
>> That might well not be intentional...
>
> As far as I can tell it's interoperable amongst all the major browsers.

These things are not necessarily contradictory... but yeah.

> Following hyperlinks:
> http://www.whatwg.org/specs/web-apps/current-work/#following-hyperlinks

Aha. This was the part that I needed, and then the link to resolving a relative URI.

If I read this right, this requires spaces to end up in the parsed URL as-is (since they are added to the <unreserved> production), right? Is there a good reason for this?

>> That said, there's one case I can think of offhand where the proposed
>> algorithm has undesirable behavior. Any time the browser is given a URI
>> (not IRI)
>
> Note that all URIs are IRIs.

Sure; the parenthetical above is probably just confusing and should be removed.

>> with a fragment (e.g. a Location HTTP header with a fragment), the only
>> way to make that fragment match an id is to have the ID URI-escaped, and
>> in particular have all non-ASCII characters URI-escaped.
>
> Right.

Actually, I got that wrong; for an ID things are OK (you'd need to escape the fragment in the URI, but the ID itself can be unescaped). But for an <a name> the name would have to be escaped in the HTML.

>> Then that same ID is a pain to match from IRIs (they also end up needing
>> to have those characters escaped).
>
> Why?

Still talking about <a name>: the name in the HTML would be escaped so that it can be matched by URIs, and then the IRIs have to have the ref escaped as well, because no unescaping happens for names.

This is probably OK, especially because everything should "just work" for cases when IRIs are used end-to-end (not the case in Gecko right now, effectively, but I'm working on getting that changed).

-Boris
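
The id versus <a name> asymmetry described above can be sketched roughly as follows. This is only an illustration, assuming that the fragment is percent-decoded as UTF-8 before being compared with an id, and compared byte-for-byte (no unescaping) with a name; the function names and the sample identifier are hypothetical and come neither from the spec text nor from Gecko.

```python
from urllib.parse import quote, unquote


def matches_id(fragment: str, element_id: str) -> bool:
    # Assumption: the fragment is percent-decoded before the id comparison,
    # so the id in the HTML can stay unescaped.
    return unquote(fragment) == element_id


def matches_name(fragment: str, anchor_name: str) -> bool:
    # Assumption: no unescaping happens for <a name>, so the comparison
    # is on the raw fragment string.
    return fragment == anchor_name


# A URI (not IRI) must carry non-ASCII characters percent-escaped.
uri_fragment = quote("café")  # hypothetical identifier

# id: the escaped URI fragment matches the unescaped id.
print(matches_id(uri_fragment, "café"))

# <a name>: the name itself has to be escaped in the HTML to match a URI...
print(matches_name(uri_fragment, "café"))
print(matches_name(uri_fragment, quote("café")))

# ...and then an IRI with an unescaped ref no longer matches that name.
print(matches_name("café", quote("café")))
```

Under these assumptions, an escaped name can only be reached by an IRI whose ref is escaped too, which is the inconvenience noted for <a name>.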
Received on Tuesday, 28 April 2009 22:16:53 UTC