Re: fragid navigation and pct-encoded

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Tue, 28 Apr 2009 15:15:49 -0700
Message-ID: <49F78015.4080607@mit.edu>
To: Ian Hickson <ian@hixie.ch>
CC: HTML WG <public-html@w3.org>

Ian Hickson wrote:
>>>    http://example.com/%
>> That might well not be intentional...
> 
> As far as I can tell it's interoperable amongst all the major browsers.

These things are not necessarily contradictory... but yeah.

> Following hyperlinks:
> http://www.whatwg.org/specs/web-apps/current-work/#following-hyperlinks

Aha.  This was the part that I needed, and then the link to resolving a 
relative URI.

If I read this right, this requires spaces to end up in the parsed URL 
as-is (since they are added to the <unreserved> production), right?  Is 
there a good reason for this?

>> That said, there's one case I can think of offhand where the proposed 
>> algorithm has undesirable behavior.  Any time the browser is given a URI 
>> (not IRI)
> 
> Note that all URIs are IRIs.

Sure; the parenthetical above is probably just confusing and should be 
removed.

>> with a fragment (e.g. a Location HTTP header with a fragment), the only 
>> way to make that fragment match an id is to have the ID URI-escaped, and 
>> in particular have all non-ASCII characters URI-escaped.
> 
> Right.

Actually, I got that wrong; for an ID things are OK (you'd need to 
escape the fragment in the URI, but the ID itself can be unescaped). 
But for an <a name> the name would have to be escaped in the HTML.

>> Then that same ID is a pain to match from IRIs (they also end up needing 
>> to have those characters escaped).
> 
> Why?

Still talking about <a name>, the name in the HTML would be escaped so 
that it can be matched by URIs, and then the IRIs have to have the ref 
escaped as well, because no unescaping happens for names.
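
To make the id vs. <a name> asymmetry concrete, here is a minimal sketch 
in Python.  It is a simplified model of the behavior described above, not 
the spec's algorithm: the function name find_target and the two-set 
document model are hypothetical, and it assumes ids are compared against 
the percent-decoded fragment while names are compared against the 
fragment exactly as written.

```python
from urllib.parse import quote, unquote


def find_target(fragment, ids, names):
    """Return ("id", value) or ("name", value) for a match, else None.

    Assumption (simplified model, not the spec text): an id is matched
    against the percent-decoded fragment, while an <a name> is matched
    against the raw fragment, with no unescaping.
    """
    decoded = unquote(fragment)  # percent-decode, interpreting bytes as UTF-8
    if decoded in ids:
        return ("id", decoded)
    if fragment in names:
        return ("name", fragment)
    return None


# A URI fragment must be escaped, but the id in the HTML can stay unescaped.
print(find_target("caf%C3%A9", ids={"café"}, names=set()))

# The same escaped fragment does not match an unescaped <a name>...
print(find_target("caf%C3%A9", ids=set(), names={"café"}))

# ...so the name has to be escaped in the HTML to be reachable from a URI,
print(find_target("caf%C3%A9", ids=set(), names={"caf%C3%A9"}))

# and then an IRI with a raw "café" fragment misses it unless it is escaped too.
print(find_target("café", ids=set(), names={"caf%C3%A9"}))
print(find_target(quote("café"), ids=set(), names={"caf%C3%A9"}))
```

Under these assumptions only the id case lets the markup stay unescaped; 
the name case pushes the escaping into both the HTML and any IRI that 
points at it.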

This is probably ok, especially because everything should "just work" 
for cases when IRIs are used end-to-end (not the case in Gecko right 
now, effectively, but I'm working on getting that changed).

-Boris
Received on Tuesday, 28 April 2009 22:16:53 GMT