[whatwg] [URL] Starting work on a URL spec from Maciej Stachowiak on 2010-07-25 (public-whatwg-archive@w3.org from July 2010)

From: Maciej Stachowiak <mjs@apple.com>
Date: Sat, 24 Jul 2010 22:19:17 -0700
Message-ID: <083933CF-BEA0-4DE7-91CC-F1E41593F7EA@apple.com>

On Jul 24, 2010, at 9:55 AM, Adam Barth wrote:

> 2010/7/23 Ian Fette <ifette at google.com>:
>> http://code.google.com/apis/safebrowsing/developers_guide_v2.html#Canonicalization lists
>> some interesting cases we've come across on the anti-phishing team in
>> Google. To the extent you're concerned with / interested in
>> canonicalization, it may be worth taking a look at (not to suggest you
>> follow that in determining how to parse/canonicalize URLs, but rather to
>> make sure that you have some "correct" way of handling the listed URLs).
> 
> Thanks.  That's helpful.
> 
>> BTW, are you covering canonicalization?
> 
> Yes.  The three main things I'm hoping to cover are parsing,
> canonicalization, and resolving relative URLs.

Is there any place in the Web platform where "canonicalize" is exposed by itself in a Web-facing way? I think resolving a URL against a base and parsing a URL into components are the only algorithms whose effects can be observed directly. I think we only need to spec "canonicalize" if it turns out to be a useful subroutine.
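
As a rough illustration of the two directly observable operations, here is a minimal sketch. It assumes Python's standard urllib.parse module, which follows the RFCs rather than what browsers actually do, so it only shows the shape of the two algorithms. The helper names resolve and parse_components and the example.com URLs are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def resolve(base: str, relative: str) -> str:
    """Resolve a (possibly relative) reference against a base URL."""
    return urljoin(base, relative)


def parse_components(url: str) -> dict:
    """Split an absolute URL into its individual components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,  # urllib lowercases the host name here
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    # Operation 1: resolve against a base.
    print(resolve("http://example.com/a/b/c", "../d?x=1#frag"))
    # Operation 2: parse into components.
    print(parse_components("http://Example.com:8080/a/b?x=1#frag"))
```

Note that nothing here calls a standalone "canonicalize" step. Any normalization, such as the lowercased host, is only visible as a side effect of parsing or resolving.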

There's also the related question of what browsers should do with input typed into the URL field. Other than establishing that these rules may be different between the URL field and URLs present in content, I'm not sure this is amenable to spec. But perhaps a survey of what browsers do would be useful.

Regards,
Maciej

Received on Saturday, 24 July 2010 22:19:17 UTC