[whatwg] [URL] Starting work on a URL spec from Adam Barth on 2010-07-24 (public-whatwg-archive@w3.org from July 2010)

From: Adam Barth <w3c@adambarth.com>
Date: Sat, 24 Jul 2010 09:55:55 -0700
Message-ID: <AANLkTi=7-oMG9g43cC0E0ee_TrOQf2L4ZBvE-Y=tDsUi@mail.gmail.com>

2010/7/23 Ian Fette (イアンフェッティ) <ifette at google.com>:
> http://code.google.com/apis/safebrowsing/developers_guide_v2.html#Canonicalization lists
> some interesting cases we've come across on the anti-phishing team in
> Google. To the extent you're concerned with / interested in
> canonicalization, it may be worth taking a look at (not to suggest you
> follow that in determining how to parse/canonicalize URLs, but rather to
> make sure that you have some "correct" way of handling the listed URLs).

Thanks.  That's helpful.

> BTW, are you covering canonicalization?

Yes.  The three main things I'm hoping to cover are parsing,
canonicalization, and resolving relative URLs.
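
As a rough illustration of those three operations (not a statement of what
the spec will say), here is a minimal sketch using Python's standard
urllib.parse module. The canonicalize() helper and the example.com URLs are
hypothetical, and the rules it applies (lowercase the scheme and host, turn
an empty path into "/") are only an assumed subset of what a real
canonicalizer would need.

```python
from urllib.parse import urlsplit, urlunsplit, urljoin


def canonicalize(url: str) -> str:
    """Hypothetical helper: lowercase scheme and host, default the path to '/'.

    Note: lowercasing the whole netloc would also lowercase any userinfo,
    so this sketch is only suitable for URLs without credentials.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


# 1. Parsing: split a URL into its components.
parsed = urlsplit("http://example.com:8080/path?q=1#frag")

# 2. Canonicalization: two spellings of the same URL map to one form.
canonical = canonicalize("HTTP://Example.COM")

# 3. Resolving a relative URL against a base URL.
resolved = urljoin("http://example.com/a/b/c", "../d")

print(parsed.hostname, parsed.port, parsed.path, parsed.query, parsed.fragment)
print(canonical)
print(resolved)
```

Browsers differ from this library in many edge cases, which is part of why
the behaviour needs writing down.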

Adam


> On Fri, Jul 23, 2010 at 9:02 PM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
>> On 7/23/10 11:59 PM, Silvia Pfeiffer wrote:
>>> Is that URLs as values of attributes in HTML or is that URLs as pasted
>>> into the address bar? I believe their processing differs...
>>
>> It certainly does in Firefox (the latter have a lot more fixup done to
>> them, and there are also differences in terms of how character encodings are
>> handled).
>>
>> I would be particularly interested in data on this last, across different
>> browsers, operating systems, and locales...  There seem to be servers out
>> there expecting their URIs in UTF-8 and others expecting them in ISO-8859-1,
>> and it's not clear to me how to make things work with them all.
>>
>> -Boris
>
>
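
To make the encoding problem Boris describes above concrete, here is a small
sketch, again with Python's urllib.parse, of how one non-ASCII string
percent-encodes differently under UTF-8 and ISO-8859-1. The sample string is
an assumption chosen for illustration; it is not data from any browser or
server.

```python
from urllib.parse import quote, unquote

text = "café"  # hypothetical path segment containing one non-ASCII character

# The same character becomes different bytes, and so different escapes.
utf8_form = quote(text, encoding="utf-8")
latin1_form = quote(text, encoding="iso-8859-1")

# A server that decodes with the wrong charset gets a different string back.
latin1_read_as_utf8 = unquote(latin1_form, encoding="utf-8")
utf8_read_as_latin1 = unquote(utf8_form, encoding="iso-8859-1")

print(utf8_form)    # caf%C3%A9
print(latin1_form)  # caf%E9
print(repr(latin1_read_as_utf8))
print(repr(utf8_read_as_latin1))
```

Neither escaped form says which charset produced it, so a server expecting
one encoding cannot reliably detect that it was sent the other.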

Received on Saturday, 24 July 2010 09:55:55 UTC