
[whatwg] [URL] Starting work on a URL spec

From: Ian Fette <ifette@google.com>
Date: Fri, 23 Jul 2010 21:15:12 -0700
Message-ID: <AANLkTi=88AtQTJroZUuC5ihX5jqOuj5RL4nop7Cm5eSr@mail.gmail.com>
http://code.google.com/apis/safebrowsing/developers_guide_v2.html#Canonicalization
lists some interesting cases we've come across on the anti-phishing team at
Google. To the extent you're concerned with or interested in
canonicalization, it may be worth a look: not to suggest you follow it
in deciding how to parse or canonicalize URLs, but to make sure you have
some "correct" way of handling the listed URLs.

BTW, are you covering canonicalization?
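
For readers unfamiliar with what canonicalization involves, here is a rough
Python sketch of the general kind of steps such a procedure tends to include:
dropping tabs and newlines, removing the fragment, unescaping repeatedly,
lowercasing the host, resolving "." and ".." path segments, and re-escaping
once. It is only an illustration, not the algorithm from the guide linked
above and not anything a URL spec mandates. The function names are
hypothetical, and the code assumes an absolute http(s) URL with no userinfo
or port handling.

```python
import re
from urllib.parse import unquote_to_bytes, urlsplit


def fully_unescape(data: bytes) -> bytes:
    """Percent-decode repeatedly until the bytes stop changing."""
    while True:
        decoded = unquote_to_bytes(data)
        if decoded == data:
            return data
        data = decoded


def escape(data: bytes) -> str:
    """Percent-encode control, space, non-ASCII, '#' and '%' bytes."""
    return "".join(
        chr(b) if 32 < b < 127 and b not in b"#%" else f"%{b:02X}"
        for b in data
    )


def normalize_path(path: bytes) -> bytes:
    """Resolve '.' and '..' segments and collapse runs of slashes."""
    segments = path.split(b"/")
    out = []
    for seg in segments:
        if seg == b"..":
            if out:
                out.pop()
        elif seg not in (b"", b"."):
            out.append(seg)
    # Keep a trailing slash when the input path ended on a directory.
    trailing = b"/" if out and segments[-1] in (b"", b".", b"..") else b""
    return b"/" + b"/".join(out) + trailing


def canonicalize(url: str) -> str:
    """Hypothetical helper; assumes an absolute http(s) URL."""
    url = re.sub(r"[\t\r\n]", "", url.strip())
    url = url.split("#", 1)[0]  # drop the fragment
    parts = urlsplit(url)
    host = fully_unescape(parts.netloc.encode("utf-8")).lower().strip(b".")
    host = re.sub(rb"\.{2,}", b".", host)  # collapse consecutive dots
    path = normalize_path(fully_unescape(parts.path.encode("utf-8")))
    query = "?" + parts.query if parts.query else ""  # query left untouched
    return f"{parts.scheme}://{escape(host)}{escape(path)}{query}"
```

With this sketch, canonicalize("http://host/%25%32%35") gives
"http://host/%25", and canonicalize("http://www.GOOgle.com/blah/..") gives
"http://www.google.com/". Whether a spec should treat such inputs this way
is exactly the open question.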

-Ian

On Fri, Jul 23, 2010 at 9:02 PM, Boris Zbarsky <bzbarsky at mit.edu> wrote:

> On 7/23/10 11:59 PM, Silvia Pfeiffer wrote:
>
>> Is that URLs as values of attributes in HTML or is that URLs as pasted
>> into the address bar? I believe their processing differs...
>>
>
> It certainly does in Firefox (the latter have a lot more fixup done to
> them, and there are also differences in terms of how character encodings are
> handled).
>
> I would be particularly interested in data on this last, across different
> browsers, operating systems, and locales...  There seem to be servers out
> there expecting their URIs in UTF-8 and others expecting them in ISO-8859-1,
> and it's not clear to me how to make things work with them all.
>
> -Boris
>
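
To make the encoding problem Boris describes concrete: the same non-ASCII
character percent-encodes to different bytes under UTF-8 and ISO-8859-1, and
the escaped form by itself does not say which encoding produced it. A small
Python sketch using the standard urllib.parse module (the sample string is
made up for illustration):

```python
from urllib.parse import quote, unquote

text = "café"

# The same character yields different escapes depending on the encoding.
utf8_form = quote(text, encoding="utf-8")          # 'caf%C3%A9'
latin1_form = quote(text, encoding="iso-8859-1")   # 'caf%E9'

# A server that guesses the wrong encoding recovers the wrong text.
utf8_read_as_latin1 = unquote(utf8_form, encoding="iso-8859-1")  # 'cafÃ©'
latin1_read_as_utf8 = unquote(latin1_form, encoding="utf-8")     # 'caf\ufffd'

print(utf8_form, latin1_form)
print(utf8_read_as_latin1, latin1_read_as_utf8)
```

A server expecting ISO-8859-1 that receives the UTF-8 form sees two
characters where one was intended, and a server expecting UTF-8 that receives
the ISO-8859-1 form sees an invalid byte sequence (rendered here as U+FFFD
because unquote replaces undecodable bytes by default).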
Received on Friday, 23 July 2010 21:15:12 UTC
