Re: HTML5 - resolving href="mailto:" based on page's encoding or force utf-8?

Hello Michael,

Many thanks for this example. I hope Anne can do some checks on the 
HTML5 side. I just tried your example in Opera 10, and it gave the UTF-8 
based URI when I asked for 'copy link address'. I also clicked on the 
link and asked it to use my default MUA (Thunderbird with Eudora), and I 
got a draft email with legible text (Moscow at the start, and ITAR-TASS, 
that's about how much Russian I read).
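
As a rough cross-check of what a mail client has to do with such a link, 
here is a minimal Python sketch that decodes the Subject value from the 
UTF-8 based URI. Only the first two words of the subject are used, and 
the names uri, as_utf8 and as_cp1251 are made up for this illustration; 
it does not describe what Thunderbird or any other MUA actually runs.

```python
from urllib.parse import urlsplit, parse_qs

# First two words of the Subject from the UTF-8 based URI (shortened for
# illustration; the variable names are hypothetical).
uri = ("mailto:?Subject=%D0%9C%D0%B0%D0%B9%D0%BE%D1%80%D1%83"
       "%20%D0%95%D0%B2%D1%81%D1%8E%D0%BA%D0%BE%D0%B2%D1%83")

query = urlsplit(uri).query

# Decoding the percent-escapes as UTF-8 gives legible Russian text.
as_utf8 = parse_qs(query, encoding="utf-8")["Subject"][0]

# Decoding the same bytes as Windows-1251 gives mojibake instead.
as_cp1251 = parse_qs(query, encoding="windows-1251")["Subject"][0]

print(as_utf8)
print(as_cp1251)
```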

It makes quite a bit of sense to limit the special processing for query 
parts (re-encoding them into the document encoding) to http/https. This 
special processing exists in the first place because Web forms (submitted 
over http/https) customarily send query parameters in the encoding of 
the page that contains the form, and this custom was carried over to 
direct activation of links with query parts.
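
To make the distinction concrete, here is a minimal Python sketch of such 
a rule, under the assumption that only http/https query parts get the 
document encoding and every other scheme gets UTF-8. The function name 
encode_query_value and its parameters are hypothetical, and this is not 
how any particular browser implements it.

```python
from urllib.parse import quote

def encode_query_value(value, scheme, document_encoding):
    """Percent-encode a query value (hypothetical helper).

    Assumption: http/https use the document encoding, everything
    else (e.g. mailto) uses UTF-8.
    """
    if scheme.lower() in ("http", "https"):
        encoding = document_encoding
    else:
        encoding = "utf-8"
    return quote(value, safe="", encoding=encoding)

subject = "Майору Евсюкову"  # first two words of the example's Subject

mailto_form = encode_query_value(subject, "mailto", "windows-1251")
http_form = encode_query_value(subject, "http", "windows-1251")

print(mailto_form)
print(http_form)
```

For a Windows-1251 page, mailto_form matches the start of the UTF-8 
based Subject quoted below, and http_form matches the start of the 
Windows-1251 based one.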

What happens when a form is actually submitted isn't part of the IRI or 
URI spec, but part of preparing the submission. If the form submission 
URI/IRI has a query part, it is either ignored or its data is inserted 
into the form fields (I don't know which one actually applies). Either 
way, there is no need to re-encode the data; it's just a matter of 
saying which bytes the form sends to the server.

Regards,   Martin.

On 2009/09/05 20:04, Michael A. Puls II wrote:
> Attached is 1251.html. It's a Windows-1251 Russian page. Load it and
> also see the source of it, please.
>
> In Firefox, Safari and Opera, I get the following for the resolved .href
> value of the link. (IE8 just shows the value as it is in the source)
>
> <mailto:?Subject=%D0%9C%D0%B0%D0%B9%D0%BE%D1%80%D1%83%20%D0%95%D0%B2%D1%81%D1%8E%D0%BA%D0%BE%D0%B2%D1%83%20%D0%BF%D1%80%D0%B5%D0%B4%D1%8A%D1%8F%D0%B2%D0%BB%D0%B5%D0%BD%D0%BE%20%D0%BE%D0%BA%D0%BE%D0%BD%D1%87%D0%B0%D1%82%D0%B5%D0%BB%D1%8C%D0%BD%D0%BE%D0%B5%20%D0%BE%D0%B1%D0%B2%D0%B8%D0%BD%D0%B5%D0%BD%D0%B8%D0%B5&Body=%D0%9C%D0%9E%D0%A1%D0%9A%D0%92%D0%90,%201%20%D1%81%D0%B5%D0%BD%D1%82%D1%8F%D0%B1%D1%80%D1%8F.%20%D0%9C%D0%B0%D0%B9%D0%BE%D1%80%D1%83%20%D0%94%D0%B5%D0%BD%D0%B8%D1%81%D1%83%20%D0%95%D0%B2%D1%81%D1%8E%D0%BA%D0%BE%D0%B2%D1%83,%20%D1%83%D1%81%D1%82%D1%80%D0%BE%D0%B8%D0%B2%D1%88%D0%B5%D0%BC%D1%83%2027%20%D0%B0%D0%BF%D1%80%D0%B5%D0%BB%D1%8F%20%D1%81%D1%82%D1%80%D0%B5%D0%BB%D1%8C%D0%B1%D1%83%20%D0%B2%20%D1%81%D1%83%D0%BF%D0%B5%D1%80%D0%BC%D0%B0%D1%80%D0%BA%D0%B5%D1%82%D0%B5%20%D0%BD%D0%B0%20%D1%8E%D0%B3%D0%B5%20%D0%9C%D0%BE%D1%81%D0%BA%D0%B2%D1%8B,%20%D0%BF%D1%80%D0%B5%D0%B4%D1%8A%D1%8F%D0%B2%D0%BB%D0%B5%D0%BD%D0%BE%20%D0%BE%D0%BA%D0%BE%D0%BD%D1%87%D0%B0%D1%82%D0%B5%D0%BB%D1%8C%D0%BD%D0%BE%D0%B5%20%D0%BE%D0%B1%D0%B2%D0%B8%D0%BD%D0%B5%D0%BD%D0%B8%D0%B5,%20%D0%BF%D0%B5%D1%80%D0%B5%D0%B4%D0%B0%D0%B5%D1%82%20%D0%98%D0%A2%D0%90%D0%A0-%D0%A2%D0%90%D0%A1%D0%A1.%20%0D%0A%D0%9F%D0%BE%D0%BB%D0%BD%D0%B0%D1%8F%20%D0%B2%D0%B5%D1%80%D1%81%D0%B8%D1%8F%20%D1%81%D1%82%D0%B0%D1%82%D1%8C%D0%B8%20%D0%BD%D0%B0>
>
>
> However, should I instead be getting:
>
> <mailto:?Subject=%CC%E0%E9%EE%F0%F3%20%C5%E2%F1%FE%EA%EE%E2%F3%20%EF%F0%E5%E4%FA%FF%E2%EB%E5%ED%EE%20%EE%EA%EE%ED%F7%E0%F2%E5%EB%FC%ED%EE%E5%20%EE%E1%E2%E8%ED%E5%ED%E8%E5&Body=%CC%CE%D1%CA%C2%C0,%201%20%F1%E5%ED%F2%FF%E1%F0%FF.%20%CC%E0%E9%EE%F0%F3%20%C4%E5%ED%E8%F1%F3%20%C5%E2%F1%FE%EA%EE%E2%F3,%20%F3%F1%F2%F0%EE%E8%E2%F8%E5%EC%F3%2027%20%E0%EF%F0%E5%EB%FF%20%F1%F2%F0%E5%EB%FC%E1%F3%20%E2%20%F1%F3%EF%E5%F0%EC%E0%F0%EA%E5%F2%E5%20%ED%E0%20%FE%E3%E5%20%CC%EE%F1%EA%E2%FB,%20%EF%F0%E5%E4%FA%FF%E2%EB%E5%ED%EE%20%EE%EA%EE%ED%F7%E0%F2%E5%EB%FC%ED%EE%E5%20%EE%E1%E2%E8%ED%E5%ED%E8%E5,%20%EF%E5%F0%E5%E4%E0%E5%F2%20%C8%D2%C0%D0-%D2%C0%D1%D1.%20%0D%0A%CF%EE%EB%ED%E0%FF%20%E2%E5%F0%F1%E8%FF%20%F1%F2%E0%F2%FC%E8%20%ED%E0>
>
>
> in browsers, according to HTML5/web addresses/iri-bis? (That's what I'd
> get in browsers if it was an http link instead)
>
> The reason I ask is that for 'http', browsers use the document's charset
> to resolve the link into a URI, but with 'mailto', they always (by
> default at least) force UTF-8 in this case (which makes things a lot
> easier for passing the data to webmails and other mail clients, which
> usually want percent-encoded utf-8 to decode).
>
> What do HTML5/web addresses/iri-bis say about this exactly? Do they
> allow Firefox, Opera and Safari to do what they do, or do they say that
> the resolving is like http for all protocols?
>
> Or, is this undefined and the browser does what it wants?
>
> Thanks
>

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp

Received on Thursday, 10 September 2009 09:29:18 UTC