- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Fri, 11 Sep 2009 16:35:20 +0900
- To: "Michael A. Puls II" <shadow2531@gmail.com>
- CC: public-iri@w3.org
Hello Michael,

On 2009/09/11 7:24, Michael A. Puls II wrote:
> On Thu, 10 Sep 2009 05:28:14 -0400, Martin J. Dürst
> <duerst@it.aoyama.ac.jp> wrote:
>
>> Hello Michael,
>> Many thanks for this example. I hope Anne can do some checks on the
>> HTML5 side. I just tried your example in Opera 10, and it gave the
>> UTF-8 based URI when I asked for 'copy link address'. I also clicked
>> on the link and asked it to use my default MUA (Thunderbird with
>> Eudora), and I got a draft email with legible text (Moscow at the
>> start, and ITAR-TASS, that's about how much Russian I read).
>
> Thanks.
>
> Yes, I get that utf-8 behavior in Firefox and Safari also. I think
> things are more interoperable that way.
>
> However, my concern is that HTML5 (well, the iri/uri spec additions for
> HTML5) contradicts that and says to use the page's encoding instead. I
> do not feel that is a good idea for some schemes.
>
> Also, for 'mailto:' links in web pages, I want to specifically avoid the
> part before '?' and the part after '?' being resolved against a
> different encoding. For mailto:, that would be undesirable and would
> force authors to use "mailto:?to=value" instead of "mailto:value" so
> that the to value is resolved against the same encoding as the other
> values (like subject and body etc.). But, using mailto:?to= still isn't
> supported as well as mailto:value, so that'd be bad too.
>
> For mailto links in html pages, I think the resolving should always be
> (by default at least) utf-8 all the way through. (So that the .href
> getter on a link and copy link address etc. all return something
> utf-8-based regardless of the page's encoding). This is basically what
> browsers do now. Just want to make sure the specs don't contradict that,
> as browsers do it that way for a reason.

I agree, and I haven't found anybody who disagrees yet. If that stays as
it is, I'll make sure that the spec says what it should say on that point.

Regards,
Martin.

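To make the difference concrete, here is a minimal sketch of the two resolutions being compared. It assumes a page encoded as windows-1251 and uses Python's urllib.parse.quote as a stand-in for a browser's percent-encoding step; the helper name mailto_href, the address and the subject are hypothetical and only serve the illustration.

```python
from urllib.parse import quote


def mailto_href(to: str, subject: str, encoding: str) -> str:
    """Build a mailto: URI, percent-encoding non-ASCII text with `encoding`.

    Hypothetical helper: it approximates what a browser would hand back
    from .href or 'copy link address' under the given encoding.
    """
    # Keep '@' literal in the address; encode everything unsafe in the subject.
    address = quote(to, safe="@", encoding=encoding)
    return "mailto:" + address + "?subject=" + quote(subject, safe="", encoding=encoding)


subject = "Москва"

# Resolution that ignores the page encoding and always uses UTF-8.
utf8_href = mailto_href("user@example.org", subject, "utf-8")

# Resolution against the page encoding (cp1251 is Python's name for windows-1251).
page_href = mailto_href("user@example.org", subject, "cp1251")

print(utf8_href)
print(page_href)
```

The two results differ byte for byte, so a mail client that assumes UTF-8 when decoding would only read the first one correctly.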
> For mailto in HTML forms, I don't have too much preference as no one
> uses it.
>
> I also think that for javascript:, it's probably best to always resolve
> to percent-encoded utf-8 too.
>
> Also, if I remember correctly, it was desired that http(s) in HTML5
> pages be utf-8-only, but that wasn't possible for legacy reasons. I
> don't think mailto: and some other schemes have that restraint.
>
> With that said, as Anne said, maybe using the page encoding should only
> be a must for http(s) and that other protocols may ignore the page's
> encoding and resolve to percent-encoded UTF-8.
>
> Now, if JS in browsers had an iconv() so that you can easily convert to
> what you want and browsers had options to control the encoding,
> per-protocol, for .href etc., per-site, then, maybe it wouldn't matter.
> But, for now, just always using utf-8 for some schemes makes things
> consistent and allows that expectation to be relied upon.
>
> Now, I'm not 100% sure what iri-bis/HTML5 says about this. It's really
> low-level, which is why I'm asking for clarification (which Larry said
> he'd respond when he gets a chance).
>

--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Friday, 11 September 2009 07:36:31 UTC