- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Sat, 15 Mar 2008 08:40:35 +0900
- To: "Frank Ellermann" <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>, ietf-http-wg@w3.org
At 00:29 08/03/14, Frank Ellermann wrote:
>
>Brian Smith wrote:
>> RFC 2277 applies to any updates to an existing protocol, as
>> far as I can tell.
>
>It talks about UTF-8 "for all text". We can ask Harald what
>that precisely means, my first guess is "SDU" (body), not the
>complete "PDU" (header + body).

It's all those protocol fields where you need human-readable text.
The Subject: of an email very clearly qualifies as text, so it's
not only body.

>> I think that HTTPbis should explain how to encode UTF-8
>> text in newly registered header fields. The de-facto
>> mechanism for this, used by Atom and WebDAV, is percent-
>> encoded UTF-8.
>
>For a draft standard MIME RFC 2047 comes to mind, for a BCP
>one of the two mechanisms recommended in BCP 113 (RFC 5137).
>BCP 113 says that using UTF-8 is typically a bad idea when
>looking for an ASCII-compatible encoding. I'm not hot about
>what to use in 2616bis, if anything, but if it ends up in a
>single case remotely requiring IDNAbis punycode I scream :-)
>
>For EAI they test the UTF-8 waters with *experimental* RFCs,
>2616bis will be a draft standard or better if all goes well.
>NAK, but FWIW Harald was the USEFOR WG Chair when what is
>now RFC.usefor-usefor was Last Called and approved. He did
>not insist on allowing UTF-8 in NetNews header fields, and
>several years of the USEFOR WG were wasted to pull this
>feature from an earlier set of Usefor drafts, because it
>would break the complete installed base.

That may have been the case for USEFOR. Can you show how
allowing e.g. new HTTP header fields to use UTF-8 would break
anything in the installed base?

>Of course RFC.usefor-usefor has I18N considerations, and
>you can use UTF-8 in NetNews roughly in the same way (MIME)
>as in mail, or roughly as in HTTP (NetNews is 8bit clean):

There is a big difference between MIME (ASCII+RFC2045) and
HTTP (iso-8859-1+RFC2045).

>But NOT "as is" in header fields (NetNews uses RFC 2231, a
>successor of RFC 2047 also supporting language tags, for
>this purpose). Another problem with percent-encoded UTF-8,
>it offers no indication of language tags (BCP 47).

I have yet to see a case where the absence of language information
in a header is a problem in practice. Do you know of any?

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Monday, 17 March 2008 03:56:59 UTC