- From: Brian Smith <brian@briansmith.org>
- Date: Thu, 13 Mar 2008 08:00:36 -0700
- To: <ietf-http-wg@w3.org>
Julian Reschke wrote:
> Brian Smith wrote:
> > "Lack of an ability to use UTF-8 is a violation of this
> > policy; such a violation would need a variance procedure
> > ([BCP9] section 9) with clear and solid justification
> > in the protocol specification document before being
> > entered into or advanced upon the standards track."
> >
> > "For existing protocols or protocols that move data
> > from existing datastores, support of other charsets,
> > or even using a default other than UTF-8, may be a
> > requirement. This is acceptable, but UTF-8 support
> > MUST be possible."
>
> All nice in theory, but it hasn't been done in RFC2616.

The purpose of HTTPbis is to fix problems with RFC2616. That is one of
the problems that needs to be fixed.

> >> HTTP is no "new" protocol, like mail or news: 2821bis and 2822upd
> >> and FWIW RFC.usefor-usefor don't "violate" any IETF
> >> policy. But atom and xmpp were new, a different situation.
> >
> > RFC 2277 applies to any updates to an existing protocol, as
> > far as I can tell.
>
> I don't see how it could apply to that.

Please read what I quoted above. HTTP is an existing protocol, so it can
have a default charset other than UTF-8, but "UTF-8 support MUST be
possible."

> > I am not suggesting that HTTP 1.1 should switch from
> > Latin-1 to UTF-8. But, I do think that HTTPbis should
> > explain how IRIs are to be used correctly in HTTP
> > (via URI-IRI conversion, IDNA, etc.). And, I think
>
> HTTP uses URIs, not IRIs.
>
> That being said, it may be necessary to state a few things
> about HTTP IRIs, but that would be in a separate document.
> Remember our charter?

How does RFC 2277 fit into the standardization process? RFC 2277 itself
says "This document is the current policies being applied by the
Internet Engineering Steering Group (IESG) towards the standardization
efforts in the Internet Engineering Task Force (IETF) in order to help
Internet protocols fulfill these requirements."
I take that to mean that the IESG will reject any specification that is
not compliant with RFC 2277 as a matter of policy.

> > that HTTPbis should explain how to encode UTF-8 text in newly
> > registered header fields. The de-facto mechanism for this, used by
> > Atom and WebDAV, is percent-encoded UTF-8.
>
> Note: one instance in WebDAV Delta-V, one in AtomPub.
>
> Are you saying httpbis should recommend that for new headers?
> I'm not against it, but it sounds like something for an
> update to the HTTP header registry.

HTTPbis should at least standardize a mechanism for new headers to
support Unicode text. Percent-encoded UTF-8 is one possibility.
Or--just thinking off the top of my head--HTTPbis could allow new
headers to encode UTF-8 text directly in quoted-strings, by starting the
quoted string with the BOM (<EF><BB><BF>, which is "ï»¿" in Latin-1).

But, it is totally unacceptable to add the Link header with a
non-Unicode-capable title subfield, it is unacceptable to specify any
new headers that have any human-oriented text that is not Unicode
enabled, and any existing headers that have human-oriented text should
be revised (in the most backwards-compatible way possible) to support
Unicode text.

> > You seem to know a lot more about IETF policy than me, but
> > I don't see how it is possible to defer the
> > internationalization considerations of HTTP any further,
> > while keeping HTTP on the standards track.
>
> I fear it's the other way around. If we want to keep it on
> the standards track, we can't make any incompatible changes.

I agree. That is why I have not suggested any incompatible changes.

Regards,
Brian
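
For illustration, the two mechanisms floated above for carrying Unicode text in a new header field (percent-encoded UTF-8, and a quoted-string that starts with the UTF-8 BOM) can be sketched in Python. This is only a sketch under stated assumptions: the function names and the sample title are hypothetical, and the BOM convention is the off-the-cuff idea from this message, not something HTTPbis or any RFC defines.

```python
from urllib.parse import quote, unquote

UTF8_BOM = b"\xef\xbb\xbf"  # the octets <EF><BB><BF>


def percent_encode(text: str) -> str:
    """Option 1: percent-encoded UTF-8, so the header value stays ASCII."""
    return quote(text, safe="", encoding="utf-8")


def percent_decode(value: str) -> str:
    """Reverse of percent_encode."""
    return unquote(value, encoding="utf-8")


def bom_encode(text: str) -> bytes:
    """Option 2 (hypothetical convention): raw UTF-8 octets marked by a BOM."""
    return UTF8_BOM + text.encode("utf-8")


def decode_quoted_octets(raw: bytes) -> str:
    """Decode quoted-string octets: UTF-8 if BOM-prefixed, else Latin-1."""
    if raw.startswith(UTF8_BOM):
        return raw[len(UTF8_BOM):].decode("utf-8")
    return raw.decode("latin-1")


if __name__ == "__main__":
    title = "Zoë's résumé"  # hypothetical human-oriented title text
    print(percent_encode(title))
    # What a Latin-1-only recipient would see for the BOM-prefixed form:
    print(bom_encode(title).decode("latin-1"))
```

The fallback in `decode_quoted_octets` reflects the backwards-compatibility point: values without the marker are still read as Latin-1, so existing senders would be unaffected.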
Received on Thursday, 13 March 2008 15:01:15 UTC