- From: Brian Smith <brian@briansmith.org>
- Date: Thu, 13 Mar 2008 06:22:28 -0700
- To: <ietf-http-wg@w3.org>
Frank Ellermann wrote:
> Brian Smith wrote:
>
> >> Sounds like a good reason for not allowing link relations
> >> that aren't URIs (or URI references).
>
> > That is against IETF policy. New standards have to allow the use of
> > IRIs wherever URIs are allowed. At least, that is what I
> > was told on the Atom mailing list.
>
> New protocols are supposed to support minimally UTF-8 as per
> RFC 2277. That is not the same as "support IRIs", IRIs can
> be in any charset, not only UTF-8.

Right, but I understand it to mean that a new protocol that supports
URIs must support UTF-8 in URIs, and the only (proposed) standard for
UTF-8 in URIs is RFC 3987. Also, instead of paraphrasing 2277, let's
just directly quote the parts that apply to HTTPbis:

"Lack of an ability to use UTF-8 is a violation of this policy; such a
violation would need a variance procedure ([BCP9] section 9) with clear
and solid justification in the protocol specification document before
being entered into or advanced upon the standards track."

"For existing protocols or protocols that move data from existing
datastores, support of other charsets, or even using a default other
than UTF-8, may be a requirement. This is acceptable, but UTF-8 support
MUST be possible."

"In documents that deal with internationalization issues at all, a
synopsis of the approaches chosen for internationalization SHOULD be
collected into a section called 'Internationalization considerations',
and placed next to the Security Considerations section."

> HTTP is no "new" protocol, like mail or news: 2821bis and
> 2822upd and FWIW RFC.usefor-usefor don't "violate" any IETF
> policy. But atom and xmpp were new, a different situation.

RFC 2277 applies to any updates to an existing protocol, as far as I
can tell. No, that is not what I was saying at all.

> Please note that RFC 3987 is a proposed standard, HTTP is a
> draft standard. You'd get a downref, and forcing servers and
> clients worldwide to do the Unicode 3.2 punycode stunt (until
> IDNAbis fixes it) is an interoperability nightmare.

You don't need to do the IDNA transformation for link relations,
because you are not resolving the hostname of the IRI of the link
relation.

> HTTP at the moment allows Latin-1, do you really want to
> support the proper subset of all IRIs limited to Latin-1, for
> the purpose of HTTP Link: header fields ?

> When "keeping Latin-1" is a showstopper, then introducing
> IRIs in 2616bis would be a clear "1F" scenario. You need a
> new HTTP version number for this, a restart at PS, and a new
> WG Charter.

I am not suggesting that HTTP 1.1 should switch from Latin-1 to UTF-8.
But, I do think that HTTPbis should explain how IRIs are to be used
correctly in HTTP (via URI-IRI conversion, IDNA, etc.). And, I think
that HTTPbis should explain how to encode UTF-8 text in newly
registered header fields. The de-facto mechanism for this, used by Atom
and WebDAV, is percent-encoded UTF-8.

> > I personally believe it is wrong to create new standards
> > where things may be named in European languages but not
> > in non-European languages.
>
> Strong ACK, let's drop the Latin-1 cruft, and limit 2616bis
> to US-ASCII and URIs for now. HTTP/1.2 is free to tackle
> UTF-8, and RFC 3987 offers a strict STD 66 URI for any IRI.

You seem to know a lot more about IETF policy than me, but I don't see
how it is possible to defer the internationalization considerations of
HTTP any further, while keeping HTTP on the standards track.

- Brian
Received on Thursday, 13 March 2008 13:23:07 UTC