Brian Smith wrote:
>> New protocols are supposed to support minimally UTF-8 as per
>> RFC 2277. That is not the same as "support IRIs", IRIs can
>> be in any charset, not only UTF-8.
>
> Right, but I understand it to mean that a new protocol that supports
> URIs must support UTF-8 in URIs, and the only (proposed) standard for
> UTF-8 in URIs is RFC 3987.

Well, no. URIs are all ASCII. There is no such thing as "Unicode
support in URIs". These are called "IRIs".

> Also, instead of paraphrasing 2277, let's just directly quote the parts
> that apply to HTTPbis:
>
> "Lack of an ability to use UTF-8 is a violation of this policy; such a
> violation would need a variance procedure ([BCP9] section 9) with clear
> and solid justification in the protocol specification document before
> being entered into or advanced upon the standards track."
>
> "For existing protocols or protocols that move data from existing
> datastores, support of other charsets, or even using a default other
> than UTF-8, may be a requirement. This is acceptable, but UTF-8 support
> MUST be possible."
>
> "In documents that deal with internationalization issues at all, a
> synopsis of the approaches chosen for internationalization SHOULD be
> collected into a section called 'Internationalization considerations',
> and placed next to the Security Considerations section."

All nice in theory, but it hasn't been done in RFC 2616.

>> HTTP is no "new" protocol, like mail or news: 2821bis and
>> 2822upd and FWIW RFC.usefor-usefor don't "violate" any IETF
>> policy. But atom and xmpp were new, a different situation.
>
> RFC 2277 applies to any updates to an existing protocol, as far as I can
> tell.

I don't see how it could apply to that.

> ...
>> HTTP at the moment allows Latin-1, do you really want to
>> support the proper subset of all IRIs limited to Latin-1, for
>> the purpose of HTTP Link: header fields ?
>
>> When "keeping Latin-1" is a showstopper, then introducing
>> IRIs in 2616bis would be a clear "1F" scenario. You need a
>> new HTTP version number for this, a restart at PS, and a new
>> WG Charter.
>
> I am not suggesting that HTTP 1.1 should switch from Latin-1 to UTF-8.
> But, I do think that HTTPbis should explain how IRIs are to be used
> correctly in HTTP (via URI-IRI conversion, IDNA, etc.). And, I think

HTTP uses URIs, not IRIs. That being said, it may be necessary to state
a few things about HTTP IRIs, but that would be in a separate document.
Remember our charter?

> that HTTPbis should explain how to encode UTF-8 text in newly registered
> header fields. The de-facto mechanism for this, used by Atom and WebDAV,
> is percent-encoded UTF-8.

Note: one instance in WebDAV Delta-V, one in AtomPub. Are you saying
httpbis should recommend that for new headers? I'm not against it, but
it sounds like something for an update to the HTTP header registry.

>>> I personally believe it is wrong to create new standards
>>> where things may be named in European languages but not
>>> in non-European languages.
>>
>> Strong ACK, let's drop the Latin-1 cruft, and limit 2616bis
>> to US-ASCII and URIs for now. HTTP/1.2 is free to tackle
>> UTF-8, and RFC 3987 offers a strict STD 66 URI for any IRI.
>
> You seem to know a lot more about IETF policy than me, but I don't see
> how it is possible to defer the internationalization considerations of
> HTTP any further, while keeping HTTP on the standards track.

I fear it's the other way around. If we want to keep it on the standards
track, we can't make any incompatible changes.

BR, Julian

Received on Thursday, 13 March 2008 14:15:29 GMT