Re: PROPOSAL: i74: Encoding for non-ASCII headers

On 29/03/2008, at 3:54 PM, Roy T. Fielding wrote:

> I'd rather defer this issue until much later in the process,


I know people are getting fatigued by this one, but I'd like to at  
least get a shared understanding of the scope of this issue, and maybe  
carve off some parts to make it more manageable. Here are a few  
statements that I believe capture where we're at; if you disagree,  
please say so (hopefully without reopening the entire discussion):

1. We are considering allowing UTF-8 in content, specifically (a) in  
newly defined headers, and/or (b) in places where TEXT is now.

2. We intend to remove the "blanket" RFC2047 encoding associated with  
TEXT and (if kept) move it to the definitions of the individual rules,  
so that it's clear where such encoding may occur. Candidates for this  
include Reason-Phrase, filename-parm, warn-text, as well as the  
comments in field-content.

3. If RFC2047 encoding is used / referenced, we need to more carefully  
specify its use; e.g., regarding what encoding forms are allowable,  
line length limits, charsets used, folding.

4. The From header also deserves a look.

5. Either the definition of TEXT or CTL may need the C1 control  
characters added.
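
To make #2 and #3 a bit more concrete, here is a rough sketch of what an
RFC2047 encoded-word looks like on the wire and how a recipient might
decode it. It leans on Python's standard email.header module purely as a
stand-in; the helper names encode_word and decode_words are made up for
illustration, and nothing here is meant as what the spec should require.

```python
from email.header import Header, decode_header


def encode_word(text: str, charset: str = "utf-8") -> str:
    """Encode text as RFC 2047 encoded-word(s) (illustrative helper)."""
    # Header picks the B or Q encoding form itself; which forms should be
    # allowed is one of the open questions in item 3.
    return Header(text, charset).encode()


def decode_words(value: str) -> str:
    """Decode any RFC 2047 encoded-words in a field value (illustrative helper)."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Unencoded runs come back as bytes with no charset; assume ASCII.
            parts.append(chunk.decode(charset or "ascii"))
        else:
            parts.append(chunk)
    return "".join(parts)


encoded = encode_word("résumé")
decoded = decode_words(encoded)
latin1 = decode_words("=?ISO-8859-1?Q?caf=E9?=")
```

The same value can legitimately be sent in more than one form (B or Q, various
charsets), which is part of why the allowable forms need to be spelled out.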

As Roy states, #1 should be approached conservatively. I think we can  
quickly get a decision on #2, #3, and #5. If we later decide to allow  
UTF-8, we can readjust the text (and TEXT) to suit, but if we're going  
to punt on this, I'd like to at least nail down the parts of it that  
we know we have to deal with.
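
For #5, a small sketch of the difference it would make. RFC2616 defines CTL
as octets 0-31 plus DEL (127), and TEXT as any OCTET except CTLs (but
including LWS), so the C1 range 0x80-0x9F currently counts as TEXT. The
function names and the exclude_c1 switch below are hypothetical, and the
check assumes an already-unfolded field value (so CR and LF are simply
treated as CTLs).

```python
def is_ctl(octet: int) -> bool:
    """CTL as in RFC 2616: US-ASCII control characters (0-31) and DEL (127)."""
    return 0 <= octet <= 31 or octet == 127


def is_c1(octet: int) -> bool:
    """C1 control range as it appears in ISO-8859-1: 0x80 through 0x9F."""
    return 0x80 <= octet <= 0x9F


def is_text_octet(octet: int, exclude_c1: bool = False) -> bool:
    """Per-octet TEXT check on an unfolded value (illustrative only).

    HT is allowed because it can appear in LWS; exclude_c1=True models the
    possible change discussed in item 5.
    """
    if not 0 <= octet <= 255:
        raise ValueError("not an octet")
    if octet == 0x09:  # HT, permitted as part of LWS
        return True
    if is_ctl(octet):
        return False
    if exclude_c1 and is_c1(octet):
        return False
    return True


# 0x85 (NEL) passes today's TEXT rule but would be rejected with C1 excluded.
today = is_text_octet(0x85)
proposed = is_text_octet(0x85, exclude_c1=True)
```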


--
Mark Nottingham     http://www.mnot.net/

Received on Wednesday, 2 April 2008 05:55:42 UTC