Re: PROPOSAL: i74: Encoding for non-ASCII headers from Mark Nottingham on 2008-04-04 (ietf-http-wg@w3.org from April to June 2008)

From: Mark Nottingham <mnot@mnot.net>
Date: Fri, 4 Apr 2008 12:27:15 +1100
To: Roy T. Fielding <fielding@gbiv.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <A00EF74A-8A56-4B35-A4CD-002A23D64CEE@mnot.net>

Roy,

I want to make progress -- whether on this issue or others doesn't
matter much. If you have suggestions for doing so -- such as those
you've given below -- they're very welcome. In particular, if there are
issues that you (or anyone else) think we'd profit from focusing on,
I'd love to hear about them; I've repeatedly asked for input on this
and haven't received much.

Using the wiki to collect experience is a good idea, if people are  
willing to participate. Anyone with a tools.ietf.org password* can  
create and modify pages on it; all that I ask is that you coordinate  
with me before modifying the front page.

WRT this issue -- as should be clear from recent traffic, I'm trying to
carve off the parts of the issue that we can agree on now, so that we
don't have to rehash the entire discussion in a few months. The parts
that you contend aren't ready for discussion -- i.e., character
encoding and C1 controls -- are in i71, which is the part we're
deferring.


* See: http://www3.tools.ietf.org/newlogin


On 04/04/2008, at 12:09 PM, Roy T. Fielding wrote:

> On Apr 3, 2008, at 5:04 PM, Mark Nottingham wrote:
>> Roy, you often talk about things that will happen as part of the  
>> partitioning, but it's not clear what's involved. Do you have a  
>> roadmap of what you'd like to happen? If we're going to attempt  
>> substantial rewrites of different sections, we need to start that  
>> process soon.
>
> My roadmap is to make each part independent and complete.  It should only
> affect the requirements that are vaguely specified as applying to HTTP
> as a whole (most notably in part 1 where the message is parsed, where
> connections are managed, and the spaghetti cross-references and redundant
> requirements in caching).  Absolutely no protocol changes will occur as
> a result -- only removal of overspecification.
>
>> Without some idea of that, it's going to be difficult to make any  
>> forward progress if we have to block any substantive issues on  
>> future rewrites that we may or may not do.
>
> This is only the second issue (out of 90+) that I have asked to be
> deferred.  Are we making progress right now by driving for consensus
> on an absolutely pointless and UNIMPLEMENTED issue?  Even if you
> derive an answer, it is inherently bogus and can't be published
> because we still must have two implementations of every requirement.
> That's why I keep asking for demonstrated need -- if we can't demonstrate
> that a new requirement is needed, then there is no reason to discuss
> the text of a solution.  It is false progress and it prevents me from
> allocating my own time more wisely.
>
>>> The parsing algorithm will not say anything about C1 controls because
>>> no known implementation of HTTP checks for C1 controls.
>>
>> That doesn't follow. If there are security or interoperability implications
>> of C1 controls in text, they certainly deserve consideration.
>
> There aren't any for HTTP.
>
>> Also, while the message parsing machinery doesn't touch TEXT, there  
>> are other parts of implementations that do -- e.g., command-line  
>> tools, Web forms for configuring servers and proxies, configuration  
>> files, and so forth. Just because it's payload doesn't mean that it  
>> doesn't have implementation impact.
>
> It does have implementation impact in the four fields that I described,
> each of which is defined in a different location.  Two of those locations
> (body and defined header fields) are specified already.  Reason Phrase is
> solidly iso-8859-1 and already ignored.  That leaves extension-field,
> which can say whatever we like without impacting standard status because
> it is specifically defined for extensions.  This will be obvious after
> the message parsing algorithm in Part 1 is properly specified and
> verified against current practice, which is why I asked that the issue
> be deferred.  I'll have time to work on it after April 13, when I get
> back from ApacheCon in Amsterdam where I am giving two presentations
> next week.
>
> Meanwhile, if you really want to make progress on *this* issue, then the
> way forward is to start using the wiki to collect experience reports
> on what is and is not implemented in practice.  Further discussion is *not*
> making progress.  In particular, ask the working group to help demonstrate
> implementation practice instead of just offering opinions.  Encourage
> folks to try using SetHeader in Apache, its equivalent in IIS, Squid,
> and others.  Try doing the same with JavaScript or extensions in browsers.
> We might find that raw UTF-8 is already possible for extension fields.
> Or it may be that OCTETs are the only things that matter and no existing
> character encoding truly applies.  In any case, it is easier to find
> compliant implementations if we reduce the number of existing requirements
> rather than add to them.
>
> ....Roy
>
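
For anyone wanting to try the "raw UTF-8 in extension fields" experiment
suggested above, here is a minimal Python sketch of what to look for. It is
an illustration only, not code from any implementation mentioned in this
thread. The function names are hypothetical, and it assumes the receiver
treats the field value as ISO-8859-1 octets. It shows that some UTF-8
sequences contain octets in the C1 control range (0x80-0x9F) when read
that way.

```python
# Illustrative sketch; c1_positions and describe_field_value are
# hypothetical helper names, not part of any HTTP library or server.

def c1_positions(octets: bytes) -> list[int]:
    """Return the indexes of octets that fall in the C1 range 0x80-0x9F."""
    return [i for i, b in enumerate(octets) if 0x80 <= b <= 0x9F]


def describe_field_value(value: str) -> dict:
    """Encode a field value as raw UTF-8 and show an ISO-8859-1 reading."""
    octets = value.encode("utf-8")
    return {
        "octets": octets,
        # ISO-8859-1 maps every octet to a code point, so this never fails.
        "as_latin1": octets.decode("iso-8859-1"),
        "c1_positions": c1_positions(octets),
    }


if __name__ == "__main__":
    # ASCII only, a Latin-1 letter, and the euro sign (U+20AC).
    for sample in ("plain", "caf\u00e9", "\u20ac5"):
        print(repr(sample), describe_field_value(sample))
```

Because the ISO-8859-1 reading is lossless at the octet level, a recipient
that simply passes octets through would preserve the UTF-8 value. Whether
real servers, proxies, and browsers do so is exactly what the experience
reports would need to establish.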


--
Mark Nottingham     http://www.mnot.net/
Received on Friday, 4 April 2008 01:27:53 UTC