Re: Round 3: moving HTTP 1.0 to informational from Larry Masinter on 1996-02-09 (ietf-http-wg@w3.org from January to March 1996)

From: Larry Masinter <masinter@parc.xerox.com>
Date: Fri, 9 Feb 1996 00:53:59 PST
To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <96Feb9.005404pst.2733@golden.parc.xerox.com>
I concur with all of Roy's proposed changes except for the following
discussion items:

draft:
>> This specification reflects the approximate state of those features which
>> are found in most HTTP/1.0 implementations. The specification is split into
>> two sections. Those features of HTTP for which implementations are usually
>> consistent are described in the main body of this document. Those features
>> which have few implementations or inconsistent ones are listed in Appendix
>> D.

Roy:
> I still object to "the approximate state of" -- it isn't needed and is not
> accurate.  Also, this should be a replacement for the last two sentences of 
> the first paragraph of section 1.1, not a separate paragraph.

How about:

# This specification describes those features that seem to be
# consistently implemented in most HTTP/1.0 clients and servers.

This removes the word 'approximate', replaces the requirement that
the feature be 'found' with the more appropriate constraint of
'consistent implementation', and restricts the domain to clients and
servers (i.e., omitting proxies).

================================================================
Roy:
> We should also add:
>                               Recipients must ignore any media type
> parameters whose names they do not recognize.

Could you explain what you mean by "ignore"?  If a recipient merely
stores the entity and then regurgitates it later, for example, it
should not discard media type parameters that it does not recognize.
On the other hand, if you mean 'do not process' by 'ignore' then
perhaps you want another word?  "Leave unmolested", "must not modify"?
Aren't we just better off not adding this?
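
To make the distinction concrete, here is a minimal sketch of a
recipient that "does not process" unrecognized parameters yet keeps
them intact when it passes the entity on. The names parse_media_type,
KNOWN_PARAMS, params_to_process and serialize are hypothetical, and the
parser is deliberately naive (it assumes no ";" inside quoted values).

```python
# Hypothetical: the only parameter this recipient knows how to act on.
KNOWN_PARAMS = {"charset"}


def parse_media_type(value):
    """Split a Content-Type value into (type, [(name, value), ...]).

    Every parameter is kept, in order, whether recognized or not.
    Naive: assumes no ';' appears inside a quoted parameter value.
    """
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    params = []
    for part in parts[1:]:
        if not part:
            continue
        name, _, val = part.partition("=")
        params.append((name.strip().lower(), val.strip().strip('"')))
    return media_type, params


def params_to_process(params):
    """Return only the parameters this recipient acts on."""
    return {name: val for name, val in params if name in KNOWN_PARAMS}


def serialize(media_type, params):
    """Rebuild the header value, unrecognized parameters included."""
    return "; ".join([media_type] + [f"{n}={v}" for n, v in params])


media_type, params = parse_media_type("text/html; charset=ISO-8859-1; x-foo=bar")
print(params_to_process(params))      # what gets acted on
print(serialize(media_type, params))  # what gets stored or forwarded
```

Under this reading, "ignore" affects only params_to_process; the stored
or forwarded value still carries x-foo.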

================================================================
Roy:

>   In addition, if the text media is represented in a character
>   set which does not use octets 13 and 10 for CR and LF respectively, as
>   is the case for some multi-byte character sets, HTTP allows the use
>   of whatever octet sequences are defined by that character set to
>   represent the equivalent of CR and LF for line breaks.  It is
>   assumed that any recipient capable of using such a character set
>   will know the appropriate octet sequence for representing line
>   breaks within that character set. 

which is contentious and does not represent current practice, as far
as I can see. I've found sites that do UTF-8, Shift-JIS, EUC, etc.
but have yet to find a site that does UCS-2; I've found a browser that
does UCS-2 but it hardly represents a feature that is consistently
implemented.
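
As a small illustration of why UCS-2 is the odd one out, the sketch
below prints the octets each encoding uses for CR LF. It uses Python's
codec names as stand-ins, and "utf-16-be" stands in for big-endian
UCS-2 (the two agree for these characters); line_break_octets is a
hypothetical helper name.

```python
def line_break_octets(charset):
    """Return the octet sequence that represents CR LF in `charset`."""
    return "\r\n".encode(charset)


for charset in ("us-ascii", "iso-8859-1", "utf-8",
                "shift_jis", "euc_jp", "utf-16-be"):
    print(f"{charset:12} {line_break_octets(charset)!r}")
```

The first five all yield octets 13 and 10; only the 16-bit encoding
produces a different sequence (0, 13, 0, 10).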

While I think this is an important point to deal with, I'd like to see
the HTTP/1.0 draft proceed without trying to untie this particular
knot. So, I would like to leave this out.

================================================================
draft:
> Media types of "text/*" are defined to have a default charset parameter of
> "US-ASCII", and that other charset parameters should be labelled. In
> practice, HTTP servers frequently send text data without a charset
> parameter, and expect clients to guess the character set of the result.
> This has caused a great deal of confusion and lack of interoperability in
> HTTP 1.0 clients and servers.

Roy:
> This is incorrect and not representative of current practice OR recommended
> practice.

I will stand by the assertion that as far as I can tell, the first two
sentences correctly describe current practice. I'm not sure about
"great deal" in the third sentence, though. I agree completely that it
is not recommended practice.  I'm not sure what (else) you mean by
"this is incorrect". I suppose that the wording leaves out the
recommendation that "ISO-8859-1" is a good first guess, and that as
such your revised wording gives more advice.

Roy:
>   The "charset" parameter is used with some media types to define the
>   character set (Section 3.4) of the data.  When no explicit charset
>   parameter is provided by the sender, media subtypes of the "text"
>   subtype are defined to have a default charset value of "ISO-8859-1"
>   when received via HTTP.  

>      Note: Some HTTP user agents provide a configuration option to
>      allow the user to change the default interpretation of the media
>      type character set when no charset parameter is given.  However,
>      use of such options is not consistent and leads to poor
>      interoperability across open systems.

Even though they're defined to have this default charset value,
current practice is that most servers just send what they have. The
use of the client options isn't what leads to poor interoperability;
the clients were just trying to cope with the inconsistent servers!

If you want to try to wordsmith this, be sure to write what you think
current practice is as well as recommended practice. Personally, I
still prefer it the way it was, with perhaps an addition that it is
recommended that servers supply ISO-8859-1 and that clients 'guess'
that format first, even though many other character encodings seem to
be used.
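
A minimal sketch of that client behaviour, assuming a labelled charset
always wins and ISO-8859-1 is the first guess otherwise. The function
name decode_text_body and the fallbacks parameter are hypothetical, and
the Content-Type parsing is naive.

```python
def decode_text_body(body, content_type, fallbacks=("iso-8859-1",)):
    """Decode a text/* body; return (text, charset actually used).

    Order of attempts: the charset parameter if the server sent one,
    then each entry of `fallbacks` (ISO-8859-1 first by default).
    """
    charset = None
    for part in content_type.split(";")[1:]:
        name, _, val = part.partition("=")
        if name.strip().lower() == "charset":
            charset = val.strip().strip('"')

    candidates = ([charset] if charset else []) + list(fallbacks)
    for candidate in candidates:
        try:
            return body.decode(candidate), candidate
        except (LookupError, UnicodeDecodeError):
            # Unknown label or undecodable octets: try the next guess.
            continue
    # Last resort: ISO-8859-1 maps every octet, so this cannot fail.
    return body.decode("iso-8859-1", errors="replace"), "iso-8859-1"


print(decode_text_body(b"caf\xe9", "text/plain"))
```

Note that an ISO-8859-1 guess never fails to decode, so an unlabelled
Shift-JIS or EUC body is silently misread rather than rejected, which
is why the label from the server matters.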
================================================================
Roy:
> That should be "Implementors of HTTP origin servers should ..."

If you have a server that's both a proxy and an origin,
should you not also restrict it? If you have a server that's meant to
be a proxy but can also be used as an origin etc. etc.?

================================================================
draft:
> example, Unix, Microsoft Windows, and other operating systems use ".."

Roy:
> Ummm, unless we want to include the TM disclaimer, that should be
> "example, some operating systems". [It is okay by me to include the disclaimer]

I've scanned current RFCs for instances of Unix and Windows and found
several without TM disclaimers. Why do you believe this is necessary?
================================================================
>> In RFC 1521, the header fields in multipart body-parts are generally
>> ignored

> ... unless the field-name begins with "Content-".

Could we just say 'most header fields' instead of 'the header fields'?
Received on Friday, 9 February 1996 00:55:43 UTC