Re: Problems with draft-ietf-http-v11-spec-07 from Klaus Weide on 1996-08-19 (ietf-http-wg@w3.org from July to September 1996)

From: Klaus Weide <kweide@tezcat.com>
Date: Mon, 19 Aug 1996 11:18:38 -0500 (CDT)
To: "Roy T. Fielding" <fielding@liege.ICS.UCI.EDU>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <Pine.SUN.3.95.960819074656.4634F-100000@xochi.tezcat.com>
On Sat, 17 Aug 1996, Roy T. Fielding wrote:

[me, kw:] 
> > I have to disagree that Section 2.2 (or the draft as a whole) is clear
> > about character sets.  What is included in parentheses in the quote 
> > above may be the intention, but it is not explicit.
> 
[politics and the history of charset specificity avoidance...]

[... lots snipped ]
> > I conclude that the draft is far from clear.
> 
> Well, look at how the BNF defines messages -- the only fields that are
> defined in terms of <field-content> are the extension fields (those not
> defined by the specification itself).

True, if you strictly follow the BNF only.

But you hardly want to say that the whole of 14.2, including the
#(values) rule whose textual description refers directly to the
BNF-defined message-header, applies to extension headers only.

> Saying that the draft is "far from clear" is not useful.  Do you have
> specific wording that could be added, and where it should be added, which
> would help clarify the issue without harming the protocol's extensibility?

I'll try...

Replace
       field-content  = <the OCTETs making up the field-value
                        and consisting of either *TEXT or combinations
                        of token, tspecials, and quoted-string>
with
       field-content  = <the OCTETs making up the field-value,
                        either as defined by more specific rules
                        (see 14, 19.6.2) or *TEXT for unrecognized
                        message headers>
Reason:
The former is unnecessarily specific; in fact it seems exactly equivalent
to
       field-content  = *(*TEXT | *(token | tspecial | quoted-string) )
which is apparently not the intention.

Also add after the first paragraph of 3.2, before 3.2.1:

  Whitespace is not allowed anywhere within an URI.  In other words,
  the last paragraph of 2.1 (implied *LWS) does not apply to the BNF
  rules in 3.2.1 and 3.2.2.

In 3.11 Entity Tags, add a sentence somewhere 'Whitespace is not
allowed between the "W/" prefix and the opaque-tag' (if you think
it should apply).
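
To illustrate what the suggested sentence would mean for a parser, here
is a minimal sketch in Python.  It assumes my reading of 3.11, i.e.
entity-tag = [ "W/" ] opaque-tag with opaque-tag being a quoted-string;
the function name and the simplified quoted-string pattern are my own
and not taken from the draft.

```python
import re

# Assumed grammar: entity-tag = [ "W/" ] opaque-tag, opaque-tag = quoted-string.
# The quoted-string pattern is simplified (any character or backslash pair
# between double quotes). No whitespace is accepted after the "W/" prefix.
ENTITY_TAG = re.compile(r'(W/)?"((?:[^"\\]|\\.)*)"')

def parse_entity_tag(value):
    """Return (is_weak, opaque_tag_contents), or None if value is not valid."""
    m = ENTITY_TAG.fullmatch(value)
    if m is None:
        return None
    return (m.group(1) is not None, m.group(2))
```

With this pattern, 'W/"xyzzy"' is accepted as a weak tag, while
'W/ "xyzzy"' is rejected because of the space after the prefix.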

In 9.2 OPTIONS, make the last sentence ("If the OPTIONS request passes
through a proxy,...") a separate paragraph, to make clear that it
applies to both preceding paragraphs (which handle "*" and not-"*",
respectively).

11 Access Authentication, 4th paragraph:
A user agent that wishes to authenticate itself with a server--usually,
but not necessarily, after receiving a 401 or 411 response--MAY do so by
                                              ^^^
The 411 seems to be in error here.

> > Some other (mostly BNF related) weirdnesses:
> > 
> > Since URI is comprised of tokens (see (a) above), the following
> > seems to apply:
> 
> I'll pass on the rest, since the assumption is false.
> 
> > Comments (within parentheses) should probably be allowed in more
> > places - at least, in 19.4.7
> >        MIME-Version   = "MIME-Version" ":" 1*DIGIT "." 1*DIGIT
> > should probably be
> >        MIME-Version   = "MIME-Version" ":" 1*DIGIT "." 1*DIGIT *comment
> 
> Why? A comment in that location serves no useful purpose.

Because I have seen headers like that in mail and news messages.
Apparently several MUAs/mime converters/emacs packages are creating 
them.  If this is included in the HTTP spec at all -- since it has no
significance to the HTTP protocol, why restrict the syntax unnecessarily?
There's even an explicit note in RFC1521 that the following are 
equivalent:
                    MIME-Version: 1.0
                    MIME-Version: 1.0 (Generated by GBD-killer 3.7)

Another place where trailing comments could be allowed would be the 
HTTP From header.  But I don't know how common that is in HTTP/1.0.
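
As a rough sketch of what a lenient recipient could do with MIME-Version,
the following Python accepts the version number followed by any number
of trailing comments.  The pattern and function name are hypothetical,
and nested comments are deliberately not handled.

```python
import re

# Hypothetical lenient pattern: 1*DIGIT "." 1*DIGIT, then zero or more
# parenthesised comments. Nested parentheses are not supported here.
MIME_VERSION = re.compile(r'\s*(\d+)\.(\d+)\s*(?:\([^()]*\)\s*)*')

def parse_mime_version(value):
    """Return (major, minor) from a MIME-Version field value, or None."""
    m = MIME_VERSION.fullmatch(value)
    if m is None:
        return None
    return (int(m.group(1)), int(m.group(2)))
```

Both example lines from RFC1521 above yield the same result under this
sketch, since the comment is simply skipped.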
 

> > 10.3.6 305 Use Proxy
> > 
> > The requested resource MUST be accessed through the proxy given by the
> > Location field. The Location field gives the URL of the proxy. The
> > recipient is expected to repeat the request via the proxy.
> > 
> > How exactly is a proxy "given" by a Location field?
> > Location normally contains an URI, and URIs point to resources but
> > not (normally) applications (the proxy).  Does the URI have to be
> > a http_URL, does the abs_path have to be empty (or is it required
> > to be "/"), and what if not?
> 
> On the contrary, proxies are normally identified by URL.  The URL
> does not need to be an http URL (though it would be in current practice)
> and the interpretation of the path (if any) would be dependent on
> the method of proxying (http would not use any path).

I don't understand what you mean by "normally".  I could not find
any indication in the draft that proxies are identified by an URL.  If
I have e.g. CERN httpd 3.0 running on my.dom.ain:80 with proxying
enabled, then "http://my.dom.ain/" (with or without the trailing "/")
would refer to that server's home page, not to "an intermediary
program which acts as both a server and a client".

I understand that you want to keep the spec open for future extensions,
but since this draft defines (a version of) HTTP, it should at least
define the use of this header with http_URLs.  Otherwise this response 
code will be useless (different implementations use it differently,
or nobody uses it), and it could just as well be defined as
"This code is reserved for future use" (like 402).

I suggest something like
   The requested resource MUST be accessed through the proxy given by the
   Location field.  If the Location field consists of a http_URL, it
   MUST NOT contain an abs_path, and the recipient is expected to repeat 
   the request via the proxy given by the host and, if present, port 
   in the http_URL.

   If the Location field contains an URL which is not an http_URL,
   it indicates the proxy in a way not defined by this protocol.
 
   If for some reason the recipient cannot repeat the request via the
   proxy given in the response, this MUST be treated as an error. 

However, on closer inspection, this doesn't make much sense either...
or is not nearly enough.  Some questions that are completely open:
Is a proxy allowed to act on this, or does it have to forward the
response, or are both allowed (dependent on configuration of the proxy)?  
Can the request, when repeated, be forwarded through another proxy,
or is direct connectivity required?
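
Leaving those open questions aside, the suggested wording for 305 could
be read as the following check on the client side.  This is only a
sketch in Python: the function name is hypothetical, the default port
of 80 is assumed for http_URLs, and query and fragment parts are
rejected along with the path as my own strict reading.

```python
from urllib.parse import urlsplit

def proxy_from_location(location):
    """Return (host, port) of the proxy named by a 305 Location value.

    Raises ValueError when the value cannot be used as proposed:
    a non-http URL, or an http_URL that carries an abs_path.
    """
    parts = urlsplit(location)
    if parts.scheme != "http":
        # Proxy indicated in a way not defined by this protocol.
        raise ValueError("not an http_URL")
    if parts.path or parts.query or parts.fragment:
        # Even a bare "/" counts as an abs_path here.
        raise ValueError("http_URL must not contain an abs_path")
    if not parts.hostname:
        raise ValueError("missing host")
    # Assumption: default to port 80 when no port is given.
    return (parts.hostname, parts.port or 80)
```

Under this reading "http://my.dom.ain:8080" names a proxy, while
"http://my.dom.ain/" is treated as an error because of the path.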

Also consider that this response is incompatible with clients which
don't understand it, since the location header here has a different
meaning from other cases.  Not even a compliant HTTP/1.1 client is
required to understand that 305 is basically different from other
3xx codes:  6.1.1: "HTTP applications are not required to
understand the meaning of all registered status codes, though such
understanding is obviously desirable."[1]  If such a client treats
a 305 response as equivalent to 300 (as recommended in 6.1.1), 
it may redirect the request to the proxy machine's home page -
not at all what was intended.

It would be cleaner not to overload "Location" in this way and
define a new "Proxy-Location" header instead.
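
The fallback problem can be shown with a few lines of Python.  The set
of known codes below is a made-up subset for a client that has never
heard of 305; the function name is hypothetical as well.

```python
# Illustrative subset of status codes a hypothetical client understands.
KNOWN_STATUS = {200, 300, 301, 302, 304, 400, 401, 404, 500}

def effective_status(code, known=KNOWN_STATUS):
    """Map an unrecognized code to the x00 code of its class (see 6.1.1)."""
    if code in known:
        return code
    return (code // 100) * 100
```

Such a client would handle 305 exactly like 300, and would therefore
take the Location value as the address of the resource rather than of
a proxy.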

[1] I don't understand why the spec is so lenient here.  Shouldn't
all HTTP/1.1 implementations be REQUIRED to recognize all the 
response codes in this spec?  There are several "client MUST"s
in other parts for specific response codes, and it's not clear 
whether "not understanding" a response code lets a client get away
with not following its MUSTs.  This includes 305.

> Please keep in mind that the spec does not prevent people from doing
> things that won't work -- it doesn't have to.

Of course it's not possible to absolutely prevent people from such
a thing.  But it should be possible to say "If you want to do X,
then better do it *this* way, because then it will also work with
other people's products who follow the same spec".

> >                           *   *   *
> > 
> > A remark regarding 14.1 Accept:
> > It's a pity there is a "q=", but not a "mxb=".  An oversight or
> > intentional?
> 
> The conneg group decided it "wasn't needed" based on the observation
> that browsers didn't implement it.

There is a way in the latest lynx code to specify it via the .mailcap
file, and I understand that the Apache server can make use of it.

> Koen is wrong in that the presence
> of Range does absolutely nothing to replace the functionality of mxb.
> The only problem with mxb is that it adds complexity to the process
> of configuring a browser and there is no convenient way to adjust the
> maximum based on the purpose of an individual request.

I would like to be able to use it as a general maximum for all requests,
as in "Do not *ever* send me something bigger than...".

> Given the
> lack of enthusiasm about Accept, and the growing complexity of content
> negotiation in general, there was not enough reason to restore it to
> the specification once it was removed.

Okay, I guess it's not forbidden to use it...

  Klaus
Received on Monday, 19 August 1996 09:22:51 UTC