Re: Comments (Part 2) on HTTP I-D Rev 05 from Jim Gettys on 1998-11-16 (ietf-http-wg@w3.org from October to December 1998)

From: Jim Gettys <jg@pa.dec.com>
Date: Mon, 16 Nov 1998 11:40:53 -0800
To: "Adams, Glenn" <gadams@spyglass.com>
Cc: http-wg@hplb.hpl.hp.com
Message-Id: <9811161940.AA17038@pachyderm.pa.dec.com>
Oops, I hit "send" long before I was done by mistake.

> From: "Adams, Glenn" <gadams@spyglass.com>
> Date: Mon, 2 Nov 1998 19:42:17 -0600
> To: "'Masinter, Larry'" <masinter@parc.xeroc.com>,
>         "'Getty, James'"
>          <jg@pa.dec.com>
> Cc: "'WG - HTTP'" <http-wg@hplb.hpl.hp.com>
> Subject: Comments (Part 2) on HTTP I-D Rev 05
> 
> Following is my second set of comments, covering sections 12 through
> 14.9.3. Part three will follow presently.
> 
> 52. Section 12 uses the phrase "agent-driven negotiation"; suggest
> adding a note explaining that "agent" refers to "user agent".

It isn't necessarily a user agent.

> 
> 53. Section 12.2, pg. 68, 1st para., has "(this specification reserves
> the field-name Alternates)", however this field name is not described
> in section 14 as a reserved field name nor otherwise elaborated
> elsewhere in this specification.

Yes, this is left to the content negotiation people to define.
It is inappropriate to do more than reserve it, so that others
don't usurp the header name.  It should be "header-name" however.

> 
> 54. Throughout the whole of section 13 it is often unclear as to
> whether a requirement or statement is meant to apply only to a proxy
> cache, to a user agent cache, or to both. For example, in 13.1.1, the
> 6th paragraph has "If the cache can not [sic] communicate with the
> origin server ..." which appears to apply to either a proxy or a user
> agent cache. On the other hand, the 7th paragraph has "If a cache
> receives a response (...) that it would normally forward to the
> requesting client ..." which appears to apply to a proxy cache only.
> Suggest editing the entirety of section 13 to clarify applicability to
> these different caches.

"cache" is the general term.  There can be client or proxy caches,
and if it applies to both, the general term is used.  Saying
"proxy and client cache" all the time would just get confusing,
verbose and painful.  And there can be caches at the server.
And there are braindead network caches to boot.  The generic term
is best when we are talking about caching systems, or someone
will think a requirement doesn't apply to them, and get the wrong
answer.

We also don't want to get caught up in terminological problems that can 
occur if a proxy is on the client system.  Web caching systems also often 
get implemented as fairly separate parts of browsers, rather than being
entangled with them, so such requirements can apply to client caches (or
caches on clients) as well, which illustrates the slippery slope.

If you want to point out particular places you think there are problems,
we can go into things further, but I'm not going to entertain 
"editing the entirety of section 13" at this very, very, late date.

> 
> 55. Section 13, pg. 69, 3rd para., has "the protocol requires that
> transparency be relaxed ... only ... only ..."; suggest changing to use
> the keyword phrase "MUST NOT relax transparency unless" to make clear
> the requirement.
> 

This is explanatory text rather than the requirement itself
and it isn't appropriate for there to be a normative reference.

> 56. Section 13.1.1, pg. 70, 1st para., has "A correct cache MUST
> respond with the most up-to-date response ..."; since caching is always
> optional, this would read better as "A cache MUST NOT respond with a
> response held by the cache that is appropriate to the request (...)
> unless it meets one of the following conditions:".

This is an improvement?

> 
> 57. Section 13.1.1, pg. 70, numbered item (3) implies that 4XX and 5XX
> responses are cachable. While this is true for 410, under what
> circumstances should any 5XX response be cached?

If they were explicitly marked as cachable.  I don't think
that is very likely given the class of error, but might happen.

> 
> 58. Section 13.1.2, 4th para., has "whether the Warning MUST or MUST
> NOT be deleted ..." which does not state a requirement per se: use
> "is or is not to be deleted ...".

See the ROSS20 issue.

> 
> 59. Section 13.1.2, 5th para., has "Warnings in responses that are
> passed to HTTP/1.0 caches carry an extra warning-date field, which
> prevents a future HTTP/1.1 recipient from believing an erroneously
> cached Warning."  I can't interpret this statement based on information
> in this section.  Please explain it further and state as a requirement
> if indeed it is a requirement.

It is explanation: the actual requirements and full explanation are stated 
in the last paragraph of 14.46.  I'll move the last sentence of 13.1.2 
(the cross reference to 14.46) up toward the beginning of the section 
so that it is clearer that this discussion is to be read together with 
14.46.

> 
> 60. Section 13.1.2, 7th para., has "a server might provide the same
> warning with texts in both English and Basque". How would a UA
> discriminate among different warnings using different languages unless
> the language were explicitly marked? Unfortunately, RFC2047 does not
> address this issue. I'd suggest permitting the extensions specified by
> RFC2231 (which updates RFC2047) to be used to provide explicit language
> tagging of quoted strings.


> 
> 61. Section 13.1.3, 2nd para., has "if there is any apparent conflict
> between header values, the most restrictive interpretation is applied
> ...".  Change "is applied" to "MUST" or "SHOULD" "be applied".

No, I think this can't be made normative...  It doesn't lay out what
the conflicts might be, what "most restrictive" might be, and is
a general rule; it serves as guidance.

> 
> 62. Section 13.1.4, 1st para., change "the user agent might allow ..."
> to "the user agent MAY allow ...".

These are all examples in explanatory text.  Hence the "might"; it
is inappropriate to word them with MAY, as they are hypothetical cases
rather than explicit protocol elements.

> 
> 63. Section 13.1.4, 2nd para., has "the user agent SHOULD explicitly
> indicate to the user ..." while section 13.1.5, 1st para., has "This
> allows the client software to alert the user". This later statement
> appears to make the indication/alert optional rather than recommended
> as implied by the former.
> 

Yup.  Thanks...

I'll reword the last two sentences to be:

  "Whenever a cache returns a stale response, it MUST mark it as such (using
  a Warning header), enabling the client software to alert the user
  that there might be a potential problem."
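
A minimal sketch of the reworded rule, for illustration only. It assumes
the cache already knows the response's current age and freshness lifetime
in seconds, and that warn-code 110 ("Response is stale") is the relevant
code. The function name and the "cache.example" warn-agent are
hypothetical.

```python
def mark_stale(headers, current_age, freshness_lifetime, agent="cache.example"):
    """Return a copy of headers, adding a Warning if the response is stale.

    headers is a list of (name, value) pairs; ages are in seconds.
    The warn-agent name is a placeholder, not a real host.
    """
    out = list(headers)
    if current_age > freshness_lifetime:
        # warning-value = warn-code SP warn-agent SP warn-text
        out.append(("Warning", f'110 {agent} "Response is stale"'))
    return out
```

The client software can then look for the Warning header to decide
whether to alert the user.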


> 64. Section 13.2.3, 3rd para., has "HTTP/1.1 requires origin servers to
> send a Date header, if possible, with every response ..." seems to be
> stating a conditional imperative. Rather than paraphrasing section
> 14.18 and possibly confusing the requirements regarding Date header
> transmission, I'd suggest rephrasing this to simply refer to a Date
> header, if present, and to state what must be done in the case that a
> Date header is not present.

I don't see that this hurts: the statement is true, with the wiggle room
to allow for clockless servers, as laid out in 14.18.  And the cross reference
is there.

> 
> 65. Section 13.2.3, pg. 76, 1st para. after pseudo code block, has "the
> server MUST"; suggest changing to "the proxy server MUST".

Actually, I think it should be "the cache MUST", as you might have
a client cache performing the computation, not explicitly a proxy
server.

Note that one could in principle have a cache built in to the origin server 
(some people call these "HTTP accelerators") and the same rule should 
apply to these caches.

> 
> 66. Section 13.3, pg. 78, 1st para., suggest changing "it first has to
> check" with "it will normally check" to make this read more as a
> statement of fact than a requirement.

No, the requirements stated elsewhere are pretty strong requirements. 
They are mandatory requirements, with a wiggle or two for island caches. 
Your rewording makes it sound like those requirements are not as strong 
as they are.

> 
> 67. Section 13.3.3, pg. 81, next to last paragraph, has "A cache or
> origin server ...". Change to "A caching proxy or origin server ...".

No, you can have a local cache which has to make the same computation.
The requirement is on them as well.

> 
> 68. Section 13.3.4, pg. 83, 2nd para., starts "A note on rationale:
> ..."  Suggest changing this to a standard note form, i.e., use "Note:"
> prefix with indented block paragraph.
> 

It is already a block paragraph.  I suppose I can just say Note:.

> 69. Section 13.4, pg. 83, 2nd para., starts "Note that ...". Suggest
> changing this to a standard note form, i.e., use "Note:" prefix with
> indented block paragraph. Further, this note has "some HTTP/1.0 caches
> are known to violate this expectation without providing any Warning."
> What warning should be provided in this case?

Ok, Note: it is. 

The expectation is the one in the first paragraph. A 1.1 cache (or a 1.0 
cache implementing Warning) would give you a 1xx Warning, depending on 
the circumstances (little or no network connectivity).  But many/most
1.0 caches don't implement Warning, so you won't see a warning from them.

> 
> 70. Section 13.4, pg. 83, 3rd para., has "so that the server can
> indicate that certain resource entities, or portions thereof, MUST NOT
> be cached ..." which does not appear to state a specific requirement
> per se. Suggest changing to "... are not to be cached ...".
> 

Yup.  You are right.  I'll fix.

> 71. Section 13.4, pg. 84, 1st para., starts "Note that ...". Suggest
> changing this to a standard note form.

Ok, will do.

> 
> 72. Section 13.4, pg. 84, 3rd para.: suggest giving status codes 302
> and 307 as examples of responses which are not cachable by default but
> which may be explicitly marked as cachable by using Expires or the
> "public" cache-control directive.

I'll add an: "(e.g. status codes 302 and 307)" to the sentence.

> 
> 73. Section 13.5, 1st para., remove comma in "to requests, for use ...".

Yup.  I'll fix.

> 
> 74. Section 13.5.1, pg. 84, 1st para, 1st bullet, has "End-to-end
> headers which MUST be ...". The use of MUST and SHOULD keywords in
> relative clauses is problematic and should be avoided since it does not
> state a requirement per se. Most such instances can be replaced by some
> form of "is" to express simple fact; e.g., "End-to-end headers which
> are ...".

OK, I'll delete the ", which".  But the MUST is necessary in this case
as it is a requirement that all end-to-end headers are transmitted.


> 
> 75. Section 13.5.1, pg. 84, 4th para., has "Hop-by-hop headers
> introduced in future versions of HTTP MUST be listed in a Connection
> header ..."  stipulates a requirement on the authors and/or
> implementors of future versions of HTTP, and not on implementors of
> HTTP/1.1. Suggest rephrasing this appropriately.
> 

You can introduce a hop-by-hop header in HTTP/1.1 as well, by
using Connection.

I'll rephrase to be:

"Other hop-by-hop headers MUST be listed in a Connection header (section 
14.10) to be introduced into HTTP/1.1 (or later)."
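
As an illustration of why the Connection listing matters, here is a
minimal sketch of how a proxy or cache might separate end-to-end headers
from hop-by-hop ones. The eight fixed names are assumed to be the
hop-by-hop set of section 13.5.1 as it appears in RFC 2616 (the draft
under discussion may differ), and the function name is hypothetical.

```python
# Assumed fixed hop-by-hop set (lower-cased), per RFC 2616 section 13.5.1.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}

def end_to_end_headers(headers):
    """Drop hop-by-hop headers, including any named in a Connection header.

    headers is a list of (name, value) pairs; names compare case-insensitively.
    """
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            # Each comma-separated token names a header that is hop-by-hop here.
            listed.update(t.strip().lower() for t in value.split(",") if t.strip())
    drop = HOP_BY_HOP | listed
    return [(n, v) for n, v in headers if n.lower() not in drop]
```

A header the recipient has never heard of is only recognizable as
hop-by-hop if the sender names it in Connection, which is what the
rephrased sentence requires.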

> 76. Section 13.5.2, pg. 85, 4th para., has " ... of the following
> fields in message that ..." which needs an article "a" before "message".

Yup.

> 
> 77. Section 13.5.2, pg. 85, 5th para., has "if not already present, it
> MUST add a Warning 214 (...) if one does not already appear ..." which
> uses redundant language about "already present"/"already appear[ing]".

Oh, another one for the department of redundancy bureau...

> 
> 78. Section 13.5.3, pg. 86, 5th para., has "all such old headers are
> replaced." which sounds like a requirement: "... MUST be replaced."

I think the sentence you complain about should be pulled up to the
paragraph starting "Unless".

    "Unless the cache decides to remove the cache entry, it MUST also
    replace the end-to-end headers stored with the cache entry with
    corresponding headers received in the incoming response.  If a header
    field-name in the incoming response matches more than one header in
    the cache entry, all such old headers MUST be replaced."

I also suspect that changing the "are" to a "MUST be" makes
it clearer that this is a requirement.

> 
> 79. Section 13.6, pg. 87, 3rd para., has "When the cache receives a
> subsequent request whose Request-URI specifies one or more cache
> entries including a Vary header field, ...". Suggest changing "one or
> more cache entries including a Vary header field" to "one or more cache
> entries of previous responses which contained a Vary header field".
> Further, it appears that this paragraph implies a caching proxy
> context, but it is not clear that this would not also apply to a user
> agent cache. The next paragraph (end of pg. 87 and beginning of pg. 88)
> appears to be framed as applying only to a proxy. Again it isn't clear
> that this does not apply to a UA cache.

If the document just says "cache", it applies to all caches, not
just proxies.  Vary certainly applies to client caches as well.
We're not going to make the text more complicated (it is complicated enough
as it is) by saying "proxy and user agent caches" all over.

> 
> 80. Section 13.8, pg. 88, 1st para., implies the context of a caching
> proxy, requiring a 206 response when forwarding a partial response. In
> the case of a user agent cache that receives and wishes to use a
> partial response, it would seem that a Warning should be added by the
> UA cache to the response it generates for internal UA consumption.
> However, there appears to be no Warning code that would serve this
> purpose.

Again, this applies to all caches.  There is no implication that
it only applies to a proxy.

You can do what you want in the user agent, including passing back
other information (and there are obvious hacks possible of internal
warnings you might generate).

> 
> 81. Section 13.11, 1st para., has "All methods that might be expected
> to cause modifications to the origin server's resources MUST be written
> through to the origin server. This currently includes all methods
> except for GET and HEAD." It would be better here to specify the
> methods that must be written through explicitly: PUT and DELETE. It
> isn't clear that OPTIONS, POST, or TRACE fall in this category; and
> then there's CONNECT, which doesn't fit into either of the above
> groups of methods.

No, HTTP is extensible, so we can't enumerate methods here. I
don't know how it could be any more clear than "all methods except
GET and HEAD".  We don't define the semantic meaning of CONNECT in
our document.  Certainly, OPTIONS, POST and TRACE all have to be
written through.
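
A minimal sketch of the rule as stated, showing why an exclusion list
copes with extension methods where an enumeration would not. The names
are hypothetical, and method tokens are compared case-sensitively.

```python
# The only methods the quoted text exempts from write-through.
NOT_WRITTEN_THROUGH = {"GET", "HEAD"}

def must_write_through(method):
    """True if a cache has to pass this request on to the origin server.

    Any method not explicitly exempted is written through, so methods
    defined by later extensions are covered without changing this code.
    """
    return method not in NOT_WRITTEN_THROUGH
```

Under this sketch OPTIONS, POST, TRACE, and any method invented later all
return True.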

> 
> 82. Section 14, pg. 91: suggest adding a sentence to each header
> defined by this section that states whether the header is end-to-end or
> hop-by-hop and whether the header is cachable by default, cachable by
> explicit cache directive, or never cachable.

No.  This duplicates existing text in section 13.  Responses are
cachable, not headers, in any case.

> 
> 83. Section 14.1, pg. 92, 6th para., has "the most specific reference
> has precedence"; suggest using "SHOULD" or "MUST" "take precedence".

No.  There is already a SHOULD above that says what to do.

> 
> 84. Section 14.2, pg. 93, 3rd para., is quite confusing: suggest
> rewriting without using the term "mentioned". Also, this para. seems to
> be stating that "iso-8859-1;q=1" is always implied unless
> otherwise explicitly present. This means that:
> 
>     Accept-Charset: iso-8859-5, unicode-1-1;q=0.9
> 
> really means
> 
>     Accept-Charset: iso-8859-1;q=1, iso-8859-5;q=1, unicode-1-1;q=0.9
> 
> (in which case 8859-1 would be given equal billing with 8859-5). And
> that consequently the only way to exclude 8859-1 is to specify
> 
>     Accept-Charset: iso-8859-1;q=0, iso-8859-5, unicode-1-1;q=0.9
> 
> Is this the intended usage? If so, I find this not only convoluted but
> seriously sub-optimal. This emphasis on 8859-1 as default really is too
> much. Why go so far overboard?

Independent thread of discussion started.
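
To make the reading in the comment above concrete, here is a minimal
sketch that expands an Accept-Charset value the way the comment describes:
iso-8859-1 gets q=1 unless it is mentioned explicitly. The function name
is hypothetical, and the exemption when "*" is present is an assumption
taken from the RFC 2616 wording, not from this thread.

```python
def charset_qvalues(header):
    """Parse an Accept-Charset value into {charset: q}, adding the implied default."""
    q = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        qval = 1.0  # a charset listed without a q parameter defaults to q=1
        for p in params.split(";"):
            key, _, val = p.strip().partition("=")
            if key.strip().lower() == "q" and val.strip():
                qval = float(val)
        q[name.strip().lower()] = qval
    # Implied default under the reading discussed above (assumption: not when "*" is given).
    if "*" not in q and "iso-8859-1" not in q:
        q["iso-8859-1"] = 1.0
    return q
```

With this sketch, "iso-8859-5, unicode-1-1;q=0.9" yields iso-8859-1 at
q=1 alongside iso-8859-5, and only an explicit "iso-8859-1;q=0" removes it.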

> 
> 84. Section 14.2, pg. 93, 4th para., has "the server SHOULD send an
> error response with the 406 (not acceptable) status code, though the
> sending of an unacceptable response is also allowed." The effect of the
> final clause of this statement is to downgrade SHOULD to MAY.  Either
> remove the final clause or change to MAY. [My preference is to remove
> the final clause.]  Note that the semantics stated here are expressly
> different from Accept and Accept-Encoding which do require 406
> responses for unconditionally compliant implementations.  This
> inconsistency will make it difficult or impossible to implement
> agent-driven content negotiation based on Accept-Charset variants.

Independent thread of discussion started.

> 
> 85. Section 14.3, pg. 94, 1st para., has "A server tests whether ...";
> suggest changing to "A server MUST test ...".

Hard to say MUST, when the most that can be said is SHOULD on what
the server will do.  (MUST is reserved in IETF parlance for what is
required for the protocol to function.)  I like it better as is.

> 
> 86. Section 14.3, pg. 94, last para., has "This means that qvalues will
> not work and are not permitted with x-gzip or x-compress." This appears
> to be stating a compatibility requirement, in which case MUST or SHOULD
> is better (in which case this note would have to be made normative).

No, the note is there to allow implementers to do things that
maximize interoperability with the installed base.  Identity was
introduced to solve an existing problem in that installed base,
and so is not out there.  We're providing guidance for implementers to
minimize problems with this installed base.  But there will be problems,
no matter how hard we try.

Life is hard, and then you die; HTTP was buggy in this area.

> 
> 87. Section 14.4, pg. 95, next to last para., has "recommended" and
> "MUST NOT" in a note. Either make this normative (not a note) and use
> SHOULD and MUST NOT consistently or use "ought" and "ought not",
> respectively.

I'll put it inline just to reduce confusion.

> 
> 88. Section 14.4 does not contain language as found in other Accept-*
> headers that recommends a 406 response in the case the server cannot
> satisfy the request based on its variant set for the specified URI.
> This precludes implementing client-side content negotiation along this
> variance axis. Suggest adding the required language or a note
> indicating why it is not present and what this means for client-side
> negotiation.

Independent thread of discussion started.

> 
> 89. Section 14.5. I find "Accept-Ranges" to be inconsistent in its name
> and usage with other Accept-* headers. This really should have been
> handled with Expect. Unless you can change this to use Expect (and I
> doubt you can at this stage), I'd suggest adding a note to indicate
> this inconsistency. I'd also urge adding a mechanism which does use
> Expect and deprecates the use of Accept-Ranges in a future version of
> HTTP. [If "Allow-Ranges" had been used instead, at least this would be
> consistent with the "Allow" header usage.]
> 

We all know the name is a bug.  But it is too late to change.  Too late
to use expect for it.  And I don't think it is worth introducing into
expect in the future, since it is an optional field, mostly documented
for the sake of existing practice.


> Regarding the actual use of Accept-Ranges, which response would be
> appropriate for a server which sends "Accept-Ranges: none" in response
> to a Range request?  406? Some mention of the appropriate response
> should be made here.

No, it is advisory (a hint from the server: "please don't bother to ask,
as I can't").

> 
> 90. Section 14.8, pg. 97-98, seems to imply a caching proxy when
> referring to "shared cache"; however, this seems to apply as well to a
> shared cache on a user agent. Suggest making it clear which kinds of
> caches are addressed by these paragraphs.

It is a cache shared between users, as defined in 13.7, to which
it has a correct cross reference.  And yes, it could be a shared
cache in a client machine; if it is shared between users, it is a shared
cache.  If it is private to an individual user, it is a private cache.

Caching proxies aren't necessarily shared (though they likely are).

If we didn't make appropriate distinctions, we'd get into real trouble.  
You have to keep the concepts separate, or you'll get into trouble on 
privacy grounds, and/or return documents to the wrong people.

> 
> 91. Throughout the whole of section 14.9, it is often unclear as to
> whether a requirement or statement is meant to apply only to a proxy
> cache, to a user agent cache, or to both.

Caches are caches.  The requirements apply to caches.  If we say
cache, you can presume a user agent cache is included...

> 
> 92. Section 14.9, pg. 98, 1st para., has "that MUST be obeyed by all
> caching mechanisms" which does not specify a requirement per se (note
> use of MUST in relative clause).

This is saying that the Cache-Control header is mandatory to implement 
in HTTP/1.1, everywhere along the chain (server, proxy, client caches, 
etc.).  Note that a simplistic cache implementation might choose not to 
cache under a lot of circumstances in which it could conceivably cache, but we 
have to demand it be implemented at least at the minimal level.  Otherwise,
it won't be of any use.

Ergo, you MUST obey the directives.


> 
> 93. Section 14.9, pg. 99, 1st para., has "When a directive appears
> without any 1#field-name parameter, the directive applies to the entire
> request or response." At the present, no cache-request-directive
> employs a 1#field-name parameter (see pg. 98); consequently all request
> directives apply to the entire request in all cases.
> 

The text already makes it clear that this is an extensibility hook; see
14.9.6 for an example.

> 94. Section 14.9.1, pg. 99, 2nd para., has "Indicates that the response
> is cachable by any cache, ...". Suggest changing "is cachable" to "MAY
> be cached".
> 
Ok, I'll change to "MAY be cached".

> 95. Section 14.9.1, pg. 99, last para., has keyword MUST NOT in
> resultative clause "Indicates that ..."; suggest rephrasing as
> imperative, e.g., "A response containing the cache directive 'private'
> MUST NOT be cached by a shared cache.".

I can't get excited about this rewording...  And it drops the intent phrase.
No.

> 
> 96. Section 14.9.1, pg. 100, 2nd para. under "no-cache", has "the
> specified field-name(s) MUST NOT be sent in the response to a
> subsequent request without successful revalidation with the origin
> server." followed by "This allows an origin server to prevent the
> re-use of certain header fields in a response, while still allowing
> caching of the rest of the response." I find this rather confusing. My
> reading of this is that a cache can, indeed, retain and reuse a field
> specified in a no-cache directive as long as it revalidates the entry
> with the origin server.  Furthermore, it appears that "no-cache" with
> no field name is to be interpreted identically to must-revalidate. I
> have always interpreted no-cache without a field-name to mean don't
> cache the response under any circumstances. Which is it?

Again, you are confused by the extensibility hook.  We define the default 
behavior for the directive (not to cache), and allow extensions to relax 
that behavior (possibly cache, depending upon these future extensions).

> 
> 97. Section 14.9.2, pg. 100, 1st para., needs to do a better job of
> defining "store" without using keywords in the term or definition.
> Also, it would be better to place this definition at the beginning of
> this section. I would suggest a rewrite as follows:
> 
> "The purpose of the no-store directive is to prevent the inadvertent
> release or retention of sensitive information. In the present context,
> 'store' means to intentionally retain data in non-volatile storage.
> The no-store directive applies to the entire message, and MAY be sent
> either in a response or in a request. If sent in a request, a cache
> MUST NOT store any part of the request or its response. If sent in a
> response, a cache MUST NOT store any part of either the response or the
> request that elicited it. This directive applies to both shared and
> non-shared caches."
> 
> In this rewrite, I've removed the statement about removing data from
> volatile storage, since this doesn't seem to apply to the semantics of
> this directive.  In particular, many caches employ a volatile and a
> non-volatile component.  The described semantics appears to strictly
> address use of the non-volatile component, and not the volatile
> component. If this is not the case, then these semantics would appear
> to broaden the stated intention found in the last paragraph on pg.
> 100. Furthermore, this directive applies equally to a user agent cache
> as well as a caching proxy, so language aimed at proxies (e.g., "after
> forwarding it") appears to be overly narrow in scope.

I don't like your rewrite; for example, it drops the requirement to
eliminate the copy in a volatile cache as soon as you've forwarded the
message in a proxy.  Maybe you'd like your medical records in someone's
crash dump, but I don't.

This privacy issue is more slippery than it appears on the surface.  I'm
not about to try to rewrite it at this late date.

Again, caches are caches; if you are a user agent, and build a cache,
you are subject to these requirements.

> 
> 98. Section 14.9.3, pg. 101, 2nd para., has "the max-age directive
> overrides the Expires header".  What is the imperative status of this
> statement? Should this read "MUST override"?

I think it is fine as written.  It is just stating what to do when
both are present and you understand both (use max-age).
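
A minimal sketch of that precedence, assuming the response headers are
available as a plain dict and that a cache understanding both mechanisms
computes a freshness lifetime in seconds. The function name and the dict
layout are hypothetical; clamping a negative Expires interval to zero is
also an assumption of the sketch.

```python
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers):
    """Freshness lifetime in seconds, or None if the response gives none.

    max-age is consulted first, so it overrides Expires when both are present.
    """
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.strip().lower() == "max-age" and value.strip().isdigit():
            return int(value.strip())
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        # An Expires at or before Date means the response is already stale.
        return max(0, int((expires - date).total_seconds()))
    return None
```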

> 
> 99. Section 14.9.3, pg. 101, 5th para. (s-maxage), has "i.e., that the
> shared cache MUST NOT ..." which doesn't state a requirement per se.
> Rewrite to avoid using MUST NOT in an explanatory relative clause
> (particularly one using "i.e.").

I guess you're right.  MUST NOT here should be must not.

> 
> 100. Section 14.9.3, pg. 101, 5th para. (s-maxage), has "The s-maxage
> directive is always ignored by a private cache". Should this read "A
> private cache MUST ignore the s-maxage directive."?
> 

I think this is ok the way it is.

> 101. Section 14.9.3, pg. 102, 7th para., has "only if this does not
> conflict with any MUST-level requirements"; suggest rephrasing as "with
> any conditionally compliant requirements" to avoid using MUST in this
> context which is not stating a requirement per se.
> 

I'll put the MUST in quotes.
			- Jim
Received on Monday, 16 November 1998 11:46:51 UTC