- From: Adrian Custer <avc.httpbis@gmail.com>
- Date: Wed, 05 Sep 2012 13:13:20 -0400
- To: HTTPbis <ietf-http-wg@w3.org>
Hello everyone,
This mail contains some initial editorial comments, questions, and
recommendations for the httpbis draft
(http://tools.ietf.org/wg/httpbis/), version 20.
This work arose from a desire to develop a set of conformance tests for
our web services establishing them as HTTP/1.1 conformant 'origin
servers.' The analysis was derived from ignoring the bulk of the
standard and focusing only on the formal injunctions, the sentences with
the words 'MUST' or 'SHOULD,' with a focus on those affecting origin
servers.
While I had hoped to go through the whole document, I have stopped at
section 3.3 of part 1. The review of the remaining injunctions will
proceed in much the same way as for those presented here, asking similar
questions of each one: is the target clearly identified, is the target
one in the list given in section 2.6, is the injunction clear, if the
injunction uses "SHOULD" is the conditionality of when an implementation
can avoid the injunction clear, do we understand the consequences of
violating the injunction, and so on. So the authors and editors, if they
find the review worthwhile, should be able to make their own progress on
the rest of the document.
In reading your draft, two central questions arose that might
fundamentally alter this analysis depending on your response.
First, are 'targets' intended to be the only entities enjoined by the
requirements of your standard?
These are introduced, in section 2.6 in part 1 and in section 1.1 of
part 2, apparently to define the entities for which rules will be made.
However, the rules have not all been written to explicitly constrain
instances of those targets. The formal adoption of a constrained set of
elements would be very useful for readers of your standard. First,
however, the list will probably need to be updated after formal review of
all the injunctions and, second, the rules will need to be rewritten to
focus explicitly on these targets only.
Second, are the requirements (aka 'injunctions' or 'rules'), that is the
sentences with the capitalized 'MUST' and 'SHOULD,' intended to be the
only source of normative text or are they merely intended to complement
and stress certain aspects of the text?
The role of your requirements should be clear to you as authors and to
us as readers. For instance, I have yet to find any requirement which
demands that 'recipients of inbound messages' respond to each message;
the protocol is described as a request/response protocol (vis. sec 2.1,
part 1) but this pattern is never mandated in a formal requirement from
what I can tell. From a systematic point of view, this seems to be the
most fundamental requirement so its absence as a formal injunction
suggests your standard treats injunctions merely as stressing some parts
of the protocol. (This is proposed as an example; any actual requirement
would surely have to be qualified at least for pipelined requests since
an error early in the pipelined stream, say a GET request declaring a
body of a certain length without actually having any body, might prevent
a recipient from being able to distinguish each incoming request as a
separate request.)
The rest of this analysis proceeds assuming that you intend 'targets' to
be the only entities constrained and that you intend the sentences with
'MUST' and 'SHOULD' to be the only normative text. The Open Geospatial
Consortium (OGC) (http://www.opengeospatial.org) recently adopted
similar rules for its own work on specification documents along with two
other principles: requirements should be visually isolated from the rest
of the text and requirements should be testable and therefore
accompanied by a formally written set of tests. The former could be
adopted by HTTPbis with little effort. The latter principle, however,
despite making conformance testing much simpler, would have a major
impact on your work and is probably better left for the future.
EDITORIAL ISSUES:
----------------
* The requirements should adopt the boring but clear structure:
$target MUST action
or
$target SHOULD, $condition, action
in the active form to avoid making injunctions of things that are not
targets and to avoid using the passive voice and thereby failing to
define any target at all.
* Could the lower case 'must' in the 'Copyright section' be changed to
some other phrasing so as to keep that word for formal injunctions?
* Is section 2, part 1 merely a description of the architecture of the
HTTP protocol or the beginning of the formal requirements?
Section 2 is primarily descriptive, introducing the terminology of the
standard and the targets of the requirements. However, section 2 also
introduces a few requirements which are not the overarching, fundamental
requirements of the standard (each request gets a response, all messages
must follow the syntax, inbound messages must be requests and outbound
messages must be responses) but are highly specific issues. It would be
best to move the specific issues elsewhere in the document. It would
also be best to ensure the 'targets' are defined before any requirement
is made of those targets, i.e. adopt for part 1 an introduction section
like that of part 2.
* Should section 3, part 1 define only the syntax rules?
Currently section 3 mixes the definition of the syntax rules for the
<HTTP-message> elements and its sub-elements along with requirements for
handling communication messages that match or do not match the syntax.
It would seem cleaner to separate out these two discussions.
* Use of angle brackets around syntax elements
The current text does not use angle brackets to set protocol elements
apart from natural-language ideas in the prose, but it would be useful
to distinguish the two. For example,
'HTTP message' should refer to the byte stream exchanged over a
connection and '<HTTP-message>' to the syntax definition to which the
byte stream must conform. Making the distinction explicit would clarify
section 3 of part 1.
REVIEW OF EXISTING REQUIREMENTS:
-------------------------------
Part I
2.4
However, an HTTP-to-HTTP gateway that wishes to interoperate with
third-party HTTP servers MUST conform to HTTP user agent requirements
on the gateway's inbound connection and MUST implement the Connection
(Section 6.1) and Via (Section 6.2) header fields for both
connections.
This requirement does not apply to 'origin servers' but is discussed
because it seems out of place. Does this need to be a requirement? If
so, could it at least be moved to some other section?
If an HTTP-to-HTTP gateway is *defined* as acting as a user-agent
inbound and acting as an origin-server outbound then the first part of
this requirement seems superfluous. Furthermore, I do not see the
benefit of the dependent clause "that wishes to ..."; does it mean that
some HTTP-to-HTTP gateways do not need to be user-agents on the inbound
connection?
The second part raises lots of questions. What does 'implement the ...
header fields' imply? As I read sections 6.1 and 6.2, those rules apply
to all "proxy or gateway" instances making this second part superfluous,
so that it should be changed to a descriptive statement; if this second
part is actually needed, then we need some explanation for when a
gateway must implement those rules and when it does not.
I suspect both parts of this statement should be changed from
requirements to descriptions, something like:
"HTTP-to-HTTP gateways, beyond conforming to all the requirements of
gateways, also necessarily conform to all the requirements applicable to
HTTP user agents on the inbound connections and to all requirements
applicable to HTTP origin servers on the outbound connection and
HTTP-to-HTTP gateways are obliged, by sections 6.1 and 6.2, to ..."
Hence,
servers MUST NOT assume that two requests on the same connection are
from the same user agent unless the connection is secured and
specific to that agent.
This requirement is also tackled because it seems out of place. Does
this need to be a requirement? If it does, could it at least be moved
elsewhere?
While making this assumption would indeed be wrong, I am not sure this
makes sense as a requirement---it seems more a global statement of fact:
since HTTP is stateless, a server cannot really ever assume that
subsequent requests are coming from the same 'user-agent.' Even with
HTTPS connections, while the proximate 'user-agent' might be the same,
that might merely be a gateway aggregating more distant, ultimate
'user-agents' so even a pipelined set of requests over an HTTPS
connection might be requests from totally different users. (I return to
this below discussing caching of 'private' https responses.) If this is
to remain a requirement, perhaps it should be turned around and
constructed as a statement of when a server *can* assume that two
requests on a given connection are from the same user agent.
From the context of the surrounding discussion, it appears this
requirement is actually a recommendation for writers of other standards
to not make incorrect assumptions; in that case, the language of the
statement should be changed to reflect that, rather than stand as a formal
requirement of HTTP/1.1.
2.6
In addition to the prose requirements placed upon them, senders MUST
NOT generate protocol elements that do not match the grammar defined
by the ABNF rules for those protocol elements that are applicable to
the sender's role.
This is great as one of the initial, core requirements and, therefore,
should not be buried here nor repeated in other parts (e.g. sec 1.1,
part 2) but be a highly visible requirement of the standard. However, it
could also be written more clearly.
The first clause is superfluous since this is merely one more 'prose
requirement,' is it not? Also, 'prose requirement' seems to be trying to
distinguish from other kinds of requirements but my understanding is
that the only requirements are sentences with "MUST" or "SHOULD".
"...MUST only generate ...that match..." would avoid the double negative.
What is a 'protocol element,' is it a 'message' sent over the
communication channel, or an 'element of the ABNF syntax,' or something
else entirely? It seems worth formally defining somewhere. It is first
used in the third paragraph of this section.
Would it not be clearer to spell this out and say something like:
"All senders of messages in the HTTP protocol MUST only send messages
which conform with the syntax of <HTTP-message>, with the <start-line>
either <request-line> for client generated messages or <status-line> for
server generated messages"
using the angle brackets for formal syntax elements? (Instead of 'client
generated' the text might read 'inbound' and similarly for outbound
messages.)
If a received protocol element is processed, the
recipient MUST be able to parse any value that would match the ABNF
rules for that protocol element, excluding only those rules not
applicable to the recipient's role.
The implications of this requirement are unclear. What does 'is
processed' imply? Are all messages sent to an origin server 'processed'
or only some? What does 'able to parse' really imply?
This requirement might be rephrased to state that recipients are
expected to be able to break down the message into all the elements
defined in the ABNF syntax so that other requirements can be made
relative to elements of the syntax. Perhaps this could be:
"All recipients of messages in the HTTP Protocol MUST be able to parse
an HTTP message which matches the syntax of the <HTTP-message> element
into the constituent sub-elements defined by the HTTP syntax."
However, perhaps this is also discussing parsing the header values into
their parts, not just the header field into its header name and header
value; if so, that should be explained and 'protocol element' should not
be reused both in the sense of the message exchanged and in the sense of
a header value.
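
To make 'parse into the constituent sub-elements' concrete, here is a
minimal sketch in Python of what I have in mind. The names
(ParsedMessage, parse_http_message) are my own and not from the draft,
and the sketch assumes a syntactically valid message with CRLF line
endings and no line folding; it is an illustration, not a proposal for
normative text.

```python
from typing import List, NamedTuple, Tuple


class ParsedMessage(NamedTuple):
    start_line: str
    header_fields: List[Tuple[str, str]]  # (field-name, field-value) pairs
    body: bytes


def parse_http_message(raw: bytes) -> ParsedMessage:
    """Split a valid message into start-line, header fields and body."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line terminating the header section")
    # latin-1 maps every octet to one code point, so nothing is lost here.
    lines = head.decode("latin-1").split("\r\n")
    start_line, field_lines = lines[0], lines[1:]
    fields = []
    for line in field_lines:
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError("header field without a colon: %r" % line)
        # Leading and trailing optional whitespace is not part of the value.
        fields.append((name, value.strip(" \t")))
    return ParsedMessage(start_line, fields, body)
```

Whether a recipient must also parse each field value into its own parts
is exactly the open question above; this sketch stops at the name/value
split.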
2.7
When an implementation receives an
unrecognized header field, the recipient MUST ignore that header
field for local processing regardless of the message's HTTP version.
Are any header fields required to be 'recognized'? If so, where is this
requirement stated? Are all implementations expected to 'not ignore' all
of the headers defined in the HTTPbis spec?
Note that 'implementation' is not in the list of targets of section 2.6:
either it should be added to the list or the text modified.
An HTTP server SHOULD send a response version equal to the highest
version to which the server is conformant and whose major version is
less than or equal to the one received in the request.
This would be better phrased as sending a 'response message whose
<http-version> is ...'
Under what circumstances is this not expected of HTTP/1.1 servers? The
SHOULD allows for exceptions so it is useful to state the conditions
where violating this rule is allowed.
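
As I read it, the rule amounts to the selection below. This is only a
sketch in Python under my own assumptions: versions are modelled as
(major, minor) tuples, and the function name and the 'supported' list
are hypothetical.

```python
from typing import Iterable, Tuple

Version = Tuple[int, int]  # (major, minor)


def choose_response_version(supported: Iterable[Version],
                            request_version: Version) -> Version:
    """Highest supported version whose major number is <= the request's."""
    candidates = [v for v in supported if v[0] <= request_version[0]]
    if not candidates:
        raise ValueError("no conformant version with a low enough major number")
    # Tuples compare lexicographically: major first, then minor.
    return max(candidates)
```

Note that under this reading an HTTP/1.1 server answers an HTTP/1.0
request with an HTTP/1.1 <http-version>, since only the major version of
the request constrains the choice.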
An HTTP
server MUST NOT send a version to which it is not conformant.
Again the phrasing could be clearer since servers send *messages* not
"versions."
The 'to which it is not conformant' might be 'to which it does not claim
conformance' given that conformance is never actually determined by
anyone; this requirement really is doing something else--allowing
clients that get a message with a particular version to assume that the
server is conformant with that version.
An HTTP server MAY send an HTTP/1.0 response to an HTTP/1.0 request
if it is known or suspected that the client incorrectly implements
the HTTP specification and is incapable of correctly processing later
version responses, such as when a client fails to parse the version
number correctly or when an intermediary is known to blindly forward
the HTTP-version even when it doesn't conform to the given minor
version of the protocol. Such protocol downgrades
This is probably supposed to say:
"An HTTP server MAY send an HTTP/1.0 response to an *HTTP/1.1*
request ..."
right? The server would be expected, not merely permitted, to answer a
1.0 request with a 1.0 response.
2.8
The host MUST NOT be empty; if an "http" URI is
received with an empty host, then it MUST be rejected as invalid.
This requirement does not target any of the 'participant{s} in HTTP
communication' of section 2.6 so this should not be a requirement
phrased this way. Instead, it should be clearly stated as an additional
syntax rule.
A more formal approach would define an <http-authority> syntax element,
redefined from the <authority> element in RFC 3986 but including the
element <non-empty-reg-name> instead of <reg-name>, since the <reg-name>
is the only element of the current syntax allowed to be empty.
What does "receiving" an http URI mean? The URI could be in a
<request-target> or in a header or in the body. Also, if 'an "http" URI'
is the thing that matches the syntax of the <http-URI> element, then it
will never have an empty <host> element.
How does the 'reject as invalid' happen? Is this merely repeating that
there is a 'non-empty' syntax rule or are we actually talking about HTTP
requests and 4xx level responses? It is unclear if we are talking about
determining a violation of the syntax or about responding with an HTTP
response.
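
If the intent is only a syntax check, it is small. The sketch below, in
Python, uses the standard urllib.parse module; the function name is
hypothetical and the sketch says nothing about what response, if any,
should follow.

```python
from urllib.parse import urlsplit


def has_empty_host(uri: str) -> bool:
    """True when an "http" URI carries an empty host in its authority."""
    parts = urlsplit(uri)
    if parts.scheme != "http":
        raise ValueError("not an http URI")
    # hostname is None when the authority is missing or its host is empty.
    return not parts.hostname
```

With this, "http:///index.html" and "http://:80/" are flagged while
"http://example.org/" is not.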
Senders MUST NOT
include a userinfo subcomponent (and its "@" delimiter) when
transmitting an "http" URI in a message.
Again we seem to be tripping up simply because we are trying to reuse
the syntax elements of RFC 3986. Why not simply define the <http-URI> a
little deeper in syntax and avoid the <authority> of RFC 3986 altogether?
Recipients of HTTP messages
that contain a URI reference SHOULD parse for the existence of
userinfo and treat its presence as an error, likely indicating that
the deprecated subcomponent is being used to obscure the authority
for the sake of phishing attacks.
What does 'treat its presence as an error' imply? Are 'origin-servers'
expected to respond with a 4xx level message or can we 'recover' from
the erroneous URI and handle the message anyhow? And 'contain a URI
reference' seems a bit broad since it could apply to a message body as
well: are we not concerned only with the <request-target> and the
message headers?
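
The 'parse for the existence of userinfo' part, at least, is mechanical.
A minimal sketch in Python (the function name is my own; what 'treat as
an error' then means is still the open question):

```python
from urllib.parse import urlsplit


def has_userinfo(uri: str) -> bool:
    """True when the authority component contains the "@" delimiter."""
    # Only the authority (netloc) is inspected, so an "@" in the path
    # or query does not count.
    return "@" in urlsplit(uri).netloc
```

For example, "http://www.bank.example@evil.example/" is flagged while
"http://example.org/a@b" is not.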
the TCP connection MUST be secured for privacy through the use of
strong encryption prior to sending the first HTTP request.
This is out of place and badly phrased.
The section is apparently defining a particular form of Uniform Resource
Identifier (the title of section 8.2) relative to an https scheme but
the requirement is talking about message exchange rules. It should
really be in a separate paragraph clearly discussing *usage* not *syntax*.
The passive voice hides the actual target of the injunction; the rule is
really an injunction for the sender of an HTTP request to ensure the
connection is in a particular state prior to transmission. The
requirement should be rewritten to state exactly which target must make
what happen or not happen (and ideally state what response comes when
the requirement is not followed).
Unlike the "http" scheme, responses to "https" identified requests
are never "public" and thus MUST NOT be reused for shared caching.
They can, however, be reused in a private cache...
This requirement reveals my own lack of understanding so I am not sure
how to address it.
I presume the 'public' and 'private' refer to the particular session.
However, does this really work? Does this mean that HTTP is stateless
except for the duration of a TCP connection? I am not sure this is
possible but it seems, with the proper certificates, that two clients
could be connected via HTTP/TLS to a proxy while that proxy was
connected by HTTP/TLS to the origin-server. If the server is really
allowed to cache, does this not mean that the wrong client might get the
cached response from the server unless the proxy makes new connections
to the server for each client? There seem to be major security
implications of this requirement that ought to be spelled out for
everyone. Also, this requirement seems linked to the rule from section
2.4 above,
servers MUST NOT assume that two requests on the same connection are
from the same user agent unless ...
but the dependent clause, that the connection is secured, does not
determine the ultimate user agent of the communication when an
intermediary is involved.
I am surely misunderstanding the security design here but that suggests
that this 'public' and 'private' caching system needs more explanation.
The passive voice once again hides the 'target' of the injunction.
3
Recipients MUST parse an HTTP message as a sequence of octets in an
encoding that is a superset of US-ASCII [USASCII].
This requirement seems misplaced and is insufficiently specified.
Section 3 claims to be defining the formal syntax of a valid 'HTTP
message' therefore these paragraphs on the parsing of binary messages
seem out of place. Received messages may well be invalid, so processing
covers more than just the handling of valid HTTP messages.
(Note that we have an issue distinguishing what is meant by an "HTTP
message," is it a stream of integers which conforms to <HTTP-message>
syntax element or the binary stream communicated over the connection,
which might be total garbage?) These processing requirements should be
split out into a section separate from the syntax rules and expanded.
A discussion about processing needs to describe the transition between
the binary stream sent over inter-machine connections and the
sub-elements from the <HTTP-message> syntax, passing through the
sequence of integers which are the terminal values of the ABNF system
defined by RFC 5234.
This discussion should probably start by requiring that all generated
communication messages be encoded correctly since this is relatively
simple and sets up the processing rules. This requirement would
constrain all senders to only generate streams created by taking a valid
<HTTP-message> and encoding that with an encoding in which the octet
%x00 is never used and all octets in the range from %x01 to %x7F always
represent the integers from 1 to 127. This spells out explicitly the
'superset of US-ASCII' of the requirement above which is vague: UTF-16
is often thought of as a superset of US-ASCII since it includes all the
same characters at all the same code points plus some more.
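
The UTF-16 point is easy to check. The short Python sketch below (the
helper name is my own) compares the octets produced by a few encodings
for an ASCII-only request line:

```python
def same_octets_as_ascii(text: str, encoding: str) -> bool:
    """True when `encoding` yields the same octets as US-ASCII for `text`."""
    return text.encode(encoding) == text.encode("ascii")


request_line = "GET / HTTP/1.1"
for name in ("utf-8", "latin-1", "utf-16-le"):
    print(name, same_octets_as_ascii(request_line, name))

# UTF-16 interleaves %x00 octets with the ASCII values.
print(b"\x00" in request_line.encode("utf-16-le"))
```

UTF-8 and ISO-8859-1 pass this test for ASCII-only text; UTF-16 does
not, even though it is a superset of US-ASCII at the level of characters
and code points.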
The discussion could then turn to the processing expectations. The
phrasing 'Recipients MUST parse' is unfortunate since User-Agents can do
whatever they want with the server response, they are not required to
parse them at all. However, the rules of the standard will be based on
the ability of recipients to process the messages so this would be
better written as
"Recipients are expected to be able to process all messages received
over the communication connection according to the rules given here."
(However, this should be compared with the requirement discussed above
in section 2.6.)
The rest might be phrased as
"Recipients MUST assume, when parsing a message received over the
communication connection into the elements of an <HTTP-message>, that in
the initial part of the message up to the first occurrence of either the
octet doublet %x0A %x0A, the octet doublet %x0D %x0D or the octet
triplet %x0A %x0D %x0A, the occurrence of any octet from %x00 to %x7F
stands for the equivalent integer value (from 0 to 127) and encodes the
character at that code point in the US-ASCII character set."
(The latter part is redundant with RFC 5234 but harmless.)
Probably, the handling of the higher octet values should be described
here as well. I think there is a requirement elsewhere that these be
treated as opaque binary values.
Finally, the discussion of processing can get to parsing and should
expand on the 'normal procedure' in the current draft. Parsing should
set out looking for the end-of-line sequences up to the blank line.
Having found the end of the headers, we can translate all the octets in
the ASCII range to their ASCII equivalents and then parse the result
into the elements of the ABNF syntax.
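
A rough Python sketch of that procedure follows. It uses the three
terminator sequences from my proposed wording above, not anything taken
from the draft, and the function names are hypothetical. Decoding with
ISO-8859-1 is my own choice: it keeps octets %x00 to %x7F at their ASCII
code points and carries the higher octets through as opaque values.

```python
from typing import Optional, Tuple

# Terminators named in the proposed wording: LF LF, CR CR, and LF CR LF.
TERMINATORS = (b"\n\n", b"\r\r", b"\n\r\n")


def find_header_end(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the earliest terminator, or None if absent."""
    best = None
    for term in TERMINATORS:
        pos = data.find(term)
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, pos + len(term))
    return best


def split_head(data: bytes) -> Tuple[str, bytes]:
    """Return the decoded start-line and header section, plus the rest."""
    span = find_header_end(data)
    if span is None:
        raise ValueError("header section is not terminated")
    head = data[:span[0]].decode("latin-1")
    # With CRLF line endings the LF CR LF match leaves one CR behind.
    if head.endswith("\r"):
        head = head[:-1]
    return head, data[span[1]:]
```

The returned head can then be parsed against the ABNF; the remaining
octets are the message body, if any.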
Probably, the discussion should also explain what is expected of a
recipient getting a message which does not match the syntax, namely that
origin servers send back a message in the 4xx range, gateways and
proxies I am not sure, and user-agents can decide for themselves.
3.1
Implementations MUST NOT send whitespace between the start-line and
the first header field.
To the extent that senders are required to follow the ABNF syntax, this
requirement is redundant.
"Implementations" should be "Senders" since "Implementations" are not in
the list of targets in section 2.6.
Based on the security discussion after the requirement, this requirement
may be trying to eliminate messages of this kind entirely. If so,
perhaps you need to require recipients to reject such messages outright,
rather than attempt to recover.
A server MUST be able to parse any received message that begins with
a request-line and matches the ABNF rule for HTTP-message.
For the reasons explained above, I would separate out this processing
requirement away from the syntax rules. Also 'be able to parse' is
really 'be able to identify the sub-elements of the <HTTP-message>
syntax element,' that is 'able to parse into $something.'
Recipients of an invalid request-line SHOULD respond
with either a 400 (Bad Request) error or a 301 (Moved Permanently)
redirect with the request-target properly encoded.
Recipients SHOULD
NOT attempt to autocorrect and then process the request without a
redirect, since the invalid request-line might be deliberately
crafted to bypass security filters along the request chain.
These rules should be subsumed under a general processing system.
"Recipients" is too broad since user-agents are 'recipients' but clearly
should not 'respond'. This could be clarified with 'Recipients of
inbound messages' except that we probably want to exclude tunnels as well.
It would be better to state that
"Recipients of inbound messages which do not match the syntax of the
<HTTP-message> element SHOULD respond, unless they can correct the
syntax of the <request-target> element, with a 400 (Bad Request)."
This might be extended with
"Recipients of inbound messages with invalid <request-line> elements
which can be corrected by properly encoding the <request-target> MAY
respond with a 301 (Moved Permanently) response with a "Location:"
header containing the properly encoded target reconstructed into a URI
{or a <path>?} but MUST NOT process the request without a redirect since
..."
although this might only apply to origin servers and might apply only if
they can ascertain that the corrected resource really exists.
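
As an illustration of the decision I am describing, here is a Python
sketch. Everything in it is an assumption of mine: the function name,
the idea that stray spaces belong to the <request-target>, and the
choice to always redirect when the target can be re-encoded (the
proposed wording only says MAY).

```python
from urllib.parse import quote

# Characters left untouched when re-encoding: reserved, unreserved and "%".
SAFE = "/?#[]@!$&'()*+,;=:%-._~"


def respond_to_request_line(request_line: str):
    """Return (status, headers) for an invalid line, None for a valid shape."""
    parts = request_line.split(" ")
    if len(parts) == 3:
        return None  # method SP request-target SP HTTP-version
    if len(parts) > 3:
        # Assume the extra spaces sit inside the request-target.
        target = " ".join(parts[1:-1])
        return 301, {"Location": quote(target, safe=SAFE)}
    return 400, {}
```

In no branch does the sketch process the request with a silently
corrected target, which is the point of the second requirement.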
A server that receives a method longer than any that it
implements SHOULD respond with either a 405 (Method Not Allowed), if
it is an origin server, or a 501 (Not Implemented) status code.
The clause "if it is an origin server" seems strange here. Why does it
apply only to the former response? It would seem 'not implemented' would
be a characteristic of the origin server as well.
A
server MUST be prepared to receive URIs of unbounded length and
respond with the 414 (URI Too Long) status code if the received
request-target would be longer than the server wishes to handle (see
Section 4.6.12 of [Part2]).
This requirement talks of URI but I presume it is really discussing the
<request-target> element. Also, why can the server not simply be ready
to receive a <request-target> up to its maximum allowable length, rather
than speaking of 'unbounded length'? It seems this is really a note to
implementors: "watch out <request-targets> may be really long, deal with
them if they are too big by ..." which is not really a requirement.
"Since <request-target> elements can be quite large, servers MAY respond
to a request with a <request-target> element which is too large for the
implementation with an HTTP response message using the 414 (URI Too
Long) status code."
A client MUST be able to parse any received message that begins with
a status-line and matches the ABNF rule for HTTP-message.
This repeats the earlier requirement from section 2.6 unless I
misunderstood that earlier requirement. Since 'parse' just means 'cut
up', this requirement needs to specify into what clients must be able
to parse, i.e. into the HTTP syntax elements defined earlier.
3.2
New HTTP header fields SHOULD be registered with IANA according to
the procedures in Section 3.1 of [Part2].
This is not a requirement of any of the targets laid out in section 2.6
but a different level of 'should' applicable to the community. This
requirement should be rewritten as a separate paragraph discussing
general organization around HTTP. This also does not apply to
experimental header names.
Unrecognized
header fields SHOULD be ignored by other recipients.
If this is a conditional requirement, the condition needs to be spelled
out: when may recipients freak out due to the presence of unrecognized
headers?
A server MUST
wait until the entire header section is received before interpreting
a request message, since later header fields might include
conditionals, authentication credentials, or deliberately misleading
duplicate header fields that would impact request processing.
The justification should be split into a separate sentence. Also, the
language makes it seem like a good idea to wait until misleading
information is received so the text needs to be cleared up.
Multiple header fields with the same field name MUST NOT be sent in a
message unless the entire field value for that header field is
defined as a comma-separated list [i.e., #(values)].
The language should be flipped around to have
"Senders MUST NOT send HTTP messages with multiple header fields that
have the same name unless ..."
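
Stated with the sender as the subject, the rule becomes a check the
sender can run before transmission. A minimal Python sketch, where the
function name and the small table of list-valued fields are hypothetical
stand-ins for whatever the sender actually knows about field
definitions:

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical table: field names whose whole value is a comma-separated list.
LIST_VALUED = {"accept", "cache-control", "via"}


def illegal_duplicates(fields: Iterable[Tuple[str, str]]) -> List[str]:
    """Field names that repeat although they are not list-valued."""
    # Field names are case-insensitive, so compare them in lower case.
    counts = Counter(name.lower() for name, _ in fields)
    return sorted(n for n, c in counts.items()
                  if c > 1 and n not in LIST_VALUED)
```

A sender would refuse to emit the message whenever the returned list is
not empty.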
OWS SHOULD either not be produced or be produced as a
single SP. Multiple OWS octets that occur within field-content
SHOULD either be replaced with a single SP or transformed to all SP
octets (each octet other than SP replaced with SP) before
interpreting the field value or forwarding the message downstream.
These rules are trying to constrain a more general syntax without
actually altering the syntax grammar itself which is a silly way to
proceed. This is probably necessary due to backwards compatibility
issues; if so, that should be made clear.
Since this is a SHOULD, when would it make sense for senders to generate
messages with multiple space OWS?
The 'produced' is not the same language as in section 2.6 which talked
of 'generate' versus 'send'.
The 'multiple OWS octets' probably needs a notion of contiguity. Again,
when would it make sense not to perform this transformation, that is
does this injunction not really mean that the interpretation of the
message must be identical to the interpretation if the substitution were
performed?
Note that these issues also apply to the next paragraphs, on RWS and BWS.
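
The two permitted transformations differ only in whether the length of
the value is preserved. A Python sketch, assuming (as I suggest above)
that 'multiple OWS octets' means a contiguous run of SP and HTAB; the
function names are my own:

```python
import re

_WS_RUN = re.compile(r"[ \t]+")


def collapse_ows(value: str) -> str:
    """Replace each contiguous run of SP/HTAB with a single SP."""
    return _WS_RUN.sub(" ", value)


def ows_to_sp(value: str) -> str:
    """Replace each HTAB with SP, keeping the value's length unchanged."""
    return value.replace("\t", " ")
```

Either way the interpretation of the field value should come out the
same, which is what I suspect the injunction is really after.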
Any received request message that contains whitespace between a
header field-name and colon MUST be rejected with a response code of
400 (Bad Request).
Again the language should be flipped around to identify the target (and
exclude a client getting a request by mistake.)
"Inbound Recipients MUST reject an HTTP request message that contains
... colon with a response code of ..."
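
For the inbound recipient the check itself is simple. A minimal Python
sketch with a hypothetical function name; it also returns 400 for a line
without any colon, which is my own addition since such a line does not
match the header field syntax either:

```python
from typing import Optional


def field_line_status(field_line: str) -> Optional[int]:
    """Return 400 when the field line must be rejected, otherwise None."""
    name, colon, _value = field_line.partition(":")
    if not colon:
        return 400  # no colon at all: not a header field
    if name != name.rstrip(" \t"):
        return 400  # whitespace between field-name and colon
    return None
```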
A field value MAY be preceded by optional whitespace (OWS); a single
SP is preferred.
This is redundant with the syntax rules and therefore confusing: if the
syntax allows it, an implementation MAY produce it. Therefore it seems
this injunction is merely stating a preference for this OWS to be
present and be a single space; that does not need an injunction.
The field value does not include any leading or
trailing white space: OWS occurring before the first non-whitespace
octet of the field value or after the last non-whitespace octet of
the field value is ignored and SHOULD be removed before further
processing (as this does not change the meaning of the header field).
This mixes a rule for syntax with a processing rule: that would be less
likely to happen with a formal approach to injunctions in which the
target were the subject of the sentences. The requirement should be split.
Note that this definition of the field value is much narrower than what is
allowed by the definition of the <field-value> element in the syntax, so
that needs to be clarified. (See my proposal in a separate email.)
HTTP senders MUST NOT produce messages that include
line folding (i.e., that contain any field-value that matches the
obs-fold rule) unless the message is intended for packaging within
the message/http media type.
The "intended for packaging" is a little indirect. Is this really saying
that the folding is fine if the message will be the body of another message?
HTTP recipients SHOULD accept line
folding and replace any embedded obs-fold whitespace with either a
single SP or a matching number of SP octets (to avoid buffer copying)
prior to interpreting the field value or forwarding the message
downstream.
The first part of this 'SHOULD accept line folding' is redundant to the
requirement that all recipients MUST be able to parse messages with
valid syntax.
If the SHOULD applies to the "...replace any ..." as well, then it needs
to specify under what conditions a recipient would be allowed to not
replace? Would it be better to phrase this part to state that the
interpretation must be equivalent to an interpretation obtained if the
OWS were replaced?
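
Both replacement options can be sketched in a few lines of Python. I am
assuming here that a fold is a CRLF followed by one or more SP or HTAB
octets; the function names are my own.

```python
import re

# Assumed shape of a fold: CRLF followed by one or more SP / HTAB octets.
_OBS_FOLD = re.compile(rb"\r\n[ \t]+")


def unfold_single_sp(field_value: bytes) -> bytes:
    """Replace each fold with a single SP."""
    return _OBS_FOLD.sub(b" ", field_value)


def unfold_same_length(field_value: bytes) -> bytes:
    """Replace each fold with as many SP octets as the fold occupied."""
    return _OBS_FOLD.sub(lambda m: b" " * len(m.group(0)), field_value)
```

The second form keeps the buffer length unchanged, which I take to be
the 'avoid buffer copying' remark in the requirement.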
Newly defined header fields SHOULD limit their
field values to US-ASCII octets.
Header fields are not 'targets' of this specification. Indeed this is a
requirement for the procedure of defining new headers rather than for
actors of the HTTP protocol. This should either become a recommendation
to those defining new field values or a constraint on the generators of
messages with new field values.
Recipients SHOULD treat other (obs-
text) octets in field content as opaque data.
This seems strange in that header fields are sent to recipients to
convey meaning so some recipients are going to have to actually use the
octets in some way meaning they will not treat them as opaque but take
action based on their content. This injunction is probably intended to
apply to the original parsing of octets by recipients which do not
recognize the header field.
The use of 'other' means that the injunction does not stand alone as a
statement but requires its context. It would be better if each
injunction were understandable on its own outside of its context.
A server MUST be prepared
to receive request header fields of unbounded length and respond with
a 4xx (Client Error) status code if the received header field(s)
would be longer than the server wishes to handle.
Again, this turn of phrase seems unfortunate: the phrase "prepared to
receive .... of unbounded length" makes it seem that servers need to
handle what cannot be handled. This could be
"A server MUST be prepared to receive a header field longer than the
server can handle, in which case the server MUST respond with ..."
or
"A server MUST respond, when receiving a header field longer than the
server wishes to handle, with an HTTP response using a 4xx level (Client
Error) status code."
Note that since this is not the only injunction mandating a return type,
this injunction probably needs a clause "unless another error response
is also required for the message" and an overarching rule for servers
required by multiple rules to respond with an exception code.
These special
characters MUST be in a quoted string to be used within a parameter
value (as defined in Section 4).
This is another case of a requirement not being made of a 'target'. It
is also another case of a requirement being redundant to the syntax.
This is merely a description of the consequences of the syntax not a new
requirement in its own right.
Recipients that process the value of the quoted-string MUST handle a
quoted-pair as if it were replaced by the octet following the
backslash.
ok
Senders SHOULD NOT escape octets in quoted-strings that do not
require escaping (i.e., other than DQUOTE and the backslash octet).
ok
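
For completeness, these last two rules read naturally as a pair of
functions. A Python sketch with hypothetical names; unquote does not
validate that the input is a well-formed quoted-string beyond checking
the surrounding DQUOTEs.

```python
def unquote(quoted: str) -> str:
    """Recipient side: drop the DQUOTEs and resolve each quoted-pair."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    chars = iter(quoted[1:-1])
    for ch in chars:
        if ch == "\\":
            # A quoted-pair stands for the octet following the backslash.
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)


def quote_string(value: str) -> str:
    """Sender side: escape only DQUOTE and backslash."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'
```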
I am stopping this review here, although this only gets us to Section
3.3 'Message Body' of part 1.
From this review, it seems that all of the injunctions merit a careful
re-reading. Also, if these injunctions are to be the canonical source of
normative text for the standard, then a formal review of the content of
these sentences would be useful. In particular, I imagine there is a
need for some extra, core injunctions to the standard such as:
* all inbound recipients MUST respond to all received HTTP requests with
an HTTP response,
* HTTP responses SHOULD, unless some new use pattern has been
discovered, use error codes in the 1xx range to indicate ..., in the 2xx
range for ...,
and others.
I hope this is not overwhelming. I have been undertaking similar work
for the standards at the OGC and, while it takes a large effort to
complete, it seems worth the while since it makes for much better documents.
sincerely,
~Adrian Custer
Received on Wednesday, 5 September 2012 17:13:56 UTC