- From: David Robinson <drtr1@cam.ac.uk>
- Date: Tue, 2 May 95 15:29 BST
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
- Cc: drtr1@cus.cam.ac.uk
Some comments on the HTTP/1.0 spec from 8th March.

2.1 `Augmented BNF'

The spec describes where whitespace is allowed:

    implied *LWS
        The grammar described by this specification is word-based.
        Except where noted otherwise, zero or more linear whitespace
        (LWS) can be included between any two words (token or
        quoted-string) without changing the interpretation of a field.

Presumably the words have to be adjacent! This rule does not allow LWS
in places where it should, i.e. between a word and a tspecial in some
circumstances. Specifically:

 * after ":" in a header: from section 4.2 `Message Headers'

       HTTP-header = field-name ":" [field-value] CRLF

   does not allow LWS between the ":" and the field value; neither do
   the specifications for individual headers.

 * after ";" in media-type values: e.g. section 5.4.1 `Accept' defines
   the header as

       Accept = "Accept" ":" 1# (
                    media-range
                    [";" "q" "=" ("0" | "1" | float)]
                    [";" "mxb" "=" 1*DIGIT])

       media-range = ( "*/*"
                     | ( type "/" "*" )
                     | ( type "/" subtype )
                     ) * (";" parameter )

   this does not allow LWS between the ";" and the q, mxb or other
   parameters. Similarly in sections 7.1.10 `Link', 7.1.13 `URI',
   8.1 `Media Types'.

2.2 `Basic Rules'

    OCTET = <any 8-bit character>
                       ~~~~~~~~~

    The text rule is only used for descriptive field contents. Words
    of *text may contain characters from character sets other than
    US-ASCII only when encoded according to the rules of RFC 1522 [13].

    text = <any OCTET except CTLs, but including LWS>
                ~~~~~

Is it necessary to allow any OCTET in `text', rather than just any
CHAR? It isn't defined what character set these 8-bit characters come
from. I suggest either changing the OCTET rule to be 'any 8-bit
ISO-Latin-1 character', or avoiding using OCTET anywhere in a header
specification. (I would much prefer the latter.)

4.2 `Message Headers'

Although the specifications for the individual headers use
case-insensitive names, it should be specified in section 4.2 that
field-name is case insensitive. Otherwise an extension-header could
have a case-sensitive name.

5.3 `Request-URI'

    The Request-URI is a Universal Resource Identifier (Section 3.2)
    and identifies the resource upon which to apply the request.

        Request-URI = URI

    Unless the server is being used as a proxy, a partial URI shall be given
                                                ~~~~~~~~~~~~~
    with the assumptions of the scheme (http) and hostname:port (the
    server's address) being obvious. That is, if the full URI looks like

        http://info.cern.ch/hypertext/WWW/TheProject.html

    then the corresponding partial URI in the Simple-Request or
    Full-Request is

        /hypertext/WWW/TheProject.html

I don't think that `partial URI' is either properly or correctly
defined. `A partial URI' is not defined by the HTTP spec, nor is a
reference given. The term `partial URI' is mentioned in RFC 1630, and
sounds synonymous with a `relative URL', which is defined (differently)
in draft-ietf-uri-relative-url-06.txt. Taking the latter definition,

    //info.cern.ch/hypertext/WWW/TheProject.html

and

    hypertext/WWW/TheProject.html

are also relative URLs for the document. Avoiding the first possibility
by restricting `partial URI' to be an `abs_path relative URL' [Roy
Fielding, pers. comm.] helps, but there is then the problem that an
abs_path cannot start with "//". Thus transfer of URIs like

    http://host.name//void

would be impossible. Also, is an empty string allowed for `partial URI'?

I would suggest codifying the current common browser behaviour, which
is:

 1. Remove the scheme-part, the "://" and the hostport part from the
    URI.
 2. If the partial URI does not begin with a "/", then prepend a "/".

Rule 2 is only ever applied if the path component of the URI is void.
[I note that the URI spec (RFC 1630) allows http://host.name?search
whereas the URL spec (RFC 1738) does not.]

As http://host.name and http://host.name/ are supposed to be identical
URIs, it makes sense to specify a single partial URI for both of them.

10. `Access Authentication'

This doesn't state whether auth-scheme is case-insensitive or not. It
has

    auth-scheme = "Basic" | token

Thus the value "Basic" is case insensitive (see section 2.1), but other
values are not.

David Robinson.
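
As an illustration of the two-step Request-URI rule suggested in the 5.3 comments above, here is a minimal sketch in Python. The function name `request_uri` is hypothetical, and the sketch assumes that the hostport part ends at the first "/" or "?" after the "://"; neither detail comes from the spec or from any particular browser.

```python
def request_uri(full_uri: str) -> str:
    """Derive the partial URI for a request line from a full URI.

    Sketch of the two rules proposed in the 5.3 comments; not taken
    from the HTTP spec itself.
    """
    # Rule 1: remove the scheme-part, the "://" and the hostport part.
    _scheme, separator, rest = full_uri.partition("://")
    if not separator:
        raise ValueError(f"not a full URI: {full_uri!r}")

    # Assumption: the hostport ends at the first "/" or "?",
    # or at the end of the string if neither is present.
    cut = len(rest)
    for delimiter in "/?":
        position = rest.find(delimiter)
        if position != -1:
            cut = min(cut, position)
    partial = rest[cut:]

    # Rule 2: if the result does not begin with "/", prepend one.
    # This only triggers when the path component is void.
    if not partial.startswith("/"):
        partial = "/" + partial
    return partial
```

Under these assumptions, http://host.name and http://host.name/ both map to the single partial URI "/", while http://host.name//void keeps its leading "//" and http://host.name?search becomes "/?search".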
Received on Tuesday, 2 May 1995 07:31:22 UTC