- From: Jo Rabin <jrabin@mtld.mobi>
- Date: Wed, 23 May 2007 19:46:06 +0100
- To: <public-mobileok-checker@w3.org>
Are you sitting comfortably? Then I'll begin. [1]

1. Structure of HTTP Header Field Values

It seems to be left to the creativity of the inventor of the header. There are some patterns.

a. The value is a URI
b. The value is a Date
c. The value is an ETag or set of ETags, separated by commas
d. The value is a space separated list of product/comments
e. As follows, which seems to be derived from MIME, ignoring white space for a moment (based on a similar grammar in HTTP in RDF):

   message-header = field-name ":" [ field-value ]
   field-value    = [ header-element ] *( "," [ header-element ] )
   header-element = element-name [ "=" [ element-value ] ] *( ";" [ param ] )
   param          = param-name [ "=" [ param-value ] ]
   param-value    = (token | quoted-string)

i.e. a field value is an element optionally followed by comma separated elements.

An element is a name, e.g. Cache-Control: no-cache
- optionally followed by a value, e.g. Cache-Control: post-check=0
- optionally followed by any number of ; separated parameters

A parameter is a name
- optionally followed by a value, e.g. Accept: text/plain;q=0.1

2. Normalization Rules

a. In all cases normalize the HTTP Header Field name to lower-case. [That could be Camel-Case to match the style of the spec.]
b. For case 1.e above:
   i. normalize element and parameter values (following =) unless they are quoted
   ii. remove the quotes on quoted strings
   iii. un-escape escaped characters, e.g. \", in quoted strings
c. For URIs canonicalize ???

3. Parsing Rules

p    -- parse according to the pattern in 1.e
px   -- parse as authentication request
py   -- parse as authorization credentials
uri  -- a URI, do not parse
E[*] -- ETag [sequence], treat as a comma separated list, do not parse values
date -- do not parse ?? or parse as an HTTP date and turn into a W3C date ??
ppc  -- parse as a space separated list of HTTP product and HTTP comment values
ps   -- parse as a space separated list

4. HTTP Headers and their Parsing and Normalization Rules

HTTP defined Headers (codes as in 3; n means normalize according to the rules in 2):

Accept: p, n
Accept-Charset: p, n
Accept-Encoding: p, n
Accept-Language: p, n
Accept-Ranges: p, n
Age: p, n
Allow: p
Authorization: py
Cache-Control: p, n
Connection: p, n
Content-Encoding: p, n
Content-Language: p, n
Content-Length: p, n
Content-Location: uri
Content-MD5: -
Content-Range: p, n
Content-Type: p, n
Date: date
ETag: E
Expect: p, n
Expires: date
From: email
Host: p, n
If-Match: E
If-Modified-Since: date
If-None-Match: E*
If-Range: E/date
If-Unmodified-Since: date
Last-Modified: date
Location: uri
Max-Forwards: p
Pragma: p, n
Proxy-Authenticate: px
Proxy-Authorization: py
Range: p, n
Referer: uri
Retry-After: date
Server: ppc
TE: p, n
Trailer: p, n
Transfer-Encoding: p, n
Upgrade: p
User-Agent: ppc
Vary: p, n
Via: ppc
Warning: ps
WWW-Authenticate: px

Other Headers

HTTP in RDF [2] lists the headers found in [RFC 4229]; they are reproduced in the list below. We could define parsing rules for them or we could leave them unparsed and unnormalized. We don't actually use any of their values, I think. In any case, headers that are not recognized should be left unparsed and unnormalized, other than as specified by HTTP for leading and trailing white space.
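For illustration only, here is a minimal Python sketch of pattern 1.e together with normalization rules 2.a and 2.b. The function names (split_outside_quotes, normalize_value, parse_field_value, parse_header) are hypothetical. Two points are assumptions rather than anything stated above: "normalize" for unquoted values is taken to mean lower-casing, and element and parameter names are only trimmed, not case-folded.

```python
import re
from typing import List, Optional, Tuple

Pair = Tuple[str, Optional[str]]                 # (name, value or None)
Element = Tuple[str, Optional[str], List[Pair]]  # (name, value, params)


def split_outside_quotes(text: str, sep: str) -> List[str]:
    """Split text on sep, ignoring separators inside quoted strings."""
    parts: List[str] = []
    current: List[str] = []
    in_quotes = False
    escaped = False
    for ch in text:
        if escaped:
            escaped = False          # character after a backslash is literal
        elif ch == "\\" and in_quotes:
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch == sep and not in_quotes:
            parts.append("".join(current))
            current = []
            continue                 # drop the separator itself
        current.append(ch)
    parts.append("".join(current))
    return parts


def normalize_value(raw: str) -> str:
    """Rule 2.b: unquote and un-escape quoted strings; normalize the rest."""
    raw = raw.strip()
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        # 2.b.ii and 2.b.iii: remove the quotes, then turn \x into x
        return re.sub(r"\\(.)", r"\1", raw[1:-1])
    return raw.lower()               # assumption: normalize = lower-case


def split_pair(item: str) -> Pair:
    """Split name[=value]; the value is None when there is no '='."""
    name, eq, value = item.partition("=")
    return name.strip(), (normalize_value(value) if eq else None)


def parse_field_value(value: str) -> List[Element]:
    """Parse a field value according to the pattern in 1.e."""
    elements: List[Element] = []
    for raw_element in split_outside_quotes(value, ","):
        if not raw_element.strip():
            continue                 # the grammar allows empty elements
        head, *raw_params = split_outside_quotes(raw_element, ";")
        name, element_value = split_pair(head)
        params = [split_pair(p) for p in raw_params if p.strip()]
        elements.append((name, element_value, params))
    return elements


def parse_header(line: str) -> Tuple[str, List[Element]]:
    """Rule 2.a: lower-case the field name, then parse the value."""
    field_name, _, field_value = line.partition(":")
    return field_name.strip().lower(), parse_field_value(field_value)


if __name__ == "__main__":
    print(parse_header("Accept: text/plain;q=0.1"))
    # ('accept', [('text/plain', None, [('q', '0.1')])])
    print(parse_header('Content-Type: text/html; charset="UTF-8"'))
    # ('content-type', [('text/html', None, [('charset', 'UTF-8')])])
```

Splitting on "," and ";" only outside quoted strings is what keeps a quoted value such as "x, y" in one piece; a real implementation would also need the px, py, E, date, ppc and ps cases from 3, which this sketch does not attempt.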
* accept-additions representing an Accept-Additions header (defined in [RFC 2324]),
* accept-features representing an Accept-Features header (defined in [RFC 2295]),
* alternates representing an Alternates header (defined in [RFC 2295]),
* authentication-info representing an Authentication-Info header (defined in [RFC 2617]),
* a-im representing an A-IM header (defined in [RFC 3229]),
* compliance representing a Compliance header (defined in [OPTIONS messages]),
* content-base representing a Content-Base header (defined in [RFC 2068]),
* content-disposition representing a Content-Disposition header (defined in [RFC 2183]),
* content-id representing a Content-ID header (defined in [DRP]),
* content-script-type representing a Content-Script-Type header (defined in [HTML4]),
* content-style-type representing a Content-Style-Type header (defined in [HTML4]),
* content-transfer-encoding representing a Content-Transfer-Encoding header (defined in [ObjectHeaders]),
* content-version representing a Content-Version header (defined in [RFC 2068]),
* cookie representing a Cookie header (defined in [RFC 2965]),
* cookie2 representing a Cookie2 header (defined in [RFC 2965]),
* cost representing a Cost header (defined in [ObjectHeaders]),
* c-ext representing a C-Ext header (defined in [RFC 2774]),
* c-man representing a C-Man header (defined in [RFC 2774]),
* c-opt representing a C-Opt header (defined in [RFC 2774]),
* c-pep representing a C-PEP header (defined in [PEP]),
* c-pep-info representing a C-PEP-Info header (defined in [PEP]),
* dav representing a DAV header (defined in [RFC 2518]),
* default-style representing a Default-Style header (defined in [HTML4]),
* delta-base representing a Delta-Base header (defined in [RFC 3229]),
* depth representing a Depth header (defined in [RFC 2518]),
* derived-from representing a Derived-From header (defined in [RFC 2068]),
* destination representing a Destination header (defined in [RFC 2518]),
* differential-id representing a Differential-ID header (defined in [DRP]),
* digest representing a Digest header (defined in [RFC 3230]),
* ext representing an Ext header (defined in [RFC 2774]),
* getprofile representing a GetProfile header (defined in [Ops-OverHTTP]),
* if representing an If header (defined in [RFC 2518]),
* im representing an IM header (defined in [RFC 3229]),
* label representing a Label header (defined in [RFC 3253]),
* link representing a Link header (defined in [RFC 2068]),
* lock-token representing a Lock-Token header (defined in [RFC 2518]),
* man representing a Man header (defined in [RFC 2774]),
* message-id representing a Message-ID header (defined in [ObjectHeaders]),
* meter representing a Meter header (defined in [RFC 2227]),
* negotiate representing a Negotiate header (defined in [RFC 2295]),
* non-compliance representing a Non-Compliance header (defined in [OPTIONS messages]),
* opt representing an Opt header (defined in [RFC 2774]),
* optional representing an Optional header (defined in [WIRE]),
* ordering-type representing an Ordering-Type header (defined in [RFC 3648]),
* overwrite representing an Overwrite header (defined in [RFC 2518]),
* p3p representing a P3P header (defined in [P3P]),
* pep representing a PEP header (defined in [PEP]),
* pep-info representing a PEP-Info header (defined in [PEP]),
* pics-label representing a PICS-Label header (defined in [PICSLabels]),
* position representing a Position header (defined in [RFC 3648]),
* profileobject representing a ProfileObject header (defined in [Ops-OverHTTP]),
* protocol representing a Protocol header (defined in [PICSLabels]),
* protocol-info representing a Protocol-Info header (defined in [JEPI]),
* protocol-query representing a Protocol-Query header (defined in [JEPI]),
* protocol-request representing a Protocol-Request header (defined in [PICSLabels]),
* proxy-authentication-info representing a Proxy-Authentication-Info header (defined in [RFC 2617]),
* proxy-features representing a Proxy-Features header (defined in [Proxy Notification]),
* proxy-instruction representing a Proxy-Instruction header (defined in [Proxy Notification]),
* public representing a Public header (defined in [RFC 2068]),
* refresh representing a Refresh header (defined in [EDD]),
* resolution-hint representing a Resolution-Hint header (defined in [WIRE]),
* resolver-location representing a Resolver-Location header (defined in [WIRE]),
* safe representing a Safe header (defined in [RFC 2310]),
* security-scheme representing a Security-Scheme header (defined in [RFC 2660]),
* setprofile representing a SetProfile header (defined in [Ops-OverHTTP]),
* set-cookie representing a Set-Cookie header (defined in [RFC 2109]),
* set-cookie2 representing a Set-Cookie2 header (defined in [RFC 2965]),
* soapaction representing a SoapAction header (defined in [SOAP1.1]),
* status-uri representing a Status-URI header (defined in [RFC 2518]),
* subok representing a SubOK header (defined in [DupSup]),
* subst representing a Subst header (defined in [DupSup]),
* surrogate-capability representing a Surrogate-Capability header (defined in [EdgeArch]),
* surrogate-control representing a Surrogate-Control header (defined in [EdgeArch]),
* tcn representing a TCN header (defined in [RFC 2295]),
* timeout representing a Timeout header (defined in [RFC 2518]),
* title representing a Title header (defined in [ObjectHeaders]),
* ua-color representing a UA-Color header (defined in [UA Attributes]),
* ua-media representing a UA-Media header (defined in [UA Attributes]),
* ua-pixels representing a UA-Pixels header (defined in [UA Attributes]),
* ua-resolution representing a UA-Resolution header (defined in [UA Attributes]),
* ua-windowpixels representing a UA-Windowpixels header (defined in [UA Attributes]),
* uri representing a URI header (defined in [RFC 2068]),
* variant-vary representing a Variant-Vary header (defined in [RFC 2295]),
* version representing a Version header (defined in [ObjectHeaders]), and
* want-digest representing a Want-Digest header (defined in [RFC 3230]).

Jo

[1] http://www.turnipnet.com/radio/lwm.wav
[2] http://www.w3.org/TR/HTTP-in-RDF/
Received on Wednesday, 23 May 2007 18:46:28 UTC