- From: Jo Rabin <jrabin@mtld.mobi>
- Date: Wed, 23 May 2007 19:46:06 +0100
- To: <public-mobileok-checker@w3.org>
Are you sitting comfortably? Then I'll begin. [1]
1. Structure of HTTP Header Field Values
It seems to be left to the creativity of the inventor of the header.
There are some patterns.
a. The value is a URI
b. The value is a Date
c. The value is an ETag or set of ETags, separated by commas
d. The value is a space separated list of product/comments
e. The value follows the grammar below, which seems to be derived from
MIME (ignoring white space for a moment, and based on a similar grammar
in HTTP in RDF):
message-header = field-name ":" [ field-value ]
field-value = [ header-element ] *( "," [ header-element ] )
header-element = element-name [ "=" [ element-value ] ] *( ";" [ param ] )
param = param-name [ "=" [ param-value ] ]
param-value = (token | quoted-string)
i.e.
A field value is
an element optionally followed by comma separated elements.
An element is:
a name e.g. Cache-Control: no-cache
- optionally followed by a value
e.g. Cache-Control: post-check=0
- optionally followed by any number of ; separated parameters
A parameter is
A name
- optionally followed by a value
e.g. Accept: text/plain;q=0.1
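The pattern in 1.e. can be sketched in Python roughly as follows. This is only an illustration, not the checker's code: the function names and the dict layout of the result are made up for the example, and white space handling is simplified to stripping around each token.

```python
def split_outside_quotes(text, sep):
    """Split text on sep, ignoring separators inside double-quoted strings."""
    parts, buf, in_quotes, escaped = [], [], False, False
    for ch in text:
        if escaped:
            buf.append(ch)          # character following a backslash
            escaped = False
        elif ch == "\\" and in_quotes:
            buf.append(ch)
            escaped = True
        elif ch == '"':
            buf.append(ch)
            in_quotes = not in_quotes
        elif ch == sep and not in_quotes:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    return parts


def split_pair(text):
    """Split 'name=value' into (name, value); value is None without '='."""
    name, eq, value = text.partition("=")
    return name.strip(), (value.strip() if eq else None)


def parse_field_value(field_value):
    """Return one dict per header-element: name, value, params."""
    elements = []
    for raw in split_outside_quotes(field_value, ","):
        if not raw.strip():
            continue  # the grammar allows empty elements between commas
        head, *raw_params = split_outside_quotes(raw, ";")
        name, value = split_pair(head)
        params = [split_pair(p) for p in raw_params if p.strip()]
        elements.append({"name": name, "value": value, "params": params})
    return elements
```

For the Accept example above, parse_field_value("text/plain;q=0.1") gives a single element named "text/plain" with no value and one parameter ("q", "0.1"). Quoted strings are kept with their quotes at this stage; removing them is left to normalization.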
2. Normalization Rules:
a. In all cases normalize the HTTP Header Field name to lower-case.
[That could be Camel-Case to match the style of the spec]
b. for case 1 e. above
i. normalize element and parameter values (following =) unless they are
quoted
ii. remove the quotes on quoted strings.
iii. un-escape escaped characters e.g. \" in quoted strings
c. for URIs, canonicalize ???
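A minimal Python sketch of rules a. and b. might look like this. It assumes that "normalize" for unquoted values means lower-casing, which the rules above do not actually say, and the function names are invented for the example. URI canonicalization (rule c.) is left out because it is still an open question.

```python
import re


def normalize_field_name(name):
    """Rule a: header field names are compared in lower case."""
    return name.strip().lower()


def normalize_value(value):
    """Rule b for a single element or parameter value."""
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        # Rules ii and iii: drop the surrounding quotes, then turn each
        # backslash-escaped character into the character itself.
        return re.sub(r"\\(.)", r"\1", value[1:-1])
    # Rule i, ASSUMPTION: unquoted values are normalized to lower case.
    return value.lower()
```

Quoted values keep their case, so normalize_value('"Mixed Case"') returns Mixed Case unchanged while an unquoted UTF-8 becomes utf-8.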
3. Parsing Rules
p -- parse according to the pattern in 1.e.
px -- parse as authentication request
py -- parse as authorization credentials
uri -- a URI; do not parse
E[*] -- ETag [sequence], treat as a comma separated list, do not parse
values
date -- do not parse?? Or parse as an HTTP date and turn into a W3C date??
ppc -- parse as a space separated list of HTTP product and HTTP comment
values
ps -- parse as a space separated list.
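If the "date" rule does end up converting, one possible sketch in Python uses the standard library's email.utils parser. The function name is made up for the example, the output uses the UTC "Z" form of the W3C date format, and treating a date without a zone as GMT is an assumption based on HTTP dates always being in GMT.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def http_date_to_w3c(http_date):
    """Convert an HTTP date string to a W3C date-time string in UTC."""
    dt = parsedate_to_datetime(http_date)
    if dt.tzinfo is None:
        # ASSUMPTION: HTTP dates without an explicit zone are GMT.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

This is only expected to be reliable for the preferred RFC 1123 form, e.g. "Wed, 23 May 2007 18:46:06 GMT"; the obsolete RFC 850 and asctime forms would need checking separately.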
4. HTTP Headers and their Parsing and normalization rules
HTTP defined Headers (n -- normalize according to the rules in 2.):
Accept: p,n
Accept-Charset: p,n
Accept-Encoding: p,n
Accept-Language: p,n
Accept-Ranges: p,n
Age: p,n
Allow: p
Authorization: py
Cache-Control: p,n
Connection: p,n
Content-Encoding: p,n
Content-Language: p,n
Content-Length: p,n
Content-Location: uri
Content-MD5: -
Content-Range: p,n
Content-Type: p, n
Date: date
ETag: E
Expect: p, n
Expires: date
From: email
Host: p, n
If-Match: E
If-Modified-Since: date
If-None-Match: E*
If-Range: E/date
If-Unmodified-Since: date
Last-Modified: date
Location: uri
Max-Forwards: p
Pragma: p, n
Proxy-Authenticate: px
Proxy-Authorization: py
Range: p, n
Referer: uri
Retry-After: date
Server: ppc
TE: p, n
Trailer: p, n
Transfer-Encoding: p, n
Upgrade: p
User-Agent: ppc
Vary: p, n
Via: ppc
Warning: ps
WWW-Authenticate: px
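One way to hold this table in code is a simple lookup keyed on the lower-cased header name. The Python sketch below copies only a handful of rows from the table above; the names HEADER_RULES and rules_for are hypothetical, and returning an empty tuple for anything not listed is one way to express "leave unparsed and unnormalized".

```python
# Rule codes for a few rows of the table; keys are lower-cased field names.
HEADER_RULES = {
    "accept": ("p", "n"),
    "cache-control": ("p", "n"),
    "content-location": ("uri",),
    "date": ("date",),
    "etag": ("E",),
    "if-none-match": ("E*",),
    "server": ("ppc",),
    "www-authenticate": ("px",),
}


def rules_for(field_name):
    """Return the rule codes for a header, or () if it is not recognized."""
    return HEADER_RULES.get(field_name.strip().lower(), ())
```

A caller would then dispatch on the codes, e.g. rules_for("Cache-Control") yields ("p", "n"), meaning parse by the 1.e. pattern and then normalize.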
Other Headers
HTTP in RDF [2] lists headers found in [RFC 4229] as follows. We could
define parsing rules for them or we could leave them unparsed and
unnormalized. We don't actually use any of their values, I think. In
any case, headers that are not recognized should be left unparsed and
unnormalized other than as specified by HTTP for leading and trailing
white space.
* accept-additions representing an Accept-Additions header (defined
in [RFC 2324]),
* accept-features representing an Accept-Features header (defined in
[RFC 2295]),
* alternates representing an Alternates header (defined in [RFC
2295]),
* authentication-info representing an Authentication-Info header
(defined in [RFC 2617]),
* a-im representing an A-IM header (defined in [RFC 3229]),
* compliance representing a Compliance header (defined in [OPTIONS
messages]),
* content-base representing a Content-Base header (defined in [RFC
2068]),
* content-disposition representing a Content-Disposition header
(defined in [RFC 2183]),
* content-id representing a Content-ID header (defined in [DRP]),
* content-script-type representing a Content-Script-Type header
(defined in [HTML4]),
* content-style-type representing a Content-Style-Type header
(defined in [HTML4]),
* content-transfer-encoding representing a Content-Transfer-Encoding
header (defined in [ObjectHeaders]),
* content-version representing a Content-Version header (defined in
[RFC 2068]),
* cookie representing a Cookie header (defined in [RFC 2965]),
* cookie2 representing a Cookie2 header (defined in [RFC 2965]),
* cost representing a Cost header (defined in [ObjectHeaders]),
* c-ext representing a C-Ext header (defined in [RFC 2774]),
* c-man representing a C-Man header (defined in [RFC 2774]),
* c-opt representing a C-Opt header (defined in [RFC 2774]),
* c-pep representing a C-PEP header (defined in [PEP]),
* c-pep-info representing a C-PEP-Info header (defined in [PEP]),
* dav representing a DAV header (defined in [RFC 2518]),
* default-style representing a Default-Style header (defined in
[HTML4]),
* delta-base representing a Delta-Base header (defined in [RFC
3229]),
* depth representing a Depth header (defined in [RFC 2518]),
* derived-from representing a Derived-From header (defined in [RFC
2068]),
* destination representing a Destination header (defined in [RFC
2518]),
* differential-id representing a Differential-ID header (defined in
[DRP]),
* digest representing a Digest header (defined in [RFC 3230]),
* ext representing an Ext header (defined in [RFC 2774]),
* getprofile representing a GetProfile header (defined in
[Ops-OverHTTP]),
* if representing an If header (defined in [RFC 2518]),
* im representing an IM header (defined in [RFC 3229]),
* label representing a Label header (defined in [RFC 3253]),
* link representing a Link header (defined in [RFC 2068]),
* lock-token representing a Lock-Token header (defined in [RFC
2518]),
* man representing a Man header (defined in [RFC 2774]),
* message-id representing a Message-ID header (defined in
[ObjectHeaders]),
* meter representing a Meter header (defined in [RFC 2227]),
* negotiate representing a Negotiate header (defined in [RFC
2295]),
* non-compliance representing a Non-Compliance header (defined in
[OPTIONS messages]),
* opt representing an Opt header (defined in [RFC 2774]),
* optional representing an Optional header (defined in [WIRE]),
* ordering-type representing an Ordering-Type header (defined in
[RFC 3648]),
* overwrite representing an Overwrite header (defined in [RFC
2518]),
* p3p representing a P3P header (defined in [P3P]),
* pep representing a PEP header (defined in [PEP]),
* pep-info representing a PEP-Info header (defined in [PEP]),
* pics-label representing a PICS-Label header (defined in
[PICSLabels]),
* position representing a Position header (defined in [RFC 3648]),
* profileobject representing a ProfileObject header (defined in
[Ops-OverHTTP]),
* protocol representing a Protocol header (defined in [PICSLabels]),
* protocol-info representing a Protocol-Info header (defined in
[JEPI]),
* protocol-query representing a Protocol-Query header (defined in
[JEPI]),
* protocol-request representing a Protocol-Request header (defined
in [PICSLabels]),
* proxy-authentication-info representing a Proxy-Authentication-Info
header (defined in [RFC 2617]),
* proxy-features representing a Proxy-Features header (defined in
[Proxy Notification]),
* proxy-instruction representing a Proxy-Instruction header (defined
in [Proxy Notification]),
* public representing a Public header (defined in [RFC 2068]),
* refresh representing a Refresh header (defined in [EDD]),
* resolution-hint representing a Resolution-Hint header (defined in
[WIRE]),
* resolver-location representing a Resolver-Location header (defined
in [WIRE]),
* safe representing a Safe header (defined in [RFC 2310]),
* security-scheme representing a Security-Scheme header (defined in
[RFC 2660]),
* setprofile representing a SetProfile header (defined in
[Ops-OverHTTP]),
* set-cookie representing a Set-Cookie header (defined in [RFC
2109]),
* set-cookie2 representing a Set-Cookie2 header (defined in [RFC
2965]),
* soapaction representing a SoapAction header (defined in
[SOAP1.1]),
* status-uri representing a Status-URI header (defined in [RFC
2518]),
* subok representing a SubOK header (defined in [DupSup]),
* subst representing a Subst header (defined in [DupSup]),
* surrogate-capability representing a Surrogate-Capability header
(defined in [EdgeArch]),
* surrogate-control representing a Surrogate-Control header (defined
in [EdgeArch]),
* tcn representing a TCN header (defined in [RFC 2295]),
* timeout representing a Timeout header (defined in [RFC 2518]),
* title representing a Title header (defined in [ObjectHeaders]),
* ua-color representing a UA-Color header (defined in [UA
Attributes]),
* ua-media representing a UA-Media header (defined in [UA
Attributes]),
* ua-pixels representing a UA-Pixels header (defined in [UA
Attributes]),
* ua-resolution representing a UA-Resolution header (defined in [UA
Attributes]),
* ua-windowpixels representing a UA-Windowpixels header (defined in
[UA Attributes]),
* uri representing a URI header (defined in [RFC 2068]),
* variant-vary representing a Variant-Vary header (defined in [RFC
2295]),
* version representing a Version header (defined in
[ObjectHeaders]), and
* want-digest representing a Want-Digest header (defined in [RFC
3230]).
Jo
[1] http://www.turnipnet.com/radio/lwm.wav
[2] http://www.w3.org/TR/HTTP-in-RDF/
Received on Wednesday, 23 May 2007 18:46:28 UTC