- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Sat, 17 Mar 2007 09:49:16 +0100
- To: HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <45FBAB8C.6010809@gmx.de>
Hi,
here are some thoughts for Issue 36, the upgrade of the ABNF syntax (see
</projects/w3c/WWW/Protocols/HTTP/1.1/rfc2616bis>):
1) I think there is agreement that the ABNF syntax needs to be upgraded
to the syntax defined in RFC4234.
2) What's not so clear is what exact format we want to use. Currently,
RFC2616 uses a format based on RFC822 (see
<http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.2.1>)m but
augments it with a special "list" syntax, which makes the notation for
many headers much easier to read (see
<http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.2.1.p.9>). We
need to decide whether we want to keep that extension in RFC2616bis, or
just live with the RFC4234 syntax.
3) Related to that, RFC2616 defines special treatment of Linear White
Space (LWS), see
<http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.2.1.p.11>).
It's up for discussion whether we want to keep that aspect. See
<http://lists.w3.org/Archives/Public/ietf-http-wg/2007JanMar/0116.html>
for some comments from Roy Fielding about that subject.
So how do we proceed?
I think it would be extremely useful to be able to extract & parse the
embedded ABNF productions, generating somewhat normalized output. That
would allow us to ensure that while we are updating the spec, we don't
break the grammar (or actually, that we do not introduce *unintented*
changes).
Extraction is no problem. The XML source
(<http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/draft-lafon-rfc2616bis-latest.xml>)
already has the ABNF fragments marked up accordingly. A tool for
extraction is available in my rfc2629.xslt package
(<http://greenbytes.de/tech/webdav/rfc2629xslt/rfc2629xslt.html#extract-artwork>)
-- I'm attaching the output.
The next step would be to run this through a parser. Here's where we
have the first problems:
- we'd need a parser for the RFC2616 variant of the ABNF syntax (does
anybody have that?), and
- the ABNF itself must parse. Currently there's at least one problem with
qdtext = <any TEXT except <">>
which IMHO needs to be fixed not to contain the literals "<" and ">". I
would like to make that change with the next draft we submit.
Optimally, we would then find a parser that will accept both
rfc2616-style ABNF and RFC4234-style ABNF, and which will re-serialize
into a single RFC4234-based format that we can then use to verify changes.
Best regards, Julian
OCTET = <any 8-bit sequence of data>
CHAR = <any US-ASCII character (octets 0 - 127)>
UPALPHA = <any US-ASCII uppercase letter "A".."Z">
LOALPHA = <any US-ASCII lowercase letter "a".."z">
ALPHA = UPALPHA | LOALPHA
DIGIT = <any US-ASCII digit "0".."9">
CTL = <any US-ASCII control character
(octets 0 - 31) and DEL (127)>
CR = <US-ASCII CR, carriage return (13)>
LF = <US-ASCII LF, linefeed (10)>
SP = <US-ASCII SP, space (32)>
HT = <US-ASCII HT, horizontal-tab (9)>
<"> = <US-ASCII double-quote mark (34)>
CRLF = CR LF
LWS = [CRLF] 1*( SP | HT )
TEXT = <any OCTET except CTLs,
but including LWS>
HEX = "A" | "B" | "C" | "D" | "E" | "F"
| "a" | "b" | "c" | "d" | "e" | "f" | DIGIT
token = 1*<any CHAR except CTLs or separators>
separators = "(" | ")" | "<" | ">" | "@"
| "," | ";" | ":" | "\" | <">
| "/" | "[" | "]" | "?" | "="
| "{" | "}" | SP | HT
comment = "(" *( ctext | quoted-pair | comment ) ")"
ctext = <any TEXT excluding "(" and ")">
quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
qdtext = <any TEXT except <">>
quoted-pair = "\" CHAR
HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
HTTP-date = rfc1123-date | rfc850-date | asctime-date
rfc1123-date = wkday "," SP date1 SP time SP "GMT"
rfc850-date = weekday "," SP date2 SP time SP "GMT"
asctime-date = wkday SP date3 SP time SP 4DIGIT
date1 = 2DIGIT SP month SP 4DIGIT
; day month year (e.g., 02 Jun 1982)
date2 = 2DIGIT "-" month "-" 2DIGIT
; day-month-year (e.g., 02-Jun-82)
date3 = month SP ( 2DIGIT | ( SP 1DIGIT ))
; month day (e.g., Jun 2)
time = 2DIGIT ":" 2DIGIT ":" 2DIGIT
; 00:00:00 - 23:59:59
wkday = "Mon" | "Tue" | "Wed"
| "Thu" | "Fri" | "Sat" | "Sun"
weekday = "Monday" | "Tuesday" | "Wednesday"
| "Thursday" | "Friday" | "Saturday" | "Sunday"
month = "Jan" | "Feb" | "Mar" | "Apr"
| "May" | "Jun" | "Jul" | "Aug"
| "Sep" | "Oct" | "Nov" | "Dec"
delta-seconds = 1*DIGIT
charset = token
content-coding = token
transfer-coding = "chunked" | transfer-extension
transfer-extension = token *( ";" parameter )
parameter = attribute "=" value
attribute = token
value = token | quoted-string
Chunked-Body = *chunk
last-chunk
trailer
CRLF
chunk = chunk-size [ chunk-extension ] CRLF
chunk-data CRLF
chunk-size = 1*HEX
last-chunk = 1*("0") [ chunk-extension ] CRLF
chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
chunk-ext-name = token
chunk-ext-val = token | quoted-string
chunk-data = chunk-size(OCTET)
trailer = *(entity-header CRLF)
media-type = type "/" subtype *( ";" parameter )
type = token
subtype = token
product = token ["/" product-version]
product-version = token
qvalue = ( "0" [ "." 0*3DIGIT ] )
| ( "1" [ "." 0*3("0") ] )
language-tag = primary-tag *( "-" subtag )
primary-tag = 1*8ALPHA
subtag = 1*8ALPHA
entity-tag = [ weak ] opaque-tag
weak = "W/"
opaque-tag = quoted-string
range-unit = bytes-unit | other-range-unit
bytes-unit = "bytes"
other-range-unit = token
HTTP-message = Request | Response ; HTTP/1.1 messages
generic-message = start-line
*(message-header CRLF)
CRLF
[ message-body ]
start-line = Request-Line | Status-Line
message-header = field-name ":" [ field-value ]
field-name = token
field-value = *( field-content | LWS )
field-content = <the OCTETs making up the field-value
and consisting of either *TEXT or combinations
of token, separators, and quoted-string>
message-body = entity-body
| <entity-body encoded as per Transfer-Encoding>
general-header = Cache-Control ; Section 14.9
| Connection ; Section 14.10
| Date ; Section 14.18
| Pragma ; Section 14.32
| Trailer ; Section 14.40
| Transfer-Encoding ; Section 14.41
| Upgrade ; Section 14.42
| Via ; Section 14.45
| Warning ; Section 14.46
Request = Request-Line ; Section 5.1
*(( general-header ; Section 4.5
| request-header ; Section 5.3
| entity-header ) CRLF) ; Section 7.1
CRLF
[ message-body ] ; Section 4.3
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
Method = "OPTIONS" ; Section 9.2
| "GET" ; Section 9.3
| "HEAD" ; Section 9.4
| "POST" ; Section 9.5
| "PUT" ; Section 9.6
| "DELETE" ; Section 9.7
| "TRACE" ; Section 9.8
| "CONNECT" ; Section 9.9
| extension-method
extension-method = token
Request-URI = "*"
| absoluteURI
| abs_path [ "?" query ]
| authority
request-header = Accept ; Section 14.1
| Accept-Charset ; Section 14.2
| Accept-Encoding ; Section 14.3
| Accept-Language ; Section 14.4
| Authorization ; Section 14.8
| Expect ; Section 14.20
| From ; Section 14.22
| Host ; Section 14.23
| If-Match ; Section 14.24
| If-Modified-Since ; Section 14.25
| If-None-Match ; Section 14.26
| If-Range ; Section 14.27
| If-Unmodified-Since ; Section 14.28
| Max-Forwards ; Section 14.31
| Proxy-Authorization ; Section 14.34
| Range ; Section 14.35
| Referer ; Section 14.36
| TE ; Section 14.39
| User-Agent ; Section 14.43
Response = Status-Line ; Section 6.1
*(( general-header ; Section 4.5
| response-header ; Section 6.2
| entity-header ) CRLF) ; Section 7.1
CRLF
[ message-body ] ; Section 7.2
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
Status-Code =
"100" ; Section 10.1.1: Continue
| "101" ; Section 10.1.2: Switching Protocols
| "200" ; Section 10.2.1: OK
| "201" ; Section 10.2.2: Created
| "202" ; Section 10.2.3: Accepted
| "203" ; Section 10.2.4: Non-Authoritative Information
| "204" ; Section 10.2.5: No Content
| "205" ; Section 10.2.6: Reset Content
| "206" ; Section 10.2.7: Partial Content
| "300" ; Section 10.3.1: Multiple Choices
| "301" ; Section 10.3.2: Moved Permanently
| "302" ; Section 10.3.3: Found
| "303" ; Section 10.3.4: See Other
| "304" ; Section 10.3.5: Not Modified
| "305" ; Section 10.3.6: Use Proxy
| "307" ; Section 10.3.8: Temporary Redirect
| "400" ; Section 10.4.1: Bad Request
| "401" ; Section 10.4.2: Unauthorized
| "402" ; Section 10.4.3: Payment Required
| "403" ; Section 10.4.4: Forbidden
| "404" ; Section 10.4.5: Not Found
| "405" ; Section 10.4.6: Method Not Allowed
| "406" ; Section 10.4.7: Not Acceptable
| "407" ; Section 10.4.8: Proxy Authentication Required
| "408" ; Section 10.4.9: Request Time-out
| "409" ; Section 10.4.10: Conflict
| "410" ; Section 10.4.11: Gone
| "411" ; Section 10.4.12: Length Required
| "412" ; Section 10.4.13: Precondition Failed
| "413" ; Section 10.4.14: Request Entity Too Large
| "414" ; Section 10.4.15: Request-URI Too Large
| "415" ; Section 10.4.16: Unsupported Media Type
| "416" ; Section 10.4.17: Requested range not satisfiable
| "417" ; Section 10.4.18: Expectation Failed
| "500" ; Section 10.5.1: Internal Server Error
| "501" ; Section 10.5.2: Not Implemented
| "502" ; Section 10.5.3: Bad Gateway
| "503" ; Section 10.5.4: Service Unavailable
| "504" ; Section 10.5.5: Gateway Time-out
| "505" ; Section 10.5.6: HTTP Version not supported
| extension-code
extension-code = 3DIGIT
Reason-Phrase = *<TEXT, excluding CR, LF>
response-header = Accept-Ranges ; Section 14.5
| Age ; Section 14.6
| ETag ; Section 14.19
| Location ; Section 14.30
| Proxy-Authenticate ; Section 14.33
| Retry-After ; Section 14.37
| Server ; Section 14.38
| Vary ; Section 14.44
| WWW-Authenticate ; Section 14.47
entity-header = Allow ; Section 14.7
| Content-Encoding ; Section 14.11
| Content-Language ; Section 14.12
| Content-Length ; Section 14.13
| Content-Location ; Section 14.14
| Content-MD5 ; Section 14.15
| Content-Range ; Section 14.16
| Content-Type ; Section 14.17
| Expires ; Section 14.21
| Last-Modified ; Section 14.29
| extension-header
extension-header = message-header
entity-body = *OCTET
Accept = "Accept" ":"
#( media-range [ accept-params ] )
media-range = ( "*/*"
| ( type "/" "*" )
| ( type "/" subtype )
) *( ";" parameter )
accept-params = ";" "q" "=" qvalue *( accept-extension )
accept-extension = ";" token [ "=" ( token | quoted-string ) ]
Accept-Charset = "Accept-Charset" ":"
1#( ( charset | "*" )[ ";" "q" "=" qvalue ] )
Accept-Encoding = "Accept-Encoding" ":"
1#( codings [ ";" "q" "=" qvalue ] )
codings = ( content-coding | "*" )
Accept-Language = "Accept-Language" ":"
1#( language-range [ ";" "q" "=" qvalue ] )
language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" )
Accept-Ranges = "Accept-Ranges" ":" acceptable-ranges
acceptable-ranges = 1#range-unit | "none"
Age = "Age" ":" age-value
age-value = delta-seconds
Allow = "Allow" ":" #Method
Authorization = "Authorization" ":" credentials
Cache-Control = "Cache-Control" ":" 1#cache-directive
cache-directive = cache-request-directive
| cache-response-directive
cache-request-directive =
"no-cache" ; Section 14.9.1
| "no-store" ; Section 14.9.2
| "max-age" "=" delta-seconds ; Section 14.9.3, 14.9.4
| "max-stale" [ "=" delta-seconds ] ; Section 14.9.3
| "min-fresh" "=" delta-seconds ; Section 14.9.3
| "no-transform" ; Section 14.9.5
| "only-if-cached" ; Section 14.9.4
| cache-extension ; Section 14.9.6
cache-response-directive =
"public" ; Section 14.9.1
| "private" [ "=" <"> 1#field-name <"> ] ; Section 14.9.1
| "no-cache" [ "=" <"> 1#field-name <"> ]; Section 14.9.1
| "no-store" ; Section 14.9.2
| "no-transform" ; Section 14.9.5
| "must-revalidate" ; Section 14.9.4
| "proxy-revalidate" ; Section 14.9.4
| "max-age" "=" delta-seconds ; Section 14.9.3
| "s-maxage" "=" delta-seconds ; Section 14.9.3
| cache-extension ; Section 14.9.6
cache-extension = token [ "=" ( token | quoted-string ) ]
Connection = "Connection" ":" 1#(connection-token)
connection-token = token
Content-Encoding = "Content-Encoding" ":" 1#content-coding
Content-Language = "Content-Language" ":" 1#language-tag
Content-Length = "Content-Length" ":" 1*DIGIT
Content-Location = "Content-Location" ":"
( absoluteURI | relativeURI )
Content-MD5 = "Content-MD5" ":" md5-digest
md5-digest = <base64 of 128 bit MD5 digest as per [RFC1864]>
Content-Range = "Content-Range" ":" content-range-spec
content-range-spec = byte-content-range-spec
byte-content-range-spec = bytes-unit SP
byte-range-resp-spec "/"
( instance-length | "*" )
byte-range-resp-spec = (first-byte-pos "-" last-byte-pos)
| "*"
instance-length = 1*DIGIT
Content-Type = "Content-Type" ":" media-type
Date = "Date" ":" HTTP-date
ETag = "ETag" ":" entity-tag
Expect = "Expect" ":" 1#expectation
expectation = "100-continue" | expectation-extension
expectation-extension = token [ "=" ( token | quoted-string )
*expect-params ]
expect-params = ";" token [ "=" ( token | quoted-string ) ]
Expires = "Expires" ":" HTTP-date
From = "From" ":" mailbox
Host = "Host" ":" host [ ":" port ] ; Section 3.2.2
If-Match = "If-Match" ":" ( "*" | 1#entity-tag )
If-Modified-Since = "If-Modified-Since" ":" HTTP-date
If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag )
If-Range = "If-Range" ":" ( entity-tag | HTTP-date )
If-Unmodified-Since = "If-Unmodified-Since" ":" HTTP-date
Last-Modified = "Last-Modified" ":" HTTP-date
Location = "Location" ":" absoluteURI [ "#" fragment ]
Max-Forwards = "Max-Forwards" ":" 1*DIGIT
Pragma = "Pragma" ":" 1#pragma-directive
pragma-directive = "no-cache" | extension-pragma
extension-pragma = token [ "=" ( token | quoted-string ) ]
Proxy-Authenticate = "Proxy-Authenticate" ":" 1#challenge
Proxy-Authorization = "Proxy-Authorization" ":" credentials
ranges-specifier = byte-ranges-specifier
byte-ranges-specifier = bytes-unit "=" byte-range-set
byte-range-set = 1#( byte-range-spec | suffix-byte-range-spec )
byte-range-spec = first-byte-pos "-" [last-byte-pos]
first-byte-pos = 1*DIGIT
last-byte-pos = 1*DIGIT
suffix-byte-range-spec = "-" suffix-length
suffix-length = 1*DIGIT
Range = "Range" ":" ranges-specifier
Referer = "Referer" ":" ( absoluteURI | relativeURI )
Retry-After = "Retry-After" ":" ( HTTP-date | delta-seconds )
Server = "Server" ":" 1*( product | comment )
TE = "TE" ":" #( t-codings )
t-codings = "trailers" | ( transfer-extension [ accept-params ] )
Trailer = "Trailer" ":" 1#field-name
Transfer-Encoding = "Transfer-Encoding" ":" 1#transfer-coding
Upgrade = "Upgrade" ":" 1#product
User-Agent = "User-Agent" ":" 1*( product | comment )
Vary = "Vary" ":" ( "*" | 1#field-name )
Via = "Via" ":" 1#( received-protocol received-by [ comment ] )
received-protocol = [ protocol-name "/" ] protocol-version
protocol-name = token
protocol-version = token
received-by = ( host [ ":" port ] ) | pseudonym
pseudonym = token
Warning = "Warning" ":" 1#warning-value
warning-value = warn-code SP warn-agent SP warn-text
[SP warn-date]
warn-code = 3DIGIT
warn-agent = ( host [ ":" port ] ) | pseudonym
; the name or pseudonym of the server adding
; the Warning header, for use in debugging
warn-text = quoted-string
warn-date = <"> HTTP-date <">
WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge
MIME-Version = "MIME-Version" ":" 1*DIGIT "." 1*DIGIT
content-disposition = "Content-Disposition" ":"
disposition-type *( ";" disposition-parm )
disposition-type = "attachment" | disp-extension-token
disposition-parm = filename-parm | disp-extension-parm
filename-parm = "filename" "=" quoted-string
disp-extension-token = token
disp-extension-parm = token "=" ( token | quoted-string )
Received on Saturday, 17 March 2007 08:49:25 UTC