- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Sat, 17 Mar 2007 09:49:16 +0100
- To: HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <45FBAB8C.6010809@gmx.de>
Hi, here are some thoughts for Issue 36, the upgrade of the ABNF syntax (see </projects/w3c/WWW/Protocols/HTTP/1.1/rfc2616bis>): 1) I think there is agreement that the ABNF syntax needs to be upgraded to the syntax defined in RFC4234. 2) What's not so clear is what exact format we want to use. Currently, RFC2616 uses a format based on RFC822 (see <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.2.1>)m but augments it with a special "list" syntax, which makes the notation for many headers much easier to read (see <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.2.1.p.9>). We need to decide whether we want to keep that extension in RFC2616bis, or just live with the RFC4234 syntax. 3) Related to that, RFC2616 defines special treatment of Linear White Space (LWS), see <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.2.1.p.11>). It's up for discussion whether we want to keep that aspect. See <http://lists.w3.org/Archives/Public/ietf-http-wg/2007JanMar/0116.html> for some comments from Roy Fielding about that subject. So how do we proceed? I think it would be extremely useful to be able to extract & parse the embedded ABNF productions, generating somewhat normalized output. That would allow us to ensure that while we are updating the spec, we don't break the grammar (or actually, that we do not introduce *unintented* changes). Extraction is no problem. The XML source (<http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/draft-lafon-rfc2616bis-latest.xml>) already has the ABNF fragments marked up accordingly. A tool for extraction is available in my rfc2629.xslt package (<http://greenbytes.de/tech/webdav/rfc2629xslt/rfc2629xslt.html#extract-artwork>) -- I'm attaching the output. The next step would be to run this through a parser. Here's where we have the first problems: - we'd need a parser for the RFC2616 variant of the ABNF syntax (does anybody have that?), and - the ABNF itself must parse. Currently there's at least one problem with qdtext = <any TEXT except <">> which IMHO needs to be fixed not to contain the literals "<" and ">". I would like to make that change with the next draft we submit. Optimally, we would then find a parser that will accept both rfc2616-style ABNF and RFC4234-style ABNF, and which will re-serialize into a single RFC4234-based format that we can then use to verify changes. Best regards, Julian
OCTET = <any 8-bit sequence of data> CHAR = <any US-ASCII character (octets 0 - 127)> UPALPHA = <any US-ASCII uppercase letter "A".."Z"> LOALPHA = <any US-ASCII lowercase letter "a".."z"> ALPHA = UPALPHA | LOALPHA DIGIT = <any US-ASCII digit "0".."9"> CTL = <any US-ASCII control character (octets 0 - 31) and DEL (127)> CR = <US-ASCII CR, carriage return (13)> LF = <US-ASCII LF, linefeed (10)> SP = <US-ASCII SP, space (32)> HT = <US-ASCII HT, horizontal-tab (9)> <"> = <US-ASCII double-quote mark (34)> CRLF = CR LF LWS = [CRLF] 1*( SP | HT ) TEXT = <any OCTET except CTLs, but including LWS> HEX = "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT token = 1*<any CHAR except CTLs or separators> separators = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" | "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" | "}" | SP | HT comment = "(" *( ctext | quoted-pair | comment ) ")" ctext = <any TEXT excluding "(" and ")"> quoted-string = ( <"> *(qdtext | quoted-pair ) <"> ) qdtext = <any TEXT except <">> quoted-pair = "\" CHAR HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]] HTTP-date = rfc1123-date | rfc850-date | asctime-date rfc1123-date = wkday "," SP date1 SP time SP "GMT" rfc850-date = weekday "," SP date2 SP time SP "GMT" asctime-date = wkday SP date3 SP time SP 4DIGIT date1 = 2DIGIT SP month SP 4DIGIT ; day month year (e.g., 02 Jun 1982) date2 = 2DIGIT "-" month "-" 2DIGIT ; day-month-year (e.g., 02-Jun-82) date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) ; month day (e.g., Jun 2) time = 2DIGIT ":" 2DIGIT ":" 2DIGIT ; 00:00:00 - 23:59:59 wkday = "Mon" | "Tue" | "Wed" | "Thu" | "Fri" | "Sat" | "Sun" weekday = "Monday" | "Tuesday" | "Wednesday" | "Thursday" | "Friday" | "Saturday" | "Sunday" month = "Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun" | "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec" delta-seconds = 1*DIGIT charset = token content-coding = token transfer-coding = "chunked" | transfer-extension transfer-extension = token *( ";" parameter ) parameter = attribute "=" value attribute = token value = token | quoted-string Chunked-Body = *chunk last-chunk trailer CRLF chunk = chunk-size [ chunk-extension ] CRLF chunk-data CRLF chunk-size = 1*HEX last-chunk = 1*("0") [ chunk-extension ] CRLF chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] ) chunk-ext-name = token chunk-ext-val = token | quoted-string chunk-data = chunk-size(OCTET) trailer = *(entity-header CRLF) media-type = type "/" subtype *( ";" parameter ) type = token subtype = token product = token ["/" product-version] product-version = token qvalue = ( "0" [ "." 0*3DIGIT ] ) | ( "1" [ "." 0*3("0") ] ) language-tag = primary-tag *( "-" subtag ) primary-tag = 1*8ALPHA subtag = 1*8ALPHA entity-tag = [ weak ] opaque-tag weak = "W/" opaque-tag = quoted-string range-unit = bytes-unit | other-range-unit bytes-unit = "bytes" other-range-unit = token HTTP-message = Request | Response ; HTTP/1.1 messages generic-message = start-line *(message-header CRLF) CRLF [ message-body ] start-line = Request-Line | Status-Line message-header = field-name ":" [ field-value ] field-name = token field-value = *( field-content | LWS ) field-content = <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, separators, and quoted-string> message-body = entity-body | <entity-body encoded as per Transfer-Encoding> general-header = Cache-Control ; Section 14.9 | Connection ; Section 14.10 | Date ; Section 14.18 | Pragma ; Section 14.32 | Trailer ; Section 14.40 | Transfer-Encoding ; Section 14.41 | Upgrade ; Section 14.42 | Via ; Section 14.45 | Warning ; Section 14.46 Request = Request-Line ; Section 5.1 *(( general-header ; Section 4.5 | request-header ; Section 5.3 | entity-header ) CRLF) ; Section 7.1 CRLF [ message-body ] ; Section 4.3 Request-Line = Method SP Request-URI SP HTTP-Version CRLF Method = "OPTIONS" ; Section 9.2 | "GET" ; Section 9.3 | "HEAD" ; Section 9.4 | "POST" ; Section 9.5 | "PUT" ; Section 9.6 | "DELETE" ; Section 9.7 | "TRACE" ; Section 9.8 | "CONNECT" ; Section 9.9 | extension-method extension-method = token Request-URI = "*" | absoluteURI | abs_path [ "?" query ] | authority request-header = Accept ; Section 14.1 | Accept-Charset ; Section 14.2 | Accept-Encoding ; Section 14.3 | Accept-Language ; Section 14.4 | Authorization ; Section 14.8 | Expect ; Section 14.20 | From ; Section 14.22 | Host ; Section 14.23 | If-Match ; Section 14.24 | If-Modified-Since ; Section 14.25 | If-None-Match ; Section 14.26 | If-Range ; Section 14.27 | If-Unmodified-Since ; Section 14.28 | Max-Forwards ; Section 14.31 | Proxy-Authorization ; Section 14.34 | Range ; Section 14.35 | Referer ; Section 14.36 | TE ; Section 14.39 | User-Agent ; Section 14.43 Response = Status-Line ; Section 6.1 *(( general-header ; Section 4.5 | response-header ; Section 6.2 | entity-header ) CRLF) ; Section 7.1 CRLF [ message-body ] ; Section 7.2 Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF Status-Code = "100" ; Section 10.1.1: Continue | "101" ; Section 10.1.2: Switching Protocols | "200" ; Section 10.2.1: OK | "201" ; Section 10.2.2: Created | "202" ; Section 10.2.3: Accepted | "203" ; Section 10.2.4: Non-Authoritative Information | "204" ; Section 10.2.5: No Content | "205" ; Section 10.2.6: Reset Content | "206" ; Section 10.2.7: Partial Content | "300" ; Section 10.3.1: Multiple Choices | "301" ; Section 10.3.2: Moved Permanently | "302" ; Section 10.3.3: Found | "303" ; Section 10.3.4: See Other | "304" ; Section 10.3.5: Not Modified | "305" ; Section 10.3.6: Use Proxy | "307" ; Section 10.3.8: Temporary Redirect | "400" ; Section 10.4.1: Bad Request | "401" ; Section 10.4.2: Unauthorized | "402" ; Section 10.4.3: Payment Required | "403" ; Section 10.4.4: Forbidden | "404" ; Section 10.4.5: Not Found | "405" ; Section 10.4.6: Method Not Allowed | "406" ; Section 10.4.7: Not Acceptable | "407" ; Section 10.4.8: Proxy Authentication Required | "408" ; Section 10.4.9: Request Time-out | "409" ; Section 10.4.10: Conflict | "410" ; Section 10.4.11: Gone | "411" ; Section 10.4.12: Length Required | "412" ; Section 10.4.13: Precondition Failed | "413" ; Section 10.4.14: Request Entity Too Large | "414" ; Section 10.4.15: Request-URI Too Large | "415" ; Section 10.4.16: Unsupported Media Type | "416" ; Section 10.4.17: Requested range not satisfiable | "417" ; Section 10.4.18: Expectation Failed | "500" ; Section 10.5.1: Internal Server Error | "501" ; Section 10.5.2: Not Implemented | "502" ; Section 10.5.3: Bad Gateway | "503" ; Section 10.5.4: Service Unavailable | "504" ; Section 10.5.5: Gateway Time-out | "505" ; Section 10.5.6: HTTP Version not supported | extension-code extension-code = 3DIGIT Reason-Phrase = *<TEXT, excluding CR, LF> response-header = Accept-Ranges ; Section 14.5 | Age ; Section 14.6 | ETag ; Section 14.19 | Location ; Section 14.30 | Proxy-Authenticate ; Section 14.33 | Retry-After ; Section 14.37 | Server ; Section 14.38 | Vary ; Section 14.44 | WWW-Authenticate ; Section 14.47 entity-header = Allow ; Section 14.7 | Content-Encoding ; Section 14.11 | Content-Language ; Section 14.12 | Content-Length ; Section 14.13 | Content-Location ; Section 14.14 | Content-MD5 ; Section 14.15 | Content-Range ; Section 14.16 | Content-Type ; Section 14.17 | Expires ; Section 14.21 | Last-Modified ; Section 14.29 | extension-header extension-header = message-header entity-body = *OCTET Accept = "Accept" ":" #( media-range [ accept-params ] ) media-range = ( "*/*" | ( type "/" "*" ) | ( type "/" subtype ) ) *( ";" parameter ) accept-params = ";" "q" "=" qvalue *( accept-extension ) accept-extension = ";" token [ "=" ( token | quoted-string ) ] Accept-Charset = "Accept-Charset" ":" 1#( ( charset | "*" )[ ";" "q" "=" qvalue ] ) Accept-Encoding = "Accept-Encoding" ":" 1#( codings [ ";" "q" "=" qvalue ] ) codings = ( content-coding | "*" ) Accept-Language = "Accept-Language" ":" 1#( language-range [ ";" "q" "=" qvalue ] ) language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" ) Accept-Ranges = "Accept-Ranges" ":" acceptable-ranges acceptable-ranges = 1#range-unit | "none" Age = "Age" ":" age-value age-value = delta-seconds Allow = "Allow" ":" #Method Authorization = "Authorization" ":" credentials Cache-Control = "Cache-Control" ":" 1#cache-directive cache-directive = cache-request-directive | cache-response-directive cache-request-directive = "no-cache" ; Section 14.9.1 | "no-store" ; Section 14.9.2 | "max-age" "=" delta-seconds ; Section 14.9.3, 14.9.4 | "max-stale" [ "=" delta-seconds ] ; Section 14.9.3 | "min-fresh" "=" delta-seconds ; Section 14.9.3 | "no-transform" ; Section 14.9.5 | "only-if-cached" ; Section 14.9.4 | cache-extension ; Section 14.9.6 cache-response-directive = "public" ; Section 14.9.1 | "private" [ "=" <"> 1#field-name <"> ] ; Section 14.9.1 | "no-cache" [ "=" <"> 1#field-name <"> ]; Section 14.9.1 | "no-store" ; Section 14.9.2 | "no-transform" ; Section 14.9.5 | "must-revalidate" ; Section 14.9.4 | "proxy-revalidate" ; Section 14.9.4 | "max-age" "=" delta-seconds ; Section 14.9.3 | "s-maxage" "=" delta-seconds ; Section 14.9.3 | cache-extension ; Section 14.9.6 cache-extension = token [ "=" ( token | quoted-string ) ] Connection = "Connection" ":" 1#(connection-token) connection-token = token Content-Encoding = "Content-Encoding" ":" 1#content-coding Content-Language = "Content-Language" ":" 1#language-tag Content-Length = "Content-Length" ":" 1*DIGIT Content-Location = "Content-Location" ":" ( absoluteURI | relativeURI ) Content-MD5 = "Content-MD5" ":" md5-digest md5-digest = <base64 of 128 bit MD5 digest as per [RFC1864]> Content-Range = "Content-Range" ":" content-range-spec content-range-spec = byte-content-range-spec byte-content-range-spec = bytes-unit SP byte-range-resp-spec "/" ( instance-length | "*" ) byte-range-resp-spec = (first-byte-pos "-" last-byte-pos) | "*" instance-length = 1*DIGIT Content-Type = "Content-Type" ":" media-type Date = "Date" ":" HTTP-date ETag = "ETag" ":" entity-tag Expect = "Expect" ":" 1#expectation expectation = "100-continue" | expectation-extension expectation-extension = token [ "=" ( token | quoted-string ) *expect-params ] expect-params = ";" token [ "=" ( token | quoted-string ) ] Expires = "Expires" ":" HTTP-date From = "From" ":" mailbox Host = "Host" ":" host [ ":" port ] ; Section 3.2.2 If-Match = "If-Match" ":" ( "*" | 1#entity-tag ) If-Modified-Since = "If-Modified-Since" ":" HTTP-date If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag ) If-Range = "If-Range" ":" ( entity-tag | HTTP-date ) If-Unmodified-Since = "If-Unmodified-Since" ":" HTTP-date Last-Modified = "Last-Modified" ":" HTTP-date Location = "Location" ":" absoluteURI [ "#" fragment ] Max-Forwards = "Max-Forwards" ":" 1*DIGIT Pragma = "Pragma" ":" 1#pragma-directive pragma-directive = "no-cache" | extension-pragma extension-pragma = token [ "=" ( token | quoted-string ) ] Proxy-Authenticate = "Proxy-Authenticate" ":" 1#challenge Proxy-Authorization = "Proxy-Authorization" ":" credentials ranges-specifier = byte-ranges-specifier byte-ranges-specifier = bytes-unit "=" byte-range-set byte-range-set = 1#( byte-range-spec | suffix-byte-range-spec ) byte-range-spec = first-byte-pos "-" [last-byte-pos] first-byte-pos = 1*DIGIT last-byte-pos = 1*DIGIT suffix-byte-range-spec = "-" suffix-length suffix-length = 1*DIGIT Range = "Range" ":" ranges-specifier Referer = "Referer" ":" ( absoluteURI | relativeURI ) Retry-After = "Retry-After" ":" ( HTTP-date | delta-seconds ) Server = "Server" ":" 1*( product | comment ) TE = "TE" ":" #( t-codings ) t-codings = "trailers" | ( transfer-extension [ accept-params ] ) Trailer = "Trailer" ":" 1#field-name Transfer-Encoding = "Transfer-Encoding" ":" 1#transfer-coding Upgrade = "Upgrade" ":" 1#product User-Agent = "User-Agent" ":" 1*( product | comment ) Vary = "Vary" ":" ( "*" | 1#field-name ) Via = "Via" ":" 1#( received-protocol received-by [ comment ] ) received-protocol = [ protocol-name "/" ] protocol-version protocol-name = token protocol-version = token received-by = ( host [ ":" port ] ) | pseudonym pseudonym = token Warning = "Warning" ":" 1#warning-value warning-value = warn-code SP warn-agent SP warn-text [SP warn-date] warn-code = 3DIGIT warn-agent = ( host [ ":" port ] ) | pseudonym ; the name or pseudonym of the server adding ; the Warning header, for use in debugging warn-text = quoted-string warn-date = <"> HTTP-date <"> WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge MIME-Version = "MIME-Version" ":" 1*DIGIT "." 1*DIGIT content-disposition = "Content-Disposition" ":" disposition-type *( ";" disposition-parm ) disposition-type = "attachment" | disp-extension-token disposition-parm = filename-parm | disp-extension-parm filename-parm = "filename" "=" quoted-string disp-extension-token = token disp-extension-parm = token "=" ( token | quoted-string )
Received on Saturday, 17 March 2007 08:49:25 UTC