URLLEN: consensus

This is a replacement for section 3.2.1 on the syntax of URUs. It adds a
paragraph to clarify the issue of URL length. (Changes are marked with
change bars.)

This issue was discussed on the list, and this is believed to represent
the consensus on this issue. If you believe otherwise, please let me
know; otherwise, this issue will be closed.
-------------------------

3.2.1 General Syntax

   URIs in HTTP can be represented in absolute form or relative to 
   some known base URI [11], depending upon the context of their use. 
   The two forms are differentiated by the fact that absolute URIs 
   always begin with a scheme name followed by a colon.

       URI            = ( absoluteURI | relativeURI ) [ "#" fragment ]

       absoluteURI    = scheme ":" *( uchar | reserved )

       relativeURI    = net_path | abs_path | rel_path

       net_path       = "//" net_loc [ abs_path ]
       abs_path       = "/" rel_path
       rel_path       = [ path ] [ ";" params ] [ "?" query ]

       path           = fsegment *( "/" segment )
       fsegment       = 1*pchar
       segment        = *pchar

       params         = param *( ";" param )
       param          = *( pchar | "/" )

       scheme         = 1*( ALPHA | DIGIT | "+" | "-" | "." )
       net_loc        = *( pchar | ";" | "?" )
       query          = *( uchar | reserved )
       fragment       = *( uchar | reserved )

       pchar          = uchar | ":" | "@" | "&" | "="
       uchar          = unreserved | escape
       unreserved     = ALPHA | DIGIT | safe | extra | national

       escape         = "%" HEX HEX
       reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "="
       extra          = "!" | "*" | "'" | "(" | ")" | ","
       safe           = "$" | "-" | "_" | "." | "+"
       unsafe         = CTL | SP | <"> | "#" | "%" | "<" | ">"
       national       = <any OCTET excluding ALPHA, DIGIT,
                        reserved, extra, safe, and unsafe>

   For definitive information on URL syntax and semantics, see RFC 
   1738 [4] and RFC 1808 [11]. The BNF above includes national 
   characters not allowed in valid URLs as specified by RFC 1738, 
   since HTTP servers are not restricted in the set of unreserved 
   characters allowed to represent the rel_path part of addresses, and 
   HTTP proxies may receive requests for URIs not defined by RFC 1738.

|   The HTTP protocol does not place any a-priori limit on the length of
a URI.
|   Servers MUST be able to handle the URI of any resource they serve,
|   and SHOULD be able to handle URIs of unbounded length if they
provide
|   GET-based forms that could generate such URIs. A server SHOULD
return
|   a status code of 
|	414 Request-URI Too Large
|   if a URI is longer than the server can handle.
|	
|	Note:
|	   Servers should be cautious about depending on URI lengths above
|  	   255 bytes, because some older client or proxy implementations may
|	   not properly support these.
|
|   All client and proxy implementations MUST be able to handle a URI
|   of any finite length. 


----------------------------------------------------
Paul J. Leach            Email: paulle@microsoft.com
Microsoft                Phone: 1-206-882-8080
1 Microsoft Way          Fax:   1-206-936-7329
Redmond, WA 98052

Received on Monday, 1 April 1996 03:02:02 UTC