Re: Proposition on advanced URL features (Is # illegal)?

Roy T. Fielding (fielding@avron.ICS.UCI.EDU)
Thu, 30 Nov 1995 21:12:33 -0800


To: Mirsad Todorovac <tm@rasips1.rasip.etf.hr>
Cc: jim@eies.njit.edu, uri@bunyip.com, www-talk@w3.org
Subject: Re: Proposition on advanced URL features (Is # illegal)? 
In-Reply-To: Your message of "Wed, 29 Nov 1995 11:12:36 +0100."
             <199511291012.LAA27469@rasips1.rasip.etf.hr> 
Date: Thu, 30 Nov 1995 21:12:33 -0800
From: "Roy T. Fielding" <fielding@avron.ICS.UCI.EDU>
Message-Id:  <9511302112.aa07271@paris.ics.uci.edu>

>> > 1.  The use of ## for special anchors seems reasonable.
>> 
>> Use of more than one "#" character is illegal and not desirable
>> in the current URI syntax.
> 
> That's an interesting point.  Let's look at this quote from RFC 1808
> (by R. Fielding):
> 
> |2.4.1.  Parsing the Fragment Identifier
> | 
> |   If the parse string contains a crosshatch "#" character, then the
> |   substring after the first (left-most) crosshatch "#" and up to the
> |   end of the parse string is the <fragment> identifier.  If the
> |   crosshatch is the last character, or no crosshatch is present, then  
> |   the fragment identifier is empty.  The matched substring, including  
> |   the crosshatch character, is removed from the parse string before
> |   continuing.
> | 
> |   Note that the fragment identifier is not considered part of the URL.
> |   However, since it is often attached to the URL, parsers must be able
> |   to recognize and set aside fragment identifiers as part of the 
> |   process.
> | 
> 
> It states clearly 'the first (left-most) crosshatch "#" and up to the
> end of the parse string is the <fragment> identifier'.  This _does_ imply
> that more than one '#' character may appear ... Why say ``leftmost "#"
> character'' if only one is allowed? -- Mirsad

Because I believe in robust parsing.  Look at the BNF (also in RFC 1808).
There is no conflict between the two, and the BNF does not allow "#"
anywhere but immediately preceding the fragment.  Some would call this
weasel wording, but I call it good design.  ;-)
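
To illustrate the distinction, here is a minimal sketch in Python.  It is
not taken from RFC 1808; the function names are hypothetical, and the
strictness check assumes only what is said above, namely that the BNF
admits "#" solely as the delimiter in front of the fragment.  The parser
follows section 2.4.1 and splits at the left-most "#" whatever else the
string contains, while the separate check reports whether the string
would be legal.

```python
def split_fragment(parse_string):
    """Split at the first (left-most) "#", as in RFC 1808 section 2.4.1.

    Returns (remainder, fragment).  The fragment is empty when there is
    no "#" or when "#" is the last character.
    """
    remainder, _sep, fragment = parse_string.partition("#")
    return remainder, fragment


def has_legal_crosshatch_use(parse_string):
    """Hypothetical strictness check (a simplification, not the full BNF):
    "#" may appear only once, immediately preceding the fragment."""
    return parse_string.count("#") <= 1


# A robust parser still yields a usable split for a malformed string.
print(split_fragment("http://host/path#a#b"))            # ('http://host/path', 'a#b')
print(has_legal_crosshatch_use("http://host/path#a#b"))  # False
```

The parser accepts more than the grammar generates, so the prose and the
BNF can both hold without contradiction.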

 ...Roy T. Fielding
    Department of Information & Computer Science    (fielding@ics.uci.edu)
    University of California, Irvine, CA 92717-3425    fax:+1(714)824-4056
    http://www.ics.uci.edu/~fielding/