- From: Foteos Macrides <MACRIDES@sci.wfbr.edu>
- Date: Tue, 23 Sep 1997 13:59:10 -0500 (EST)
- To: dmk@research.bell-labs.com
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com, http-state@research.bell-labs.com
dmk@research.bell-labs.com (Dave Kristol) wrote:
>Gisle Aas <aas@bergen.sn.no> wrote:
>
>> dmk@research.bell-labs.com (Dave Kristol) writes:
>>
>> > Gisle Aas <aas@bergen.sn.no> wrote on 19 Sep 1997 11:59:27 +0200:
>> >
>> > > what I have done now is to let URL-encoded chars and unencoded chars
>> > > match and then let "%2F" and "/" be the exception. Perhaps ";" should
>> > > be special too?
>> >
>> > I'm not sure I understand what you're saying. It sounds like you're saying
>> > the % escapes wouldn't be interpreted, except sometimes. So
>> >     Path="/%66oo"
>> > and Path="/foo"
>> > would be different, but
>> >     Path="/foo%2F"
>> > and Path="/foo/"
>> > would be the same.
>>
>> No, I meant the opposite. "%2F" denotes a literal slash in a path
>> segment. "/" is the path segment separator. They are not the same
>> thing and I don't think they should match (but "%2f" and "%2F" is the
>> same thing). "%66" and "f" is encodings for the same unreserved
>> character, so they match.
>>
>> draft-fielding-url-syntax-07.txt says:
>>
>> | 2.3. Unreserved Characters
>> |
>> | Data characters which are allowed in a URL but do not have a reserved
>> | purpose are called unreserved. These include upper and lower case
>> | letters, decimal digits, and a limited set of punctuation marks and
>> | symbols.
>> [...]
>> | 4.3.2. Path Component
>> |
>> | The path component contains data, specific to the site (or the scheme
>> | if there is no site component), identifying the resource within the
>> | scope of that scheme and site.
>> |
>> |    path          = [ "/" ] path_segments
>> |
>> |    path_segments = segment *( "/" segment )
>> |    segment       = *pchar *( ";" param )
>> |    param         = *pchar
>> |
>> |    pchar         = unreserved | escaped | ":" | "@" | "&" | "=" | "+"
>> [...]
>> Which might suggest that "/", ";", "=", and "?" should not match their
>> URL-encoded versions.
>>
>> [DMK's inappropriate citation of RFC 822 deleted]
>
>Okay, I see your point now.
>I think what you're saying, in fact, is
>any character that is not a pchar must be escaped, and that an explicit
>non-pchar does not match its escaped form.
>
>So now we need words to say that. How about these, for 4.3.4, under
>Path Selection:
>
>Two paths match if
>    - respective characters match exactly; or
>    - respective characters are represented by % escapes, and the values
>      of the % escapes are equal; or
>    - one path has an explicit character, the second path has a % escape,
>      the % escape evaluates to the character, and the character
>      is a pchar (RFC 2068).
>Otherwise the paths do not match.

	You're getting into the can of worms with which the MHTML-WG has
been struggling, and it would be unwise, IMHO, to take it on explicitly in
the cookie specs as well, rather than leaving it as it is now, with an
implicit definition of "match" as "whatever the current HTTP specs define
that to mean for the path field of http[s] URLs (presently in Section 3.2.3
of RFC 2068 and the -08 revision draft)".

	Technically, semi-colons should be hex escaped when they are not
parameter delimiters, but a URL parameter presently is defined only for
ftp URLs (;type=[A, I, or D]) and virtually all deployed UAs treat any
unescaped semi-colons "in"/"near" http[s] URL path fields as ordinary
characters belonging to the immediately preceding element.  Note also
that Section 3.2.3, through the -08 draft, specifies parameter handling
in http URLs according to RFC 1738 and RFC 1808, but the -07 URL draft
(as in its preceding -0n versions) has changed that in relation to
"current practice", i.e., it now describes the apparent resolving
behavior of deployed UAs, which in fact are not paying any attention to
semi-colons as parameter delimiters in http[s] URLs (because there
aren't any parameters defined for that scheme :).  Presumably, the
HTTP/1.1 draft's Section 3.2.3 will be updated homologously before it
moves on to Draft Standard.

	The -07 URL draft also has wording changes to make clear that
URLs are sequences of characters, not necessarily octets, in anticipation
of explicit standards for internationalized URLs.  When the latter
standards are forthcoming (perhaps with parameters to indicate charsets)
the matching and parameter handling rules might need further revisions.

	So, again, it MAY be better to handle the issue of path matching
via indirect reference to HTTP specs, rather than trying to spell it out
explicitly within the cookie specs. :)

				Fote

=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 MACRIDES@SCI.WFBR.EDU         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================
Received on Tuesday, 23 September 1997 11:07:27 UTC
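
For readers following the thread, the three-rule match that Dave Kristol
proposes in the quoted text can be sketched roughly as below. This is only
an illustration under stated assumptions, not text from any draft: the
names tokenize and paths_match are hypothetical, the PCHAR set is an
approximation of the pchar grammar quoted in the thread (the exact
punctuation differs between RFC 2068 and the URL syntax drafts), and the
sketch compares whole paths only, leaving cookie prefix matching aside.

```python
import re
import string

# Assumption: an approximation of "pchar" (unreserved characters plus
# ":", "@", "&", "=", "+"). The real set depends on which spec is used.
UNRESERVED = set(string.ascii_letters + string.digits + "-_.!~*'()")
PCHAR = UNRESERVED | set(":@&=+")

# A token is either a %XX escape or a single explicit character.
_TOKEN = re.compile(r"%[0-9A-Fa-f]{2}|.", re.DOTALL)


def tokenize(path):
    """Split a path into (character, was_escaped) pairs."""
    tokens = []
    for tok in _TOKEN.findall(path):
        if len(tok) == 3:  # only a %XX escape is three characters long
            tokens.append((chr(int(tok[1:], 16)), True))
        else:
            tokens.append((tok, False))
    return tokens


def paths_match(a, b):
    """Apply the three proposed rules position by position."""
    ta, tb = tokenize(a), tokenize(b)
    if len(ta) != len(tb):
        return False
    for (ca, escaped_a), (cb, escaped_b) in zip(ta, tb):
        if ca != cb:
            return False  # values differ, so no rule can apply
        # Rules 1 and 2: both explicit or both escaped, values equal.
        # Rule 3: one explicit and one escaped, allowed only for a pchar.
        if escaped_a != escaped_b and ca not in PCHAR:
            return False
    return True
```

Under these assumptions "/%66oo" matches "/foo" and "/foo%2f" matches
"/foo%2F", while "/foo%2F" does not match "/foo/", which is the behaviour
Gisle Aas describes. Whether ";" and "=" belong in the pchar set is exactly
the open question raised in the reply, so the set is kept in one place
where it can be changed.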