- From: Foteos Macrides <MACRIDES@sci.wfbr.edu>
- Date: Mon, 28 Apr 1997 11:22:10 -0500 (EST)
- To: masinter@parc.xerox.com
- Cc: fielding@kiwi.ics.uci.edu, uri@bunyip.com
Larry Masinter <masinter@parc.xerox.com> wrote:
>The enormous flap over internationalization left me with little
>time to deal with other issues. I don't quite understand where
>we want to go with some of the other issues.
>
>> The rules for resolving partial/relative URLs since the
>> beginning of URL time have been such that if relative symbolic
>> elements end up at the beginning of paths they should be retained,
>> e.g., you can end up with something like:
>>
>> http://host/../foo/blah.html
>>
>> but Netscape's parsing ends up stripping lead relative symbolic
>> elements yielding:
>>
>> http://host/foo/blah.html
>>
>> with the consequence that many people are putting HREFs and SRCs
>> in their markup which by "valid" parsing rules yield lead
>> relative symbolic elements, and sending of "false bug reports"
>> to non-Netscape browser developers with one or another variant
>> of:
>>
>> "It works fine with Netscape."
>>
>> I can see retaining the lead relative symbolic elements
>> in ftp URLs for personal accounts (would generally fail for
>> anonymous accounts), but to my knowledge no http or https server
>> would accept such paths, so there's that kind of justification
>> what Netscape is doing.
>>
>> I would appreciate your and others' opinions on whether
>> it would be good or bad for other browsers to reverse engineer
>> for that Netscape URL resolving.
>>
>> Fote
>
>Was there any resolution of this issue?
Since posting that question, I've received feedback that the current versions of most browsers, not just Netscape and the current version of MSIE, trim a lead relative symbolic element in the paths for http/https requests. My own predisposition, though, is to leave the generic parsing rules as they presently stand in the draft, and to treat the http/https problem as an implementation issue, or as a special case for http/https (homologously to how things now stand for the lead slash in ftp URLs).
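To make the two results quoted above concrete, here is a minimal sketch of the path-merging step as I read RFC1808, followed by a model of the trimming behaviour. The names merge_paths_rfc1808 and strip_lead_dotdot are hypothetical, the trimming function only models the observed result and is not any browser's actual code, and the relative path is assumed to carry no scheme, host, query, or leading slash.

```python
def merge_paths_rfc1808(base_path, rel_path):
    """Merge rel_path into base_path, retaining any lead '..' segments.

    Hypothetical helper: a sketch of the RFC1808 path step only.
    """
    # Drop the base's last segment (everything after the rightmost slash).
    segments = base_path.lstrip("/").split("/")[:-1] + rel_path.split("/")
    out = []
    for i, seg in enumerate(segments):
        is_last = i == len(segments) - 1
        if seg == ".":
            # "." is simply dropped; a trailing one leaves a trailing slash.
            if is_last:
                out.append("")
        elif seg == ".." and out and out[-1] != "..":
            # "<segment>/.." cancels out when there is a real segment to remove.
            out.pop()
            if is_last:
                out.append("")
        else:
            # Ordinary segment, or a ".." with nothing left to cancel: keep it.
            out.append(seg)
    return "/" + "/".join(out)


def strip_lead_dotdot(path):
    """Model of the trimming behaviour: drop '..' segments at the path's start."""
    while path.startswith("/../"):
        path = path[3:]
    return "/" if path == "/.." else path


resolved = merge_paths_rfc1808("/", "../foo/blah.html")
print(resolved)                     # /../foo/blah.html
print(strip_lead_dotdot(resolved))  # /foo/blah.html
```

Keeping the trimming as a separate step after the merge mirrors the preference stated above: the generic rules stay untouched, and any http/https special case sits outside them.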
I tried leaving the formal parsing functions in Lynx compliant with RFC1808 and/or the draft, but adding a hack of 3 or so lines to the stream parser. When a partial HREF or SRC value is "..", "../", or "../whatever", the hack checks whether the base has an http or https scheme and lacks a path (i.e., has just the default '/'), and if so makes the value an absolute, "doctored" path ("/" or "/whatever") before passing the base and the HREF or SRC value to the formal parsing functions. In conjunction with that "doctoring", Lynx issues a statusline message about the bad partial reference, so that there is immediate feedback which might lead to its being corrected. HREF or SRC values such as "./../whatever" or "../../whatever", and values with absolute paths or URLs that include lead relative symbolic elements, would still end up with them after formal parsing/resolving.
This selective "doctoring" seems to deal with all the bad partial references which have become common in documents during the past year. I suspect, as someone else suggested, that they are due to a bug in some authoring tool, in which case it seems best simply to track down that tool and suggest that the bug be corrected. I don't know if this "doctoring" hack will be incorporated into a formal Lynx release, or if we'll just keep letting the resolved http/https URLs fail, but the formal parsing functions won't be changed unless the draft's rules are changed before it becomes an RFC.
It's been made clear that the new URL-WG is only for discussions about the PROCESS of approving drafts for URLs, but no one has yet answered the recurring question, most recently from Dan Connolly, about which is (are) the proper forum(s) for discussing the SUBSTANCE of the drafts. It can be inferred that it's this forum of the disbanded URI-WG, but could you or someone please answer the latter question explicitly?
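For anyone wanting to experiment with the pre-parse "doctoring" of HREF and SRC values described above, the check can be sketched roughly as follows. This is an illustration in Python, not the Lynx source; the function name doctor_href and its return convention are assumptions, and the flag in the return value merely stands in for the statusline message.

```python
from urllib.parse import urlsplit


def doctor_href(base, href):
    """Return (href_to_resolve, doctored) for a partial HREF or SRC value.

    Hypothetical sketch: only "..", "../" and "../whatever" are rewritten,
    and only when the base is http/https with no path beyond the default '/'.
    """
    parts = urlsplit(base)  # urlsplit lowercases the scheme for us
    base_lacks_path = parts.path in ("", "/")
    if parts.scheme in ("http", "https") and base_lacks_path:
        if href in ("..", "../"):
            return "/", True
        if href.startswith("../"):
            # Replace the single lead "../" with an absolute "/".
            return "/" + href[3:], True
    # Everything else goes to the formal parsing functions unchanged.
    return href, False


href, doctored = doctor_href("http://host/", "../foo/blah.html")
if doctored:
    print("Bad partial reference, treating as:", href)
```

Only one lead "../" is replaced, so "../../whatever" becomes "/../whatever" and still carries a lead relative symbolic element after formal resolving, while "./../whatever" is passed through untouched.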
Note that the www.alis.com server does accept /../foo.html and /foo.html requests as equivalent, i.e., as if a "map /../* /*" rule had been added to its configuration file, perhaps to "help" browsers which don't strip the lead element themselves. It's hard to imagine that the server is sending such a path in the browser's request directly to the filesystem, because, for most servers, the "root" of the "data tree" need not be a filesystem root, and this would pose a serious security problem (as would /~user/../whatever). The consequence of this "help", if that's what it is, unfortunately is that the lead relative symbolic element ends up retained in all the subsequently resolved URLs. It is thus likely to be seen by the user in some window for displaying URLs, to end up in bookmarks, etc., and in the long run to exacerbate the problem rather than "helping", IMHO.
Fote
=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 MACRIDES@SCI.WFBR.EDU         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================
Received on Monday, 28 April 1997 11:23:39 UTC