- From: Roy T. Fielding <fielding@liege.ICS.UCI.EDU>
- Date: Mon, 21 Oct 1996 21:54:55 -0700
- To: Francois Pottier <Francois.Pottier@inria.fr>
- cc: www-talk@w3.org
> I'm new to this list. I am developing a shareware link checker for the
> Macintosh called Big Brother. I have just implemented a URL parser which
> attempts to follow exactly the definition given in RFC 1808.
>
> The problem is, according to these rules, the following URL is invalid:
>
>     http://pauillac.inria.fr/~fpottier/
>
> because the tilde (~) character is not allowed in path segments. Yet
> tilde characters are very common in everyday use on the Web, so I must
> assume that RFC 1808 either contains typos or is out of date. Can anyone
> point me to a correct and precise definition of the URL syntax?

Actually, neither is the case -- tilde was a very common character in
URLs when I wrote RFC 1808, just as it was when RFC 1738 was written.
You see, there's this strange tension between "what we would like to be
a standard" and "what was actually implemented", with the latter winning
hands-down.

The tilde character was originally (long ago, by TimBL) outlawed in URLs,
since it was difficult (if not impossible) to type on some international
keyboards, and being able to transcribe a URL from a bar napkin is the
primary discriminator between good and bad characters for URLs.

Unfortunately, Rob McCool (original developer of NCSA httpd and general
webgod) didn't know that the tilde was outlawed when he implemented user
public_html directories, and chose it as the most obvious default
indicator of such. Once the cat was out of the bag, no standard could
stuff it back in.

Both RFC 1738 and RFC 1808 are now out-of-date and need to be revised,
because proposed standards need to be revised to reflect the actual
implementations that exist. However, they don't revise themselves, and
both Larry and I have been overwhelmed with other work for a long time.

In the meantime, you should note that the HTTP/1.1 spec contains a
better grammar for parsing URLs.

 ...Roy T. Fielding
    Department of Information & Computer Science    (fielding@ics.uci.edu)
    University of California, Irvine, CA 92697-3425    fax:+1(714)824-4056
    http://www.ics.uci.edu/~fielding/

Received on Tuesday, 22 October 1996 00:57:07 UTC