- From: Roy T. Fielding <fielding@avron.ics.uci.edu>
- Date: Fri, 14 Apr 1995 01:39:16 -0700
- To: paulp@cerf.net
- Cc: Multiple recipients of list <www-talk@www10.w3.org>
> I have a serious problem with the Common Logfile format, as presented at
> <URL:http://w3.org/hypertext/WWW/Daemon/User/Config/Logging.html#
> common-logfile-format>.  It indicates that the "request" portion of the
> log entry should be:
>
>     The request line exactly as it came from the client.

Yes -- that is what a log is for.

> Unfortunately with directory indexing, this means that three different
> requests all have the same semantic meaning:
>
>     GET /dirname
>     GET /dirname/
>     GET /dirname/index.html
>
> (Assuming that index.html is the dir index file, this too can vary.)  Are
> the current logfile processing programs taking this vagarity into
> account?

Yes, it is a trivial thing to do -- wwwstat has done it since v0.1.

> I intend to log
>
>     GET /dirname/index.html
>
> in all cases where index.html existed, and
>
>     GET /dirname/
>
> in all cases where it doesn't, unless somebody can provide me with a
> really good reason not to.

Reason: it would be lying -- that is not the request it got, so it
shouldn't be logging it as if it was.  For instance, I am usually
interested in cases where there are a large number of requests for

    GET /dirname

since that usually means somebody has advertized (or included as a link)
the wrong URL for that dirname.  Your scheme would prevent me from finding
those cases in the logfile.

> One of the features of the server I am
> writing will be reliable logging, so this is a little more important than
> it might sound.

In that case, don't do it -- you just introduced an unreliability.
If the server mucks with the request, I can't rely on it for maintenance
and security checks.

 ....Roy T. Fielding  Department of ICS, University of California, Irvine USA
                      <fielding@ics.uci.edu>
                      <URL:http://www.ics.uci.edu/dir/grad/Software/fielding>
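
[Illustration of the point above, that equivalent directory requests can be
merged when the log is analyzed rather than when it is written.  This is a
minimal sketch and not how wwwstat does it: the names canonical_path, tally,
INDEX_NAME and known_dirs are hypothetical, the index file is assumed to be
index.html, and the caller is assumed to supply the set of paths known to be
directories.  Raw counts are kept alongside the merged ones, so a pile of
requests for GET /dirname remains visible.]

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_LINE = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\S+) (\S+)$')

INDEX_NAME = "index.html"  # assumption: the server's directory index file


def canonical_path(path, known_dirs=frozenset()):
    """Map the equivalent forms of a directory request onto /dirname/."""
    if path.endswith("/" + INDEX_NAME):
        # /dirname/index.html -> /dirname/
        return path[: -len(INDEX_NAME)]
    if path in known_dirs:
        # /dirname -> /dirname/ (only for paths the caller says are directories)
        return path + "/"
    return path


def tally(lines, known_dirs=frozenset()):
    """Return (raw, merged) request counts; the log itself is never altered."""
    raw = Counter()
    merged = Counter()
    for line in lines:
        m = CLF_LINE.match(line.rstrip("\n"))
        if not m:
            continue  # skip lines that are not in Common Log Format
        parts = m.group(5).split()
        if len(parts) < 2:
            continue  # request line without a path
        path = parts[1]
        raw[path] += 1
        merged[canonical_path(path, known_dirs)] += 1
    return raw, merged
```

[Called as tally(open("access_log"), known_dirs={"/dirname"}), where
"access_log" is a placeholder filename, the merged counter gives per-directory
totals and the raw counter still shows which exact request lines arrived.]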
Received on Friday, 14 April 1995 04:41:25 UTC