Re: Common Log format

From: Roy T. Fielding <fielding@avron.ics.uci.edu>
Date: Fri, 14 Apr 1995 01:39:16 -0700
To: paulp@cerf.net
Cc: Multiple recipients of list <www-talk@www10.w3.org>
Message-Id: <9504140139.aa08834@paris.ics.uci.edu>
> I have a serious problem with the Common Logfile format, as presented at
> <URL:http://w3.org/hypertext/WWW/Daemon/User/Config/Logging.html#common-logfile-format>.
> It indicates that the "request" portion of the log entry should be:
>   The request line exactly as it came from the client.

Yes -- that is what a log is for.

> Unfortunately with directory indexing, this means that three different 
> requests all have the same semantic meaning:
>   GET /dirname
>   GET /dirname/
>   GET /dirname/index.html
> (Assuming that index.html is the dir index file, this too can vary.) Are 
> the current logfile processing programs taking this vagarity into 
> account?

Yes, it is a trivial thing to do -- wwwstat has done it since v0.1.
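
As a rough illustration of doing this at analysis time rather than in the
server, here is a minimal sketch in Python. It is not how wwwstat does it;
the names canonical_path, count_by_canonical_path, and known_dirs are
hypothetical, and it assumes the analyst supplies the set of directory
paths and that the index file is called index.html.

```python
import re

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_LINE = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\S+) (\S+)$')

# Assumption: the directory index file name; this varies per server.
INDEX_NAME = "index.html"


def canonical_path(path, known_dirs):
    """Map /dirname, /dirname/ and /dirname/index.html to /dirname/."""
    if path in known_dirs:
        # A directory requested without its trailing slash.
        path += "/"
    if path.endswith("/" + INDEX_NAME):
        # Drop the index file name but keep the trailing slash.
        path = path[:-len(INDEX_NAME)]
    return path


def count_by_canonical_path(lines, known_dirs):
    """Count requests per canonical path; the log itself is never modified."""
    counts = {}
    for line in lines:
        match = CLF_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in Common Log Format
        parts = match.group(5).split()
        if len(parts) < 2:
            continue  # malformed request line
        key = canonical_path(parts[1], known_dirs)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

The raw request stays in the log exactly as received; only the report
folds the three forms together.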

> I intend to log
>   GET /dirname/index.html
> in all cases where index.html existed, and
>   GET /dirname/
> in all cases where it doesn't, unless somebody can provide me with a 
> really good reason not to. 

Reason: it would be lying -- that is not the request it got, so it shouldn't
be logging it as if it was.  For instance, I am usually interested in cases
where there are a large number of requests for

    GET /dirname

since that usually means somebody has advertised (or included as a link)
the wrong URL for that dirname.  Your scheme would prevent me from finding
those cases in the logfile.
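
That check only works on unmodified request lines. A minimal sketch in
Python of what it might look like; slashless_directory_hits and known_dirs
are hypothetical names, and the set of directory paths is assumed to be
supplied by whoever runs the report.

```python
from collections import Counter


def slashless_directory_hits(request_lines, known_dirs):
    """Count GET requests that name a directory without the trailing slash.

    request_lines: raw request strings exactly as logged, e.g. "GET /dirname"
    known_dirs:    directory paths without trailing slash (an assumption)
    """
    hits = Counter()
    for request in request_lines:
        parts = request.split()
        if len(parts) >= 2 and parts[0] == "GET" and parts[1] in known_dirs:
            hits[parts[1]] += 1
    # Most frequently requested slashless directories first.
    return hits.most_common()
```

A high count for one path suggests a published link that is missing its
trailing slash.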

> One of the features of the server I am 
> writing will be reliable logging, so this is a little more important than 
> it might sound.

In that case, don't do it -- you just introduced an unreliability.
If the server mucks with the request, I can't rely on it for maintenance
and security checks.

 ....Roy T. Fielding  Department of ICS, University of California, Irvine USA
Received on Friday, 14 April 1995 04:41:25 UTC