- From: Daniel DuBois <ddubois@spyglass.com>
- Date: Thu, 25 Apr 1996 14:23:06 -0500
- To: www-logging@w3.org
I'm starting work again on the XLF portion of our next-generation server,
and I believe I told this group I would analyze the numerous
prefix-identifier combos. Since I have to do it anyway to implement the
log configuration and the logging functionality of the server, here goes.
I'm only concerned with the configuration of an origin server at this point.
As such, no prefix that contains an "r" makes any sense (for an origin
server to generate in a log file, that is). Even if my origin server were
talking to a proxy, it wouldn't know it; it would just perceive the
requestor as any other client. So I won't include any combos below that are
prefixed by r-, sr- or rs-. (See Dan pass the buck. Pass, Dan, pass!)
c-ip
This would be the IP address of the client that initiates the connection.
This is easily available from whatever your Net_Accept code does.
s-ip
This I presume is the IP address that this particular connection came in on.
For HTTP/1.0 servers, this would probably be used to distinguish virtual
hosts in the log. (For the Spyglass server it wouldn't be useful since each
virtual host gets its own logfile, but to each his own. I see a #IP
field directive being more useful for our purposes, and I believe that
has been discussed?)
cs-ip
sc-ip
These probably no longer make any sense given the addition of c- and s-. It
seems likely to me that the groupings of "prefixable ids" and "nonprefixable
ids" should be broken up into "ids with direction prefixes", "ids with
position prefixes", and "nonprefixable ids".
Obviously "c-" and "s-" are positional prefixes, with the others being
"directional". The terms "positional" and "directional" are almost
certainly the worst anyone could think of, and better ones should be chosen.
c-dns
This is what we get when the server does the DNS lookup of the c-ip.
Given the current DNS logging options available in the Spyglass server, this
could be '-' all the time, '-' on non-CGI scripts, or never '-'. I expect
post-processing analysis tools would generate the values more often than
origin servers.
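To make the three behaviours concrete, here is a rough Python sketch of how a
server might fill in c-dns. The mode names ("never", "cgi-only", "always"),
the function name, and the rule that a failed lookup is logged as '-' are all
made up for illustration; they are not the actual Spyglass configuration
options.

```python
import socket


def _reverse_lookup(ip):
    """Default resolver: reverse DNS lookup of the client address."""
    return socket.gethostbyaddr(ip)[0]


def c_dns(c_ip, mode, is_cgi=False, resolve=_reverse_lookup):
    """Return the value to log for c-dns.

    mode is a hypothetical setting:
      "never"    -> always log '-'
      "cgi-only" -> look up only for CGI requests, '-' otherwise
      "always"   -> look up for every request
    """
    if mode == "never" or (mode == "cgi-only" and not is_cgi):
        return "-"
    try:
        return resolve(c_ip)
    except OSError:
        # Assumption: a failed lookup is logged as '-' rather than the IP.
        return "-"
```

The resolver is a parameter only so that the function can be exercised
without touching the network.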
s-dns
This one I'm not so sure about. It could be just the virtual host name
the origin server internally associates with the s-ip from above, or it
could be the contents of the Host: header when 1.1 becomes prevalent. There
is the issue of what we log in the case where those two things are not
the same. (There will be a transitional state where people will keep
multiple IP addresses per machine, using Host: when available, and
checking the IP address the request came in on when it's not. In that case
the request may have come in over the 'main' IP address, but have a Host:
that would have indicated the 7th IP address.) Does it have to be a FQDN
or can it be a plain hostname like "www" which obviously means something
different if looked at from outside the scope of the log-generating machine?
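A small Python sketch of the two candidate values for s-dns, and of the
conflict case described above. It does not decide which one should be logged;
it only shows where the two can disagree. The function name, the vhost_by_ip
table, and the example addresses and host names are all hypothetical.

```python
def s_dns_candidates(s_ip, host_header, vhost_by_ip):
    """Return (name_from_ip, name_from_host, conflict).

    name_from_ip   : virtual host name the server associates with s-ip
    name_from_host : name taken from the Host: header, if one was sent
    conflict       : True when both are known and they disagree
    """
    name_from_ip = vhost_by_ip.get(s_ip)
    name_from_host = None
    if host_header:
        # Drop an optional ":port" suffix and normalise case.
        name_from_host = host_header.split(":", 1)[0].strip().lower()
    conflict = (
        name_from_ip is not None
        and name_from_host is not None
        and name_from_ip != name_from_host
    )
    return name_from_ip, name_from_host, conflict


# Hypothetical table: the 'main' address and the 7th address of one machine.
vhosts = {"192.0.2.1": "www.main.example", "192.0.2.7": "www.seventh.example"}

# Request arrives on the main address but names the 7th host in Host:.
print(s_dns_candidates("192.0.2.1", "www.seventh.example", vhosts))
```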
cs-dns
sc-dns
These are some more that probably no longer make any sense given the
addition of c- and s-.
sc-status
The return code of the HTTP response, like 200 OK. Simple enough.
c-status
I don't know what this is supposed to mean.
s-status
I don't know what this is supposed to mean, unless it's redundant with
sc-status, in which case it should be an id that takes "directional prefixes".
cs-status
I don't know what this is supposed to mean.
c-comment
No clue. How does a client make a comment that a server can log?
s-comment
I guess this would be something a server specific application would use, or
maybe request-based errors, like "This request failed authentication" or
"This request had a network read failure before the full request entity
body was read." I probably wouldn't generate it, but that's me.
cs-comment
No clue. How does a client make a comment that a server can log?
sc-comment
I don't know what this is supposed to mean.
cs-method
Obviously POST, GET, etc.
c-method
s-method
sc-method
I don't think any of these make any sense, except maybe c-method, but even
that is probably inappropriate because of its redundancy.
cs-uri
The request URI. Could be an absolute URI if your server accepts those.
Basically anything that occurs after the first LWS after the method, until
the next LWS.
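As a rough illustration of that rule, here is a Python sketch that pulls
cs-method and cs-uri out of a request line. LWS is approximated as runs of
spaces and tabs, and logging '-' for a missing part is my assumption, not
something the draft states; the function name is made up.

```python
import re

# Approximation of LWS: one or more spaces or tabs.
_LWS = re.compile(r"[ \t]+")


def split_request_line(request_line):
    """Return (cs_method, cs_uri) for a line like 'GET /index.html HTTP/1.0'.

    The URI is whatever sits between the first run of LWS after the method
    and the next run of LWS. Missing parts come back as '-' (an assumption).
    """
    parts = _LWS.split(request_line.strip(), maxsplit=2)
    method = parts[0] if parts[0] else "-"
    uri = parts[1] if len(parts) > 1 else "-"
    return method, uri


print(split_request_line("GET /products/phones.txt HTTP/1.0"))
print(split_request_line("GET http://www.joe.com/products/phones.txt HTTP/1.0"))
```

The second call shows the absolute-URI case: the whole absolute URI is
returned unchanged as cs-uri.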
c-uri
Redundant?
s-uri
I could conceive of this being meaningful if used to indicate some internal
URI translation done by the server. For instance, if, to handle virtual
hosting, a server translates www.joe.com/products/phones.txt to
/joe/products/phones.txt. But even that's a stretch, and that's more of a
URI to internal file system mapping issue than anything. So this is
probably not sensible.
sc-uri
I don't know what this would be used for, unless perhaps you use this
field to record Location: redirection responses, or maybe Content-Location:
in content-negotiated responses.
These last two provoke the question: If we have prefix-id combos that could
conceivably mean something to someone, and be useful to them, is it a
requirement that the appropriate use of these IDs be described in the draft
standard so that log files are understandable by all, not just those
individuals/groups who generated a particular file with the questionable
prefix-id combo? If we are trying to standardize on IDs, I would think that
requires those ids to have standard meanings.
cs-uri-stem
Same as cs-uri minus the first '?' and everything after it (if there is a
'?'). [I could be wrong about minus the '?', the draft doesn't really say
that].
c-uri-stem
Redundant?
s-uri-stem
Same issues as s-uri. Same dropping the '?' + trailer as cs-uri-stem.
sc-uri-stem
Same issues as sc-uri. Same dropping the '?' + trailer as cs-uri-stem.
cs-uri-query
Same as cs-uri minus everything up to and including the first '?'. If there
is no '?' in the URL, or if there is nothing after the '?', I believe this
entry would be '-'.
c-uri-query
Redundant?
s-uri-query
Same issues as s-uri. Same dropping the stem + '?' as cs-uri-query.
sc-uri-query
Same issues as sc-uri. Same dropping the stem + '?' as cs-uri-query.
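The uri-stem and uri-query rules above can be sketched in a few lines of
Python. This follows my reading (the '?' itself belongs to neither field, and
an absent or empty query is logged as '-'), which, as noted, the draft doesn't
really confirm. The function name is made up.

```python
def split_uri(uri):
    """Split a URI into (uri_stem, uri_query) at the first '?'.

    Assumptions: the '?' is dropped from both halves, and a missing or
    empty query is logged as '-'.
    """
    stem, _sep, query = uri.partition("?")
    return stem, (query if query else "-")


print(split_uri("/cgi-bin/search?q=phones"))  # stem and query both present
print(split_uri("/products/phones.txt"))      # no '?', so the query is '-'
print(split_uri("/products/phones.txt?"))     # nothing after '?', also '-'
```

Only the first '?' is significant; any later '?' stays inside the query.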
So to sum up: If I, as an origin server configuration file parser, see prefix
1) "c"
I expect to see next: ip or dns. Anything else I would flag as an
error or an unsupported feature.
2) "cs"
I expect to see next: (requestheader), method, uri, uri-stem,
uri-query. Anything else I would flag as an error or an unsupported
feature.
3) "s"
I expect to see next: ip, dns, or comment. Anything else I
would flag as an error or an unsupported feature.
4) "sc"
I expect to see next: (responseheader), status, or possibly uri,
uri-stem, or uri-query. Anything else I would flag as an error or an
unsupported feature.
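For what it's worth, the check in that summary could look roughly like the
Python sketch below. It only covers prefixed fields, it assumes header fields
are written as prefix(Header-Name), and the table and function names are made
up for illustration. It is not how our server actually does it.

```python
import re

# Identifiers accepted after each prefix, per the summary above.
ALLOWED_IDS = {
    "c": {"ip", "dns"},
    "cs": {"method", "uri", "uri-stem", "uri-query"},
    "s": {"ip", "dns", "comment"},
    "sc": {"status", "uri", "uri-stem", "uri-query"},
}

# Prefixes that may carry a header name: cs(...) for request headers,
# sc(...) for response headers. The parenthesised syntax is an assumption.
HEADER_PREFIXES = {"cs", "sc"}

_HEADER_FIELD = re.compile(r"^([a-z]+)\(([^()]+)\)$")
_PREFIXED_FIELD = re.compile(r"^([a-z]+)-(.+)$")


def is_supported(field):
    """Return True if an origin server would accept this prefixed field."""
    match = _HEADER_FIELD.match(field)
    if match:
        return match.group(1) in HEADER_PREFIXES
    match = _PREFIXED_FIELD.match(field)
    if not match:
        return False
    prefix, ident = match.groups()
    # Unknown prefixes (including r-, sr-, rs-) fall through to False.
    return ident in ALLOWED_IDS.get(prefix, set())


for name in ["c-ip", "cs-uri-stem", "sc(Content-Type)", "c-status", "rs-ip"]:
    print(name, is_supported(name))
```

A real parser would presumably tell "error" apart from "unsupported feature";
the sketch lumps both into False.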
Does anyone think I've shortchanged any of the combos above? Does anyone
disagree that the prefix-id combos should have standardized meanings to be
useful and present in the draft? (IMO, they should be x-foo if they mean
different things to different organizations.)
I've neglected the possible addition of "authname", which would be the
username in basic or digest authentication, similar to the entry in the
CLFF. I would expect only one of "c-authname" or "cs-authname" to be used.
-----
the Programmer formerly known as Dan
http://www.spyglass.com/~ddubois/
Received on Thursday, 25 April 1996 15:24:49 UTC