
http://www.w3.org/pub/WWW/TR/WD-logfile.html



Not much being said in this group, eh?  Well, hopefully this will pick it
up.  I have a number of issues/questions regarding the Extended Log File
Format draft that I'd like to share.

Firstly, I think there should be another identifier for the authentication
name (like the one present in the CLFF).  (I'm surprised there haven't been
more requests along this line.  Other than the geographic business being
talked about earlier, has no one requested additional identifiers?)

Secondly, I think there needs to be more explicit language about what each
identifier means when prefixed, and about what it means when un-prefixed.
Adding to the complexity, it would seem that different prefix-identifier
combos mean slightly different things depending on whether one is viewing a
log file for an origin server or for a proxy server.

For instance, what does "sc-dns" mean on an origin server?  Is that supposed
to be what the server thinks its DNS name is?  Is it supposed to be the
hostname the client requested (via Host: or full URL or...)?  Is that
supposed to be what the server thinks the client thinks the server's DNS
name is?  Is it nonsensical given the addition of the "s-" and "c-" prefixes?

I'd like to see a complete list of all possible prefix-identifier
combinations that make sense, and explanations for the less obvious ones.
It seems quite dubious to me that every possible identifier that needs or
can take a prefix necessarily can take ANY prefix.  Obviously, there would
be two different "explanation" lists, one for origin servers, one for proxy
servers.

I'd also like to see the language "The following identifiers do not require
a prefix" changed to "The following identifiers cannot have a prefix" to
resolve any ambiguity.  What would a c-date mean?  What would a cs-date mean?

My frame of mind comes from this.  We'll be implementing the Extended Log
File Format functionality in our next release of the Spyglass Server, and
the configuration of the server's logging output will be ELFF fields
embedded in our config file, like so:

[ConnectionsLogs]
standard.txt     CLF     { clf }
extended.txt     XLF     { time cs-method cs-uri }
useragnt.txt     CLF     { clfdate cs(user-agent) }

(You can see, we're going to be internally adding two identifiers to the mix
called clf and clfdate that are only appropriate for CLF style logs.  CLF
style logs will not have #directives either.  That's an internal
configuration issue - I don't think the group or the draft needs to be aware
of this kind of thing.)
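
To make the layout above concrete, here is a minimal sketch of how one such
line could be split into its parts.  The function name and the regular
expression are hypothetical, and the "file name, log style, braced field
list" grammar is only inferred from the three sample lines; it is not the
actual Spyglass Server parser.

```python
import re

# Hypothetical pattern for one line of the [ConnectionsLogs] section:
#   <file name> <log style> { <field> <field> ... }
# The grammar is inferred from the sample lines only.
LINE_RE = re.compile(r"^\s*(\S+)\s+(\S+)\s+\{\s*(.*?)\s*\}\s*$")


def parse_log_line(line):
    """Split one config line into (filename, style, list_of_fields)."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("not a log definition: %r" % line)
    filename, style, body = m.groups()
    # Fields are whitespace-separated; cs(user-agent) stays one token
    # because it contains no spaces.
    return filename, style, body.split()


print(parse_log_line("extended.txt     XLF     { time cs-method cs-uri }"))
```

Each field that comes out of the list still has to be checked before the
server agrees to log it.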

So as you can see, it's not just an issue of me, as an origin server, only
spitting out things I understand.  I also have to validate the data fields my
users attempt to configure, so I need a meaningful way of knowing when to
draw the line and say "Hey, that field doesn't make any sense to me".  And
right now, there are a lot of possible prefix-identifier combos that don't
make sense to me.
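
The kind of check described above might look roughly like the sketch below.
The prefix and identifier sets are as I read them in the draft.  The
ORIGIN_ALLOWED table is purely an assumption made up for illustration (it is
exactly the list the draft does not yet give), as are the function name and
the decision to accept any x- field.  The internal clf and clfdate
identifiers are left out.

```python
import re

# Prefixes and identifiers as read from the draft.
PREFIXES = {"c", "s", "r", "cs", "sc", "sr", "rs", "x"}
NO_PREFIX_IDS = {"date", "time", "time-taken", "bytes", "cached"}
PREFIX_IDS = {"ip", "dns", "status", "comment",
              "method", "uri", "uri-stem", "uri-query"}

# HYPOTHETICAL: combinations an origin server would accept.  This table is
# an illustrative guess, not something the draft defines.
ORIGIN_ALLOWED = {
    ("c", "ip"), ("c", "dns"), ("s", "ip"), ("s", "dns"),
    ("cs", "method"), ("cs", "uri"), ("cs", "uri-stem"), ("cs", "uri-query"),
    ("sc", "status"), ("sc", "comment"),
}

# Header form, e.g. cs(user-agent).
HEADER_RE = re.compile(r"^([a-z]+)\(([^()]+)\)$")


def check_field(field):
    """Return None if the field is acceptable, else a reason string."""
    if field in NO_PREFIX_IDS:
        return None
    m = HEADER_RE.match(field)
    if m:
        # Assumption: an origin server only sees cs and sc headers.
        if m.group(1) in ("cs", "sc"):
            return None
        return "header prefix not meaningful on an origin server"
    prefix, sep, ident = field.partition("-")
    if not sep or prefix not in PREFIXES:
        return "unknown prefix or identifier"
    if prefix == "x":
        return None  # application-specific; accepted as-is in this sketch
    if ident in NO_PREFIX_IDS:
        return "identifier does not take a prefix"
    if ident not in PREFIX_IDS:
        return "unknown identifier"
    if (prefix, ident) not in ORIGIN_ALLOWED:
        return "combination not meaningful on an origin server"
    return None


for f in ("cs-uri", "time-taken", "cs(user-agent)", "c-date", "sc-dns"):
    print(f, "->", check_field(f) or "ok")
```

Everything interesting lives in the ORIGIN_ALLOWED table, which is why an
agreed list of sensible combinations matters more than the code around it.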

If it will please anyone, I will volunteer to make a complete list of what I
think is sensible/allowable for an origin server, and then you can all call
me names for leaving the foo-bar combo off, and then proceed to explain its
function to me.  I think the exercise, while boring, may shed a little light
on how to better define the identifiers.

PS: Shouldn't the comment identifier be of type <string>?
PPS: The definition of the <name> type says it's for DNS names, but the
method identifier is also of type <name>.
PPPS: I don't know if it's appropriate to say uri-query is of type <uri>.


-----
the Programmer formerly known as Dan          
                                     http://www.spyglass.com/~ddubois/

