- From: N.G.Smith <ngs@sesame.hensa.ac.uk>
- Date: Thu, 16 May 1996 13:51:20 +0100
- To: www-logging@w3.org
What are people's thoughts on standardising on a binary version of the Extended Log File Format?

We perform a number of analyses on our log files and at 300MB a day this would not be feasible unless we converted them to a binary format. This conversion takes about 4 CPU hours each day. Having our server dump a binary file in the first place would seem sensible.

The binary files that we produce have a number of advantages:

- The files are smaller
  - Some fields can be enumerated types
  - IP addresses are just 4 bytes
  - Repeated strings are held in a separate strings table
- They are faster to access
  - Binary data does not have to go through a conversion process
  - Timestamps in the file allow you to pin-point records
  - Searching, sorting and collating are orders of magnitude faster

The biggest disadvantage is the more complex logging procedure, but we already have code to do that. Of course, they are not human-readable either, but then who wants to read 300MB of log file each day?

Other sites must have similar problems with big logs, and although, as a cache, I don't anticipate that passing round compressed ASCII log extracts will be a big problem, a binary standard would ensure that munging tools remain interoperable.

Thoughts?

Neil.
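
For illustration only, here is a minimal Python sketch of the kind of record the message describes: an enumerated method field, a 4-byte IP address, and a separate strings table for repeated strings. The field layout, the `RECORD` format, and the names `StringTable`, `pack_record` and `unpack_record` are all hypothetical; the message does not specify the actual format in use, and none of this is part of any standard.

```python
import socket
import struct

# Hypothetical fixed-size record layout (an assumption, not a standard):
#   uint32 timestamp, 4-byte IPv4 address, uint8 method enum,
#   uint16 status, uint32 bytes sent, uint32 index into the strings table.
# "!" selects network byte order with no padding.
RECORD = struct.Struct("!I4sBHII")

# Enumerated type: the method is stored as a one-byte index into this list.
METHODS = ["GET", "HEAD", "POST"]


class StringTable:
    """Stores each distinct string once and hands out integer indices."""

    def __init__(self):
        self.strings = []
        self._index = {}

    def intern(self, s):
        # Reuse the existing index if the string has been seen before.
        if s not in self._index:
            self._index[s] = len(self.strings)
            self.strings.append(s)
        return self._index[s]


def pack_record(table, timestamp, ip, method, status, size, url):
    """Encode one log entry as a fixed-size binary record."""
    return RECORD.pack(
        timestamp,
        socket.inet_aton(ip),    # dotted quad -> 4 bytes
        METHODS.index(method),   # method name -> enum value
        status,
        size,
        table.intern(url),       # URL -> strings-table index
    )


def unpack_record(table, data):
    """Decode a binary record back into readable field values."""
    timestamp, ip, method, status, size, url_idx = RECORD.unpack(data)
    return (
        timestamp,
        socket.inet_ntoa(ip),
        METHODS[method],
        status,
        size,
        table.strings[url_idx],
    )
```

Because every record in this sketch has the same size (`RECORD.size` bytes), a reader could seek directly to the n-th record, and if records are written in timestamp order a binary search on the timestamp field becomes possible. The strings table itself would still need to be written out alongside the records, which the sketch leaves out.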
Received on Thursday, 16 May 1996 08:53:20 UTC