Date: Thu, 22 Apr 1999 23:12:26 PDT
To: "Brodsky, Lloyd" <Lloyd.Brodsky@thomson.com>, "'email@example.com'" <firstname.lastname@example.org>
From: Jim Pitkow <email@example.com>
Message-Id: <99Apr22.231241pdt."364935"@louise.parc.xerox.com>
Subject: RE: Just what is a certified log? and who certifies it?

Hi Lloyd,

At 06:52 AM 4/22/99 , Brodsky, Lloyd wrote:
>Different goals certainly call for different methods -- but I'm still having
>some problems visualizing what the repository's certification process would
>consist of and what potential problems that process would avoid. The notion
>of the RECIPIENT being able to certify anything other than its own identity
>and mere receipt of a log file is an interesting one and a concept that I'd
>like to hear more about.

Ah, we may be using the terms certify and validate loosely. The goal is to provide a repository of logs in which each log is (1) described with a set of meta-data that captures information about the log and the nature of its contents, (2) validated against what its data is supposed to be (data type checking, etc.), and (3) described statistically by extracting the distributions of various metrics. The final entries and meta-data are inspected by editors for correctness. By providing these processes and checks, we hope to create a forum in which results from various logs can be validated by other researchers, and which facilitates new research (diverse log files are a precious commodity).

>I've been doing traffic analysis work with a number of Thomson companies and
>I'd like to help. I'm just trying to visualize what you'd do to certify,
>say, the Toronto Globe & Mail's 3.5 gig of extended format logs a week (the
>largest of the 34 Thomson newspapers) were I to send them over.

This could provide a nice test case for the repository (as well as provide a very interesting set of data to the research community).
The basic process is described above, but the developers of the repository are currently working on a white paper that describes the validation/certification process in more detail, and will post it to the list soon.

What tools do you currently use to analyze the traces? What do you find are the limitations of the current tools?

Jim.
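
To make the type-checking and statistical-description steps described above a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the repository's actual tooling: the field names (loosely modeled on extended log format), the SCHEMA table, and the validate_log function are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical schema: field name -> parser that raises ValueError on bad data.
# The field names are illustrative, not taken from any real repository spec.
SCHEMA = {
    "date": lambda s: datetime.strptime(s, "%Y-%m-%d"),
    "time": lambda s: datetime.strptime(s, "%H:%M:%S"),
    "cs-uri": str,
    "sc-status": int,
    "bytes": int,
}


def validate_log(lines, schema=SCHEMA):
    """Type-check each record and collect simple distributions."""
    fields = list(schema)
    errors = []               # (line number, reason) for rejected records
    status_counts = Counter() # distribution of response status codes
    byte_sizes = []           # response sizes of valid records

    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue          # skip blank lines and '#' directive lines
        parts = line.split()
        if len(parts) != len(fields):
            errors.append((lineno, "wrong field count"))
            continue
        try:
            record = {name: schema[name](value)
                      for name, value in zip(fields, parts)}
        except ValueError:
            errors.append((lineno, "type check failed"))
            continue
        status_counts[record["sc-status"]] += 1
        byte_sizes.append(record["bytes"])

    return {
        "valid_records": len(byte_sizes),
        "errors": errors,
        "status_distribution": dict(status_counts),
        "mean_bytes": sum(byte_sizes) / len(byte_sizes) if byte_sizes else None,
    }


if __name__ == "__main__":
    sample = [
        "#Fields: date time cs-uri sc-status bytes",
        "1999-04-22 06:52:00 /index.html 200 1000",
        "1999-04-22 06:52:02 /bad.html OK 10",
    ]
    print(validate_log(sample))
```

The summary returned here (counts of valid and rejected records, a status-code distribution, a mean response size) is the sort of material that could accompany a log's meta-data for editors to inspect; a real repository would presumably track many more metrics.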