Re: user agent label format

Based on many years of writing all sorts of convoluted, weird Internet-based
stuff, I have come up with the attached C solution to the problem. Be
warned: it is not 100% accurate, but it works in most cases.

I basically use this to process the User-Agent string, then assign each
unique string an ID and store it in a database for later use in apps ranging
from web stats to serving different content to different browsers.
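
To illustrate the "assign each unique string an ID" step, here is a minimal
sketch. It is not the attached code: the names ua_lookup_id, ua_table,
MAX_AGENTS and MAX_UA_LEN are made up for this example, and a fixed in-memory
array stands in for the database.

```c
#include <assert.h>
#include <string.h>

#define MAX_AGENTS 1024   /* hypothetical table capacity */
#define MAX_UA_LEN 512    /* hypothetical maximum stored string length */

/* In-memory stand-in for the database table of known User-Agent strings. */
static char ua_table[MAX_AGENTS][MAX_UA_LEN];
static int  ua_count = 0;

/*
 * Return the 1-based ID for a User-Agent string, adding it to the table
 * the first time it is seen. Returns 0 if the string is too long to
 * store or the table is full.
 */
int ua_lookup_id(const char *ua)
{
    int i;

    assert(ua != NULL);

    if (strlen(ua) >= MAX_UA_LEN)
        return 0;

    /* Linear search: fine for a sketch, too slow for a large log. */
    for (i = 0; i < ua_count; i++) {
        if (strcmp(ua_table[i], ua) == 0)
            return i + 1;
    }

    if (ua_count >= MAX_AGENTS)
        return 0;

    strcpy(ua_table[ua_count], ua);
    return ++ua_count;
}
```

Calling ua_lookup_id() twice with the same string gives the same ID back, so
the ID can be logged in place of the full string. A real setup would
presumably use a database table with a unique index instead of the linear
search.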

If you want to test it, pull a few thousand lines from a User-Agent log and
pass them through the attached function. You should then get the idea.
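
For a feel of the kind of splitting involved, here is a rough sketch. Again,
this is not the attached function, only an illustration: ua_split and
struct ua_parts are hypothetical names, and it assumes the common
"product/version (comment)" layout, which plenty of real agents do not follow.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result structure; the field sizes are arbitrary. */
struct ua_parts {
    char product[64];    /* e.g. "Mozilla" */
    char version[32];    /* e.g. "4.0", empty if there is no '/' */
    char comment[256];   /* text inside the first (...) group, if any */
};

/* Copy at most n bytes of src into dst, truncating to fit and terminating. */
static void copy_span(char *dst, size_t dstsize, const char *src, size_t n)
{
    if (n >= dstsize)
        n = dstsize - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/*
 * Split a User-Agent string into its leading product token and the first
 * parenthesised comment. Returns 1 if a product token was found, else 0.
 * Nested comments are not handled: the comment stops at the first ')'.
 */
int ua_split(const char *ua, struct ua_parts *out)
{
    const char *p, *end, *slash, *open, *close;

    assert(ua != NULL && out != NULL);
    out->product[0] = out->version[0] = out->comment[0] = '\0';

    /* Skip leading whitespace. */
    p = ua;
    while (*p == ' ' || *p == '\t')
        p++;

    /* The first token runs up to whitespace or an opening parenthesis. */
    end = p + strcspn(p, " \t(");
    if (end == p)
        return 0;

    slash = memchr(p, '/', (size_t)(end - p));
    if (slash != NULL) {
        copy_span(out->product, sizeof out->product, p, (size_t)(slash - p));
        copy_span(out->version, sizeof out->version, slash + 1,
                  (size_t)(end - slash - 1));
    } else {
        copy_span(out->product, sizeof out->product, p, (size_t)(end - p));
    }

    /* Grab the first comment, where most of the interesting detail lives. */
    open = strchr(end, '(');
    if (open != NULL) {
        close = strchr(open, ')');
        if (close != NULL)
            copy_span(out->comment, sizeof out->comment, open + 1,
                      (size_t)(close - open - 1));
    }
    return 1;
}
```

Feeding log lines through something like this (after stripping the trailing
newline) shows quickly how many agents fit the pattern and how many need
special handling.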

Regards,

ap.



----- Original Message -----
From: Koen Holtman <koen@win.tue.nl>
To: Dan Connolly <connolly@w3.org>
Cc: <dberansky@ucsd.edu>; <www-talk@w3.org>
Sent: Wednesday, September 15, 1999 2:27 AM
Subject: Re: user agent label format


> Dan Connolly:
> >
> >Dmitry Beransky wrote:
> >>
> >> It seems that despite the fact that most agents (at least the ones I
> >> pulled off my server logs) follow the format specified in the HTTP
> >> spec, it's still impossible to determine with 100% accuracy the
> >> agent's make, model, os platform, etc., since most of this info is
> >> stored in an unstructured comment field.  Is there any work done on
> >> trying to further standardize the agent data?
> >
> >Not that I'm aware of.
>
> Some additional observations: the HTTP spec implies that the
> User-Agent field is free-form, and in at least one browser (Lynx), the
> end users can set this field to anything they want.
>
> This implies to me that developing a standard format for the User-Agent
> field is a lost cause.  The best hope with respect to reliable browser
> identification is the wide deployment of one of the content negotiation
> mechanisms referenced elsewhere in this thread.
>
> >Dan Connolly, W3C
>
> Koen.
>

Received on Tuesday, 14 September 1999 15:19:20 UTC