Re: Deployed use of HOST Header?

Dave Morris writes:
    Is anyone aware of any effort to derive a list of client software
    packages which include the "Host:" header with requests? It would
    be really helpful for my efforts to push server usage away from
    discrete IP addresses to be able to speak with some degree of
    authority.

    It would be nice to know what percentage of requests to major net
    destinations like Yahoo, Altavista, Infoseek, Excite, Netscape, IBM,
    Microsoft, etc. now include the HOST: header.
    
I thought about asking the folks at AltaVista for this info, but then I
realized that it would probably require reprogramming their server, and
this is something they like to avoid.

But by a strange coincidence, one of the other researchers here started
logging all of the HTTP headers passing through our proxy in Palo Alto
just yesterday.  So I put together an AWK script to figure out which
User-Agents send Host.

By the way, we're not going to release these logs, no matter how nicely
people ask ... so please don't ask.  And also don't ask me what
fraction of requests were HTTP/1.1; the logs apparently don't include
that tidbit.

Out of 1963154 requests logged yesterday (May 18), 1946195 (99.1%) had
User-Agent headers.  For these requests, I looked for a Host header
(but without checking on its syntactic or semantic validity!), and then
made a list of User-Agent values associated with requests with or
without Host.

In some cases, the same User-Agent was seen with and without the Host
header.  I'm not really keen on doing the analysis to figure out what
really happened in all cases, but I looked at one (more or less random)
example; it looks like when that particular browser is invoked by
Quicken, it changes the request headers fairly significantly.  I
suspect that in other cases, some intervening proxy (we have several
layers of internal proxies within Digital) either added or removed Host
headers.

In any case, this means that these results should be taken with a large
dose of skepticism, since it appears that one cannot simply assume that
use of a given User-Agent will always result in the delivery of a Host
header to the origin server.

For these lists, I used only the first "word" (whitespace-delimited
string of characters) in the User-Agent header.  I tried analyzing a
subset of the log using the entire User-Agent header; it doesn't seem
to add much information, but it slows things down a lot.
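
The grouping described above can be sketched roughly as follows.  This
is not the AWK script used for the analysis; it is a minimal Python
sketch that assumes each logged request is available as a dict mapping
header names to values (the real log format is not described here).
The names first_word and classify are hypothetical.

```python
from collections import defaultdict


def first_word(user_agent):
    """Return the first whitespace-delimited word of a User-Agent value."""
    parts = user_agent.split()
    return parts[0] if parts else ""


def classify(requests):
    """Group User-Agent keys by how often a Host header accompanied them.

    `requests` is an assumed input shape: an iterable of dicts mapping
    header names to values. Requests without User-Agent are skipped, and
    Host is only checked for presence, not for validity.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [without Host, with Host]
    for headers in requests:
        # Header names are case-insensitive, so normalize before lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        if "user-agent" not in lowered:
            continue
        key = first_word(lowered["user-agent"])
        index = 1 if "host" in lowered else 0
        counts[key][index] += 1

    never = sorted(k for k, c in counts.items() if c[1] == 0)
    always = sorted(k for k, c in counts.items() if c[0] == 0)
    sometimes = sorted(k for k, c in counts.items() if c[0] and c[1])
    return never, always, sometimes
```

Keying on the first word keeps the number of distinct keys small, at
the cost of lumping together variants that differ only in the rest of
the header.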

Disclaimer: nothing here is meant as a criticism of any User-Agent
implementation, especially since my analysis could be erroneous.

-Jeff

User-Agents that were never seen with a Host header:

    0101500608win16001
    0101600719win16001
    0101600719win16014
    0101600719win16042
    0101600720win16001
    0102001290win32001
    AVSMCPX
    Crescent
    Enhanced_Mosaic
    FFiNet32.DLL/3.1
    Go-Ahead-Got-It/1.1
    Lotus
    Lynx/2.3
    Lynx/2.3-FM
    Lynx/2.3.6
    Marimba
    Mozilla/1.12I
    Mozilla/1.2
    Mozilla/1.22
    Mozilla/3.0b9Gold
    NCSA
    NSPlayer/2.0
    PCNviewer
    Proxy
    QFNApp/1.0
    TSTPAV
    Tuner/1.1.1
    Tuner/2.0.2
    Update
    VXtreme,
    Visto-Assistant/Commercial-Release-2.0
    WebCopy/0.98b7
    www.pl/961205.24

User-Agents that were always seen with a Host header:

    0102001735win32090
    0102001737win32001
    0102001737win32011
    0102001737win32024
    0102001790win32001
    0102001790win32051
    0102001790win32090
    0102001792win32001
    0102001792win32015
    0102001792win32024
    0102001792win32042
    0102001792win32043
    0102001792win32073
    0102502226win32001
    @%146%01L%146%01%7c%146%01%94%146%01
    Alexa
    Alexa/1.1.4.0%3bMicrosoft
    AlphaCONNECT
    BW-C-2.0
    CSymWebPage
    Caching-Manager/2.1
    Conveyer
    DMS-NetLink-GetLink
    HotJava/1.1
    InstallShield
    Investor
    Java1.0.2
    Java1.1
    Java1.1.4
    LiveUpdate
    Lotus-Notes/4.5
    Lynx
    Lynx/2.5FM
    Lynx/2.6
    Lynx/2.7.1
    Lynx/2.7.2
    Lynx/2.8rel.2
    MFC_Tear_Sample
    MSFrontPage/2.0
    MSFrontPageWpp/3.0
    MSInvestor
    MSN
    MSNBC-News-Alert/2.2
    MSNBC-News-Browser-IE/2.1
    MSWebPostPostInfoProcessor/1.5
    Mozilla/1.1
    Mozilla/2.01Gold
    Mozilla/3.01-C-MACOS8
    Mozilla/3.01C-DH397
    Mozilla/3.01C-KIT
    Mozilla/3.01C-WorldNet
    Mozilla/3.03
    Mozilla/3.04GoldC
    Mozilla/3.0C-NC320
    Mozilla/3.0C-WorldNet
    Mozilla/4.04j2
    NeoPlanet
    NetAttache/2.5
    Net_Vampire/2.4
    OilChange
    PhotoImpact
    PrimeNet3Win32
    RealPlayer
    Registration
    Scooter/1.1
    ServerComm
    SunLab's
    Teleport
    UPDATEIT
    WebTrends/3.0
    WebZIP/2.0
    Wget/1.4.5
    contype
    d%f0D
    libwww-perl/5.20
    xmcd/v2.2PL1
    
User-Agents that were sometimes, but not always, seen with a Host header:

    GetRight/3.1
    Microsoft
    Mozilla/2.0
    Mozilla/2.01
    Mozilla/2.02
    Mozilla/2.02Gold
    Mozilla/3.0
    Mozilla/3.01
    Mozilla/3.01C-POLNET
    Mozilla/3.01C-SI304A01
    Mozilla/3.01Gold
    Mozilla/3.02
    Mozilla/3.02Gold
    Mozilla/3.03Gold
    Mozilla/3.04
    Mozilla/3.04Gold
    Mozilla/3.0Gold
    Mozilla/4.0
    Mozilla/4.01
    Mozilla/4.02
    Mozilla/4.03
    Mozilla/4.04
    Mozilla/4.05
    NSPlayer/3.0.0.2437
    Tuner/2.1.2

Received on Tuesday, 19 May 1998 12:07:10 UTC