[Bug 15380] New: Define a User-Agent string format subset (liaison with HTTP people etc)

From: <bugzilla@jessica.w3.org>
Date: Mon, 02 Jan 2012 07:38:21 +0000
To: public-html@w3.org
Message-ID: <bug-15380-2495@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15380

           Summary: Define a User-Agent string format subset (liaison with
                    HTTP people etc)
           Product: HTML WG
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec (editor: Ian Hickson)
        AssignedTo: ian@hixie.ch
        ReportedBy: xn--mlform-iua@xn--mlform-iua.no
         QAContact: public-html-bugzilla@w3.org
                CC: bzbarsky@mit.edu, hsivonen@iki.fi, mike@w3.org,
                    public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org, julian.reschke@gmx.de,
                    xn--mlform-iua@xn--mlform-iua.no, annevk@opera.com,
                    adrianba@microsoft.com, Ms2ger@gmail.com,
                    VYV03354@nifty.ne.jp, dbaron@dbaron.org,
                    tross@microsoft.com
        Depends on: 15359


PROPOSAL:
* The goal of HTML5 is to create an open specification of the Web, so
  that a new browser vendor possibly could use the spec in order to
  create a new browser that would be compatible with the Web.
* To that end, it has proven necessary to define a User-Agent
  string subset format, in order to keep everyone aware of the issues
  and problems that UAs can cause.
* Important goals: avoiding fingerprinting and unconscious UA sniffing
* Web sites which do UA sniffing and which support HTML5 could be
  asked to treat any browser which uses the new format as compatible.
  (Example: As a frequent user of the WebKit-based browser iCab, I too
  often see messages telling me to "upgrade" my browser, even though I
  use a browser based on the same WebKit engine as Safari. This happens
  on Facebook, Google AdWords on MobileMe etc.)

A USE CASE/EXAMPLE:

(1) Opera recently changed what they do with malformed XML (they started to
"fall back" from XML to HTML parsing - rather than displaying XML fatal errors
- if the server delivers the document with the HTTP Content-Type header
"application/xhtml+xml").

(2) However, the real issue in Opera's case is user-agent string sniffing: ASP
servers that are deployed with a list of User-Agent strings which does not
include the User-Agent string of Opera. As a result, UAs not on the list
get the page served as 'application/xhtml+xml' rather than as 'text/html',
with the result that the text/html-authored page triggers an XML fatal error.
An idiotic case. But Opera of course needs to handle it.

(3) It is important to note that Opera's user agent string differs greatly from
the strings of UAs based on WebKit, Trident and Gecko:
 * It doesn't include the 'Mozilla/0.0' token anywhere;
 * The first token differs from those of the others;
 * Less controversially, but still an issue: it does not
   include the strings 'msie 0.0', 'firefox/0+' or
   'safari/0+' anywhere. (I outlined these issues here:
   http://lists.w3.org/Archives/Public/www-tag/2012Jan/0003.html.)


When looking at a list of 10,000 UA strings, it becomes clear that there are
many browsers other than Opera that could justify implementing the same XML
error handling as Opera has implemented - as only a little over 1000 strings
fulfil the requirements I listed above. See:
http://www.useragentstring.com/pages/All/
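
To make the requirements from (3) concrete, here is a minimal sketch of how
one might check a UA string against them. The function names and the exact
regular expressions are my own assumptions (reading '0' as a digit and '0+'
as one or more digits); they are not taken from any actual sniffing code.

```python
import re

# Assumed reading of the tokens listed in (3): '0' is one digit,
# '0+' is one or more digits. Matching of the browser names is
# case-insensitive, since the list spells them in lower case.
MOZILLA_TOKEN = re.compile(r"Mozilla/\d\.\d")
LEGACY_NAME = re.compile(r"msie \d\.\d|firefox/\d+|safari/\d+", re.IGNORECASE)


def fulfils_requirements(ua: str) -> bool:
    """True if the string has a Mozilla token and a legacy browser name."""
    return bool(MOZILLA_TOKEN.search(ua)) and bool(LEGACY_NAME.search(ua))


def count_fulfilling(ua_strings) -> int:
    """Count how many strings in an iterable fulfil both requirements."""
    return sum(1 for ua in ua_strings if fulfils_requirements(ua))
```

Running count_fulfilling over a list of UA strings would give a figure to
compare with the count mentioned above; this sketch does not check the
first-token requirement.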


FORMAT DISCUSSION:

* A User-agent string format that works with ASP (and which Opera thus could
have picked instead of changing how they treat XML - except that ASP of course
is not the only thing to care for):

"Nubrowser/0 (Boilerplate: Mozilla/0.0/BOMSIE 0.0)"

 Explanations:
 - '0' stands for any single digit from 0 to 9
 - ASP needs to see 'Mozilla/0.0' somewhere   
 - The string 'BOMSIE' works because it encompasses 'MSIE'.
 - Of course, one could make up many other acronyms, such
   as "UAMSIE":
   "User Agent Management Science and Industrial Engineering"
 - Many other tokens would be possible too, e.g.:
   "WebSafari/0" and "WebFirefox/0"
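
The reason 'BOMSIE' works can be illustrated with a small sketch. It assumes
the server-side check is a plain substring test, which is my guess at the
behaviour and not a description of the real ASP code. The names build_ua and
naive_sniff, and the default version numbers, are hypothetical.

```python
def build_ua(product: str, version: int,
             mozilla: str = "4.0", msie: str = "7.0") -> str:
    """Assemble a string in the proposed format (version numbers assumed)."""
    return f"{product}/{version} (Boilerplate: Mozilla/{mozilla}/BOMSIE {msie})"


def naive_sniff(ua: str) -> str:
    """Hypothetical substring-based sniffer returning a Content-Type.

    UAs that are not recognised get 'application/xhtml+xml', as in the
    use case described earlier; recognised ones get 'text/html'.
    """
    if "Mozilla/" in ua and "MSIE" in ua:
        return "text/html"
    return "application/xhtml+xml"


ua = build_ua("Nubrowser", 1)
print(ua)               # Nubrowser/1 (Boilerplate: Mozilla/4.0/BOMSIE 7.0)
print(naive_sniff(ua))  # text/html, because 'BOMSIE' contains 'MSIE'
```

A sniffer that matches whole tokens rather than substrings would not be
fooled this way, so the trick depends entirely on how the check is written.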

* While testing the above string, I learned that Facebook needs to see
"KnownBrowser/5.0" as the first token. (Apparently, Opera made it onto that
list.)

REFERENCES:
* The UA string format is defined in HTTP:
  http://tools.ietf.org/html/rfc1945#section-10.15

Received on Monday, 2 January 2012 12:20:37 GMT
