new feature: parameter for specific User-Agent identification

Hello,

Following some recent discussions with a number of you, I have started  
work on making the Markup Validator identify itself with any specific  
User-Agent HTTP header.

This is not a new question: it has been discussed here, and in  
Bugzilla, in the past.

e.g.:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=1127
http://lists.w3.org/Archives/Public/www-validator/2005Mar/0186.html


It is likely that I am on the record somewhere as thinking this is not  
a good idea. I still believe that there is potential for abuse, and  
that by default it is a good thing that the Markup Validator  
identifies itself with its own user agent string. However, for some  
particular cases, setting the user-agent string could help:

* it can be used to debug cases where a server produces different  
markup when accessed by a "known browser" user-agent or an "unknown"  
UA string. See the (too) many "validates by direct input but not  
online" messages from puzzled ASP.net server users.

* it can be used to debug cases of servers blocking the validator's UA.

* it can be used to specifically validate content views triggered by  
some specific user agent string (e.g. mobile versus "full" view of a  
web resource). Markup checking within the mobileOK checker would be a  
customer here.

I therefore suggest implementing:
* a user-agent parameter for the check script
* values can be:
   - auto (equivalent to none) -> the usual validator UA applies
   - forward (and maybe "referer" too?) -> will forward the UA string  
as received by the validator. Modulo some sanitizing?
   - mobileok -> will output the UA as defined in http://www.w3.org/TR/mobileOK-basic10-tests/#http_request
   - any other string -> sanitize? and use as UA HTTP header.
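To make the proposed values concrete, here is a rough sketch of the dispatch logic, in Python rather than the validator's Perl. Everything named here is hypothetical: DEFAULT_UA is a placeholder and not the validator's real string, MOBILEOK_UA is my reading of the mobileOK document linked above and should be checked against it, and strip_controls is only a minimal stand-in for whatever sanitizing we settle on.

```python
import re

# Placeholder, not the validator's actual User-Agent string.
DEFAULT_UA = "W3C_Validator/x.y"
# Assumed value; verify against the mobileOK Basic Tests document.
MOBILEOK_UA = "W3C-mobileOK/DDC-1.0 (see http://www.w3.org/2006/07/mobileok-ddc)"


def strip_controls(value):
    """Drop ASCII control characters (including CR and LF) and trim whitespace."""
    return re.sub(r"[\x00-\x1f\x7f]", "", value).strip()


def resolve_user_agent(param, incoming_ua=None):
    """Map the proposed user-agent parameter to the header value to send.

    param       -- value of the user-agent parameter, or None if absent
    incoming_ua -- User-Agent header received by the validator, if any
    """
    keyword = (param or "").strip().lower()
    if keyword in ("", "auto"):
        # "auto" is equivalent to no parameter: the usual validator UA applies.
        return DEFAULT_UA
    if keyword == "forward":
        # Forward the client's UA, falling back to the default if it is empty.
        return strip_controls(incoming_ua or "") or DEFAULT_UA
    if keyword == "mobileok":
        return MOBILEOK_UA
    # Any other string is sanitized and used as the User-Agent header.
    return strip_controls(param) or DEFAULT_UA
```

Falling back to the default UA when the sanitized value ends up empty is a design choice of this sketch, not something decided above.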


However, given my note of concern above, I suggest we keep this as a  
parameter which can be used in the "API" calls and added to the check  
URIs, but not add any graphical UI.
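As an illustration of what such a check URI could look like, here is a small Python sketch that builds one. The spelling "user-agent" for the parameter name is an assumption (the name has not been settled), and build_check_uri is a hypothetical helper, not part of any existing client.

```python
from urllib.parse import urlencode

CHECK_URI = "http://validator.w3.org/check"


def build_check_uri(document_uri, user_agent=None):
    """Return a check URI, adding the proposed parameter only when given."""
    params = {"uri": document_uri}
    if user_agent is not None:
        # Assumed parameter name; the final spelling may differ.
        params["user-agent"] = user_agent
    return CHECK_URI + "?" + urlencode(params)
```

For example, build_check_uri("http://example.org/", "mobileok") yields
http://validator.w3.org/check?uri=http%3A%2F%2Fexample.org%2F&user-agent=mobileok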

If anyone is interested in hacking on this, please contact me within  
the next couple of days. A patch seems fairly straightforward to  
apply, so this could be a good first hack for anyone willing to code a  
bit on the validator. Changes should be made around line 450  
(recording the parameters) and 1316 (where the User-Agent sent by the  
validator is set) of http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.588&content-type=text/x-cvsweb-markup

Test cases and algorithms for sanitizing User-Agent string input  
(which may not be necessary, since we use the Perl CGI module to  
handle the input) are also welcome.
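To give an idea of the kind of test cases I have in mind, here is one possible starting point, sketched in Python. The rules (printable ASCII only, whitespace collapsed, a 256-character cap) are assumptions for discussion, not a decided policy, and sanitize_user_agent is a hypothetical name.

```python
import re

MAX_UA_LENGTH = 256  # arbitrary cap, chosen only for this sketch


def sanitize_user_agent(raw):
    """Reduce arbitrary input to a single-line, printable-ASCII header value."""
    cleaned = raw.replace("\t", " ")
    # Removing CR, LF and other non-printable characters prevents the value
    # from injecting extra headers into the outgoing request.
    cleaned = re.sub(r"[^\x20-\x7e]", "", cleaned)
    cleaned = re.sub(r" {2,}", " ", cleaned).strip()
    return cleaned[:MAX_UA_LENGTH]


# (input, expected output) pairs
CASES = [
    ("Mozilla/5.0 (X11; Linux)", "Mozilla/5.0 (X11; Linux)"),
    ("Evil\r\nHost: example.org", "EvilHost: example.org"),
    ("  padded\t\tvalue  ", "padded value"),
    ("", ""),
    ("a" * 300, "a" * MAX_UA_LENGTH),
]

for raw, expected in CASES:
    assert sanitize_user_agent(raw) == expected
```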

Thanks.
-- 
olivier

Received on Wednesday, 11 June 2008 18:44:37 UTC