
Re: please add Accept header to http request containing application/xhtml+xml

From: Dean Edridge <dean@55.co.nz>
Date: Tue, 22 Apr 2008 00:28:17 +1200
Message-ID: <480C8861.4090903@55.co.nz>
To: Alexandre Alapetite <alexandre@alapetite.net>
CC: Etienne Miret <elimerl@gmail.com>, www-validator@w3.org, Olivier Thereaux <ot@w3.org>

Alexandre Alapetite wrote:
> Actually, just for the record, when a client does not send an Accept
> header at all, it has the same meaning as "Accept: */*".
>
> Furthermore, if a resource is coded for application/xhtml+xml, it
> should be sent as such when the Accept header is empty.
>
> Although far from the full content negotiation specification, here is the
> Apache rule I use when I cannot use a proper content negotiation with
> a type-map (.var):
>
> <IfModule mod_rewrite.c>
> 	RewriteEngine On
> 	RewriteCond %{HTTP_ACCEPT} ^$ [OR]
> 	RewriteCond %{HTTP_ACCEPT} \bapplication/xhtml\+xml\b(?!(?>[^,]*?\bq=)0(?:\.0{1,3})?(?:\Z|[\s,;])) [NC]
> 	RewriteRule \.html$ - [type=application/xhtml+xml;charset=UTF-8]
> </IfModule>
>
> - It will serve application/xhtml+xml when the Accept header is empty
> (compatible with the current version of the W3C Validator, no
> warning).
>
> - When there is an Accept header, it will only serve
> application/xhtml+xml when "application/xhtml+xml" is listed and with
> a "q" higher than 0 (compatible with Internet Explorer, and of course
> Opera, Firefox, Safari, etc., but not fully correct as it does not
> check other q-values).
>
>
> A little remark: when you use this rule in combination with a content
> negotiation using a type-map (.var) for e.g. language negotiation, you
> should add text/html AND application/xhtml+xml  in the .var file for
> it to work as expected:
>
> URI: index.fr.html
> Content-language: fr
> Content-type: text/html;qs=0.5
>
> URI: index.fr.html
> Content-language: fr
> Content-type: application/xhtml+xml
>
> URI: index.en.html
> Content-language: en
> Content-type: text/html;qs=0.5
>
> URI: index.en.html
> Content-language: en
> Content-type: application/xhtml+xml
>
> URI: index.en.html
> Content-type: text/html;qs=0.5
>
> URI: index.en.html
> Content-type: application/xhtml+xml
>
>
> You can test it live on http://alexandre.alapetite.net/
>
> Cordially,
> Alexandre Alapetite
> http://alexandre.alapetite.net/cv/
>
>
>   

Hi,

Here's the problem with the W3C Validator as I see it.

The W3C Validator does not send an Accept header by default.
IMHO it is not good practice for any web server to send XHTML to a 
user-agent without first knowing that the user-agent accepts XHTML. This 
may seem like an odd concept to some, but it is just the way things are 
if you want to use XHTML on the web today. In some situations you may 
get away with sending XHTML to everyone, but such situations are rare. I 
myself will not send XHTML to a user-agent that does not specifically 
declare that it accepts application/xhtml+xml.

I've done some research and found that almost all other validators send 
an Accept header [1].

At the moment people are having to do this sort of thing:

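<?php
// The validator sends no Accept header, so guard against missing values.
$accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
$agent  = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

// Serve XHTML to clients that list it, plus a special case for the validator.
if (stripos($accept, "application/xhtml+xml") !== false ||
    stripos($agent, "W3C_Validator") !== false)
{
    $mime = "application/xhtml+xml";
}
else
{
    $mime = "text/html";
}
header("Content-Type: $mime; charset=utf-8");
?>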

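when they could just do this if the validator sent an Accept header:
<?php
// Fall back to an empty string if the client sent no Accept header.
$accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';

// Serve XHTML only to clients that list it in their Accept header.
if (stripos($accept, "application/xhtml+xml") !== false)
{
    $mime = "application/xhtml+xml";
}
else
{
    $mime = "text/html";
}
header("Content-Type: $mime; charset=utf-8");
?>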

(This could actually be simplified further of course...)
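
One caveat with a plain substring test is that it also matches 
"application/xhtml+xml;q=0", which explicitly refuses the type. Below is 
a minimal sketch of a check that honours the q-value, written in Python 
purely for illustration. The helper names parse_accept and choose_mime 
are hypothetical, and treating a malformed q as "not acceptable" is an 
assumption of this sketch, not something taken from the HTTP spec.

```python
def parse_accept(header):
    """Map each media range listed in an Accept header to its q-value."""
    ranges = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media = parts[0].lower()
        if not media:
            continue
        q = 1.0  # a range without an explicit q counts as q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # assumption: malformed q means "not acceptable"
        ranges[media] = q
    return ranges


def choose_mime(accept_header):
    """Serve XHTML only if the client lists application/xhtml+xml with q > 0."""
    q = parse_accept(accept_header or "").get("application/xhtml+xml", 0.0)
    return "application/xhtml+xml" if q > 0 else "text/html"
```

With this check a missing header, or one that only says "*/*", still 
gets text/html, which matches the policy described above.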

Having to make a special exception for just one user-agent goes against 
the whole idea of web standards.

Solutions such as the one below are not suitable for mainstream web page 
validation.

From:  http://validator.w3.org/docs/users.html#option-accept
> For Content-Negotiated resources, set a specific Accept Header (accept)
>
> This option (experimental, as of 0.8.2) is useful if your Web server 
> is set up to use format negotiation, serving different content based 
> on the preferred/accepted media types of the user-agent. The validator 
> can then emulate different HTTP Accept behaviors.
>
> For example, append "accept=application%2Fxhtml%2Bxml%2C*" and the 
> validator will send the HTTP Header "Accept: application/xhtml+xml,*".

Such techniques are for 'tech-savvy' people and not suitable for an 
average member of the public who just wants to check the validity of a 
webpage. The validator should be usable by anyone who can copy & paste a 
URL into the input field and press the submit button.
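
To show what that option asks of a user, here is a rough sketch (in 
Python) of building such a URL by hand. Only the "accept" parameter and 
its encoded value come from the documentation quoted above; the "check" 
entry point and the "uri" parameter name are assumptions of this sketch, 
and validator_url is a hypothetical helper.

```python
from urllib.parse import quote

# Assumed entry point of the validator; not taken from the quoted docs.
VALIDATOR_BASE = "http://validator.w3.org/check"


def validator_url(page_url, accept):
    """Build a validator URL that asks for a specific Accept header."""
    encoded_page = quote(page_url, safe="")
    # Keep "*" literal so the result matches the documented example value.
    encoded_accept = quote(accept, safe="*")
    return f"{VALIDATOR_BASE}?uri={encoded_page}&accept={encoded_accept}"


print(validator_url("http://example.org/", "application/xhtml+xml,*"))
```

Percent-encoding the media type by hand is exactly the sort of step an 
average user should not have to do.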

So here is the ideal solution to this problem in my opinion.

Part a) Have the W3C Validator send, by default, an appropriate Accept 
header with all the media types that it supports. This could look 
something like this:
Accept: application/xhtml+xml, application/xml; q=0.5, text/html; q=0.9, 
text/xml, application/smil, application/smil+xml, image/svg+xml; q=0.3, 
*/*; q=0.1
(I don't really know all the media types it accepts, or the correct q 
values, but I'm sure someone here does).
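
Whatever the final list is, each q-value has to lie between 0 and 1 and 
follow its media type after a semicolon. A small sketch (in Python) of 
generating such a header from a preference list is below; format_accept 
is a hypothetical helper, and the media types and weights in the example 
are placeholders, not the validator's real list.

```python
def format_accept(preferences):
    """Build an Accept header value from (media_type, q) pairs."""
    parts = []
    for media_type, q in preferences:
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"q-value out of range for {media_type}: {q}")
        if q == 1.0:
            # q=1 is the default, so it can be left out
            parts.append(media_type)
        else:
            # at most three decimals, with trailing zeros trimmed
            q_text = f"{q:.3f}".rstrip("0").rstrip(".")
            parts.append(f"{media_type}; q={q_text}")
    return ", ".join(parts)


# Placeholder preferences, for illustration only.
print(format_accept([
    ("application/xhtml+xml", 1.0),
    ("text/html", 0.9),
    ("*/*", 0.1),
]))
```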

Part b) Give users/web developers the ability to add their own 
media types, overriding the default Accept settings that the validator 
would normally send. This is explained above, but I feel 
that it is only beneficial to maintainers of the website being validated 
or other technically minded people, so I do not think this would be a 
complete solution on its own.

This would mean that by default the validator would send an appropriate 
Accept header, just like all good web browsers do today.

If this is not doable for some reason, another approach and 'quick fix' 
would be to just 'hard code' this string into the W3C Validator request 
headers:
Accept: application/xhtml+xml, text/html; q=0.5, */*; q=0.1
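
As a rough illustration of how small that quick fix is on the client 
side, here is a Python sketch that attaches the fixed header to an 
outgoing request. This is not the validator's own code; build_request 
and the example.org URL are placeholders, and no request is actually 
sent.

```python
import urllib.request

# The hard-coded header value proposed above.
ACCEPT = "application/xhtml+xml, text/html; q=0.5, */*; q=0.1"


def build_request(url):
    """Return a Request carrying the fixed Accept header; nothing is sent."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT})


req = build_request("http://example.org/")
print(req.get_header("Accept"))
```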


What do others think about this?


[1] http://validxhtml.org/validators/accept-header/


Dean Edridge
Received on Monday, 21 April 2008 12:29:08 GMT
