Re: Validator doesn't send HTTP_ACCEPT headers, "Conflict between Mime Type and Document Type" warning is incorrect.

Sierk Bornemann wrote:
>
>
> On 31.07.2007 at 20:35, Andries Louw Wolthuizen wrote:
>
>> Because I don't want to send application/xhtml+xml to (older) 
>> browsers that don't send a HTTP accept header, I've set the default 
>> mime-type on text/html.
>>
>> My PHP script is now:
>> <?php
>> $charset = 'utf-8';
>> $mime = 'text/html';
>> if(!empty($_SERVER['HTTP_ACCEPT']) && 
>> stristr($_SERVER['HTTP_ACCEPT'],'application/xhtml+xml')){
>>     
>> if(preg_match('/application\/xhtml\+xml;q=([01]|0\.\d{1,3}|1\.0)/i',$_SERVER['HTTP_ACCEPT'],$matches)){ 
>>
>>         $xhtml_q = $matches[1];
>>         
>> if(preg_match('/text\/html;q=([01]|0\.\d{1,3}|1\.0)/i',$_SERVER['HTTP_ACCEPT'],$matches)){
>>
>>             $html_q = $matches[1];
>>             if((float)$xhtml_q >= (float)$html_q) {
>>                 $mime = 'application/xhtml+xml';
>>             }
>>         }
>>     }else{
>>         $mime = 'application/xhtml+xml';
>>     }
>> }
>> if(stristr($_SERVER["HTTP_USER_AGENT"],"W3C_Validator") OR 
>> stristr($_SERVER["HTTP_USER_AGENT"],"W3C_CSS_Validator") OR 
>> stristr($_SERVER["HTTP_USER_AGENT"],"WDG_Validator")){
>>     $mime = "application/xhtml+xml";
>> }
>> if($mime == 'application/xhtml+xml') {
>>     $prolog_type = '<?xml version="1.0" encoding="'.$charset.'" 
>> ?>'.PHP_EOL;
>> }else{
>>     $prolog_type = '';
>> }
>> header('Content-Type: '.$mime.';charset='.$charset);
>> header('Vary: Accept');
>> echo $prolog_type;
>> ?>
>
> And now the whole thing *WITHOUT* using a scripting language like PHP, 
> based only on the methods and rules that a web server like Apache 
> provides. Not every website runs PHP...
>
>
> --Sierk Bornemann
> email:            sierkb@gmx.de
> WWW:              http://sierkbornemann.de/
>


Fooling around a little with Apache, I designed the following set of rules.
This should work with Apache 2+, provided the headers module 
(mod_headers.c) is enabled.

# Change "/test/" to the path/file for which this
# "content negotiation" should be enabled.
# This could be "/" to enable it for all requests.
<Location "/test/">
        # If the browser recognizes XHTML, use that.
        # (SetEnvIf takes a regular expression, so the "+" is escaped.)
        SetEnvIf Accept "application/xhtml\+xml" use_xhtml

        # If faced with W3C's validators, force an XHTML type.
        BrowserMatchNoCase "W3C_(CSS_)?Validator" use_xhtml

        # Change the Content-Type header when needed.
        # This requires mod_headers to be enabled.
        <IfModule mod_headers.c>
                Header set Content-Type application/xhtml+xml env=use_xhtml
                # I'm not sure whether the Vary header is mandatory,
                # but just to be on the safe side.
                Header append Vary Accept
        </IfModule>

</Location>


Just a few notes:
- First and foremost, I didn't have time to test this, so it may 
work perfectly, but it may also fail completely.

- This set of rules uses the client's Accept header, so Sierk will 
probably not be satisfied with it.

- It doesn't quite do as much as Andries' PHP script does, for example, 
it doesn't send the XML prolog for XHTML documents (so if you need to 
use an encoding other than the default UTF-8/UTF-16, don't use it!). It 
doesn't send any DOCTYPE either (because Apache can't do that AFAIK).

- Last but not least, it doesn't take into account the q(uality) 
parameter of the client's Accept header.
As a result, a client sending an Accept header such as:

Accept: text/xml,application/xml,application/xhtml+xml;q=0.9,text/html,text/plain;q=0.8,image/png,*/*;q=0.5

(notice how XHTML has a quality parameter of 0.9, while HTML doesn't 
provide any quality indication [and therefore has q=1])
would receive the XHTML type, despite HTML being the 
preferred format...
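
For illustration only, here is a minimal Python sketch of what honouring 
the q parameter could look like. It is not part of the Apache rules 
above, and the names parse_accept and choose_mime are hypothetical. It 
assumes wildcards such as */* can be ignored and that a malformed q 
value means "not acceptable".

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def parse_accept(header):
    """Return {media_type: q} for each entry of an Accept header value."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: malformed q means "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_mime(accept_header, default=HTML):
    """Pick XHTML only if it is listed and rated at least as high as HTML."""
    if not accept_header:
        return default  # no Accept header: fall back to the default type
    prefs = parse_accept(accept_header)
    xhtml_q = prefs.get(XHTML, 0.0)
    html_q = prefs.get(HTML, 0.0)  # wildcards are deliberately ignored
    if xhtml_q > 0 and xhtml_q >= html_q:
        return XHTML
    return HTML
```

With the example header above, choose_mime sees q=0.9 for XHTML and 
q=1 for HTML, so it picks text/html.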

Anyway, I don't think such a scenario is plausible, since IMO it doesn't 
make sense to prefer HTML over XHTML...
(either you fully support XHTML and thus it is preferred over HTML, or 
you don't in which case you simply don't mention XHTML at all in the 
Accept header).


Final note: I think Andries' script is a variation of the one found at
http://keystonewebsites.com/articles/mime_type.php
and in fact, I like the script there more because it doesn't rely on the 
User-Agent header.
I use a variation of this script myself, the only difference being that 
I use application/xhtml+xml as my default MIME type instead of 
text/html. This simply results in browsers that do not send an Accept 
header being served XHTML instead of HTML. This also means that old 
browsers won't be able to access my site (because they will be sent 
XHTML), but I consider that acceptable given my audience.
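
As a rough Python sketch of that choice of default (not the script from 
the linked article, and ignoring q values for brevity), with 
pick_content_type being a hypothetical name:

```python
def pick_content_type(accept_header, default="application/xhtml+xml"):
    """Choose a MIME type; clients sending no Accept header get the default."""
    if not accept_header:
        # No Accept header at all: with this default, old browsers get XHTML.
        return default
    if "application/xhtml+xml" in accept_header.lower():
        return "application/xhtml+xml"
    return "text/html"
```

Passing default="text/html" instead gives the more conservative 
behaviour of Andries' script.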


Hope this helps and sorry for the long post.

Received on Tuesday, 31 July 2007 22:51:27 UTC