Re: Real problem caused by CSS validator's willingness to tacitly accept HTML

Philip Taylor (Webmaster) wrote:

> We changed the behaviour of IIS to serve its default 404
> page instead of our customised one /after/ reporting the
> problem because debugging real problems here was being
> confused by the behaviour of the validator.

I see. But as far as I can tell, the validator's behaviour was and is correct 
(though with some room for improvement, as I mentioned), and the problem is 
in the IIS configuration, or more specifically in the HTTP response it sends.

> If you are willing,
> please repeat your test for this non-existent page :
>
> http://jigsaw.w3.org/css-validator/validator?uri=http://www.rhul.ac.uk/resources/Stylesheets/CSS/TP-Common.css

I get the same results as Yves Lafon, who explained the situation well. From 
the protocol viewpoint, the URL 
http://www.rhul.ac.uk/resources/Stylesheets/CSS/TP-Common.css refers to an 
HTML document, which is not served as an error document but as a normal 
response.
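
To make the protocol viewpoint concrete, here is a minimal Python sketch of 
what a client can conclude from a response. The function name and the sample 
header values are illustrative assumptions, not taken from the actual 
server's output; the point is only that the status code and the declared 
media type are all the client has to go on.

```python
def describe_response(status, content_type):
    """Summarise a response using only what HTTP itself declares."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if status >= 400:
        return f"error response ({status})"
    return f"normal response ({status}) declared as {media_type}"


# Hypothetical values: a customised "not found" page sent with status 200.
print(describe_response(200, "text/html; charset=utf-8"))
# A response sent with a real 404 status, which a client can recognise.
print(describe_response(404, "text/html"))
```

Under these assumptions, a "not found" page sent with status 200 is 
indistinguishable from any other HTML document.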

> But as the validator was asked to validate
> "<something>.css", it should ?surely? complain (or warn the user
> loudly) if it is instead sent <something-else>.html (IMHO).

There might be some practical reasons for issuing a warning, i.e. reporting 
a mismatch between the last few characters of a URL and the declared media 
type, when those last few characters are known to be commonly used for data 
of another type. But I doubt it. Such checks would sooner or later result in 
confusing error messages, and maintaining a table of commonly used suffixes 
associated with media types would be nontrivial and would not be 
particularly related to checking CSS or HTML.

There is no law and no protocol against using a URL ending with .css for an 
HTML document, or for a plain text file, or a Word document. It might be 
unwise, but that's a different story. Consider a URL like
http://www.example.com/analyze.php?url=http://www.com.example/zap.css
which might be quite natural, for a page used for analyzing a CSS file and 
issuing a report in HTML format (or plain text format, or some other 
format - not deducible from the URL). Actually the first URL quoted above is 
of such a type! The suffix ".css" has nothing to do with the media type of 
the resource (the validator's report).
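
As a small illustration, the validator URL quoted earlier can be taken apart 
with Python's standard urllib.parse module. This is only a sketch; the 
variable names are arbitrary. It shows that the trailing ".css" belongs to 
the query string, not to the path of the resource being requested.

```python
from urllib.parse import urlsplit

url = ("http://jigsaw.w3.org/css-validator/validator"
       "?uri=http://www.rhul.ac.uk/resources/Stylesheets/CSS/TP-Common.css")

parts = urlsplit(url)

# The URL string as a whole does end with ".css" ...
print(url.endswith(".css"))
# ... but the path names the validator itself ...
print(parts.path)
# ... and the ".css" is just the tail of the query string.
print(parts.query)
```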

It would be incorrect for a validator, or any user agent, to treat the 
response as anything but data of the declared type. (We know that IE does 
such things, second-guessing the media type from the URL suffix or the 
content of the data or both, but it's still incorrect.) Even if the actual 
data consisted of, say, a PDF document and thus could not be interpreted as 
HTML or as CSS, it would still have to be treated as data of the declared 
type, just malformed.
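
A minimal sketch of that rule in Python follows. The function name, the 
handler labels and the sample URL are hypothetical, and this is not how any 
particular validator is implemented. The URL is accepted as a parameter but 
deliberately never consulted.

```python
def select_parser(content_type, url):
    """Choose a parser from the declared media type only."""
    # The url argument is ignored on purpose: its suffix carries no
    # protocol-level information about the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    handlers = {
        "text/css": "css",
        "text/html": "html",
    }
    return handlers.get(media_type, "unsupported")


# Hypothetical request: the URL ends in ".css", the server declares HTML.
print(select_parser("text/html; charset=iso-8859-1",
                    "http://www.example.com/style.css"))
```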

> But if it is asked to validate a resource ending in ".CSS", and
> receives a resource with MIME type "text/html", surely it should
> report a configuration error or worse ?  (The configuration of
> the server, that is, not of the validator).

No, that's not an error at all. Perhaps a pragmatically wrong choice, but 
the validator cannot know that.

>> But admittedly the wording is slightly misleading when it says
>>
>> "This document validates as CSS!"
>>
>> for a document that is not a CSS stylesheet. An adequate statement
>> would be something like
>>
>> "The CSS style sheets included in the submitted document or referred
>> to by it are syntactically correct."
>
> But what is echoed by the validator is /just/ the CSS, leading the
> user to believe that that is what it received ...

Well, it more or less is, considering what the validator proper received 
after the CSS code had been extracted for analysis. But the wording could 
be better and should probably say more explicitly what happened.
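
For illustration only, the extraction step might look roughly like the 
following Python sketch, which collects the contents of style elements from 
an HTML document using the standard html.parser module. The class and 
function names and the sample page are assumptions made for this example; 
the validator's real code is not shown here and may work quite differently.

```python
from html.parser import HTMLParser


class StyleExtractor(HTMLParser):
    """Collect the text content of each <style> element."""

    def __init__(self):
        super().__init__()
        self.in_style = False
        self.sheets = []

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True
            self.sheets.append("")

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False

    def handle_data(self, data):
        # Only text inside a <style> element is kept.
        if self.in_style:
            self.sheets[-1] += data


def extract_css(html_text):
    parser = StyleExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.sheets


# Hypothetical error page that happens to carry one embedded style sheet.
page = ("<html><head><title>Not found</title>"
        "<style>p { color: red }</style></head>"
        "<body><p>Page not found</p></body></html>")
print(extract_css(page))
```

A page with no style elements yields an empty list, so in this sketch there 
would be nothing left for a CSS checker to object to.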

Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/ 

Received on Wednesday, 31 October 2007 18:52:08 UTC