Re: validator.nu

From: Henri Sivonen <hsivonen@iki.fi>
Date: Sat, 16 Feb 2008 15:33:40 +0200
To: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
Message-Id: <4E40BA81-5889-4563-9F42-36D2B3C1B631@iki.fi>
Cc: <public-html-comments@w3.org>

Disclaimer: Still not a WG response.

On Feb 16, 2008, at 14:25, Frank Ellermann wrote:

> Henri Sivonen wrote:
>
>> HTML5 parsing has no such thing as a valid DTD subset.
>
> <sigh />  If it cannot parse valid XHTML 1 it's fine,
> just don't offer the option, or give up with a clear
> error message when you "see" a DTD subset or anything
> else that won't fit into your model, valid or not.

As I said before, that is how Validator.nu used to work, and a
change away from that behavior was requested. I cannot comply with
everyone's suggestions at the same time when they call for mutually
exclusive behaviors. I have chosen not to comply with yours on
this point.

> But this doesn't affect you or
> other validators, what they should do is answer the
> simple question:
>
> Is document X valid HTML / HTML5 / XHTML ?
>
> For any given X, independent of how you get it, HTTP,
> upload, FTP, pigeon carrier, gopher, form input, ...

Validator.nu checks the combination of the protocol entity body and  
the Content-Type header. Pretending that Content-Type didn't matter  
wouldn't make sense when it does make a difference in terms of  
processing in a browser.
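
To illustrate why the entity body alone is not enough, here is a minimal
sketch of parser selection driven by Content-Type. It is not Validator.nu's
actual code; the function names, the set of XML types, and the "unsupported"
result are assumptions made for this example.

```python
# Hypothetical sketch: pick a parser from the Content-Type header,
# the way a browser-like consumer would. Not Validator.nu's real logic.

# Assumed set of media types that are handed to an XML parser.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def media_type(content_type: str) -> str:
    """Strip parameters such as charset and normalize case."""
    return content_type.split(";", 1)[0].strip().lower()


def choose_parser(content_type: str) -> str:
    """Return 'html', 'xml', or 'unsupported' for a Content-Type value."""
    mt = media_type(content_type)
    if mt == "text/html":
        return "html"
    if mt in XML_TYPES or mt.endswith("+xml"):
        return "xml"
    return "unsupported"


# The same bytes (X) end up in different parsers depending on the header.
body = b"<!DOCTYPE html><title>x</title><p>Hello"
for header in ("text/html; charset=utf-8", "application/xhtml+xml"):
    print(header, "->", choose_parser(header))
```

Under these assumptions, the same body is tag soup that an HTML parser
accepts in the first case and a well-formedness error for an XML parser in
the second, which is why the pair (body, Content-Type) is what gets checked.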

> OTOH what you got as X, however you got it, *is* X,
> the valid or invalid input for validation.  What HTTP
> servers claim is at best *optional* additional info
> for the task to validate X.

Content-Type is acted on by browsers when provided, so even if  
supplying it were optional, looking at it once supplied isn't.

> If folks actually want to check X'' = X + HTTP header
> or X''' = X + charset or doctype overrides offer this
> as option.  As you already do it for X''' but not X''.

I also provide the lax type option to override the MIME type (albeit  
in a limited way to prevent Validator.nu loading images, movies,  
etc.). Respecting Content-Type is the default, though.
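
A limited override of this kind could be sketched as follows. This is only
an illustration under stated assumptions: the set of types that the lax mode
re-labels, the choice of text/html as the fallback, and all names are
hypothetical and are not taken from Validator.nu.

```python
# Hypothetical sketch of a "lax type" override. The type sets below are
# assumptions for illustration, not Validator.nu's actual lists.
from typing import Optional

# Types that are checked as declared.
ACCEPTED = {"text/html", "application/xhtml+xml", "application/xml", "text/xml"}

# Assumed generic types that lax mode may treat as HTML.
LAX_FALLBACK = {"text/plain", "application/octet-stream"}


def effective_type(declared: str, lax: bool = False) -> Optional[str]:
    """Return the media type to check against, or None to refuse."""
    mt = declared.split(";", 1)[0].strip().lower()
    if mt in ACCEPTED:
        return mt  # default: respect the declared Content-Type
    if lax and mt in LAX_FALLBACK:
        return "text/html"  # lax mode: override a generic type
    return None  # images, movies, etc. are refused even in lax mode


print(effective_type("text/plain"))            # refused by default
print(effective_type("text/plain", lax=True))  # treated as HTML
print(effective_type("image/png", lax=True))   # still refused
```

The design point is that the override is opt-in and narrow: the declared
type wins whenever it is usable, and clearly non-document types are never
loaded.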

The main reason for adding the character encoding override was  
supporting the form-based file upload case, but I opted not to hide  
the UI in other cases.

>> Making the references to a misconfigured server is
>> under your control.
>
> Yeah, I could use form input or upload instead of a
> HTTP URL, or maybe set up a decent gopher server and
> let your validator tackle this.

What are you trying to achieve? Are you trying to check that your Web  
content doesn't have obvious technical problems? If you are, surely it  
would be less useful if the validator pretended that Content-Type  
didn't matter to parser choice when it does matter in browsers. Or are  
you just trying to game a tool to say that your page is valid while  
insisting on doing stuff that is practically problematic? If so,  
what's the point?

> If that is your idea of usability we are wasting time,
> as I can simply use validators doing what I want, i.e.
> check X, neither X' nor X'', and typically not X'''.

To assess whether doing what you want is a waste of time, I'd
like to know what use case you have in mind.
Why are you validating pages?

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Saturday, 16 February 2008 13:34:13 GMT