validator.nu (was: BOCU-1, SCSU, etc.)

Henri Sivonen wrote:

> I believe the list of encodings that are needed for existing
> content is pretty close to the contents of the encoding menu
> at http://validator.nu/

BTW, my usual "validator torture tests" strongly indicate that
http://validator.nu is unrelated to the concepts of "validator"
and "existing content".

1 - http://purl.net/xyzzy/home/test/res.htm and res.html:
    Quirky or not, HTML 2 strict and HTML i18n allowed those
    odd SGML comments.  AFAIK nothing is wrong with <tt> in <p>.

2 - http://purl.net/xyzzy/colour.htm intentionally uses "known"
    colour names.  I fear they are quite popular in "existing
    content", so maybe HTML5 should accept them as "legacy".
    The validator found another issue I wasn't aware of, nice.

3 - http://purl.net/xyzzy/ibm850.htm has a DTD subset with some
    entity declarations.  That's apparently not (yet) supported
    by http://validator.nu, and FWIW by no browser I know either.

4 - http://hmdmhdfmhdjmzdtjmzdtzktdkztdjz.googlepages.com/IDN-IRI-test.html
    Unusable output for XHTML 1 sent as text/html, for all pages.
    If a validator cannot validate XHTML 1 it shouldn't try to
    do it anyway.  "Preset XHTML 1" doesn't help to get the
    corresponding parser.  The XML parser refuses to validate
    text/html.  On the third attempt, with UTF-8 + XHTML 1 + XML
    + "lax" (whatever that means), the validator states that it
    doesn't know Content-Type: chemical/x-pdb.
    Neither do I; it's not mentioned in the document or the DTD.
    Admittedly Google sends the DTD as application/octet-stream
    instead of application/xml-dtd, but that's not "chemical".

5 - All <link rev="made"> elements are reported as errors.

6 - Link elements without title are reported as errors even for 
    "existing content" where that's not required, and arguably
    pointless for some relations including "made" or "author".

7 - Validator.nu test aborted before the end of my test suite.
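
Item 4 above turns on which parser gets picked from the
Content-Type header.  A minimal sketch of that kind of dispatch,
assuming a hypothetical choose_parser helper and an assumed set
of XML media types; this is not validator.nu's actual logic:

```python
# Hypothetical illustration only: the names and the accepted
# media types are assumptions, not taken from validator.nu.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def media_type(content_type: str) -> str:
    """Strip parameters such as charset and normalise the case."""
    return content_type.split(";", 1)[0].strip().lower()


def choose_parser(content_type: str) -> str:
    """Return 'html', 'xml', or 'unsupported' for a header value."""
    mt = media_type(content_type)
    if mt == "text/html":
        return "html"
    if mt in XML_TYPES or mt.endswith("+xml"):
        return "xml"
    # Anything else (application/octet-stream, chemical/x-pdb, ...)
    # is reported as unsupported rather than guessed at.
    return "unsupported"


if __name__ == "__main__":
    for header in ("text/html; charset=UTF-8",
                   "application/xhtml+xml",
                   "application/octet-stream"):
        print(header, "->", choose_parser(header))
```

Under these assumptions XHTML 1 sent as text/html goes to the
HTML parser, and a DTD sent as application/octet-stream is just
unsupported, with no "chemical" type involved.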

 Frank

Received on Friday, 1 February 2008 13:56:34 UTC