- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Sat, 2 Feb 2008 14:45:15 +0200
- To: "Frank Ellermann" <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
- Cc: <public-html-comments@w3.org>
Disclaimer: This is not an official WG response. I am, however, the
developer of Validator.nu.

On Feb 1, 2008, at 15:57, Frank Ellermann wrote:

> Henri Sivonen wrote:
>
>> I believe the list of encodings that are needed for existing
>> content is pretty close to the contents of the encoding menu
>> at http://validator.nu/
>
> BTW, my usual "validator torture tests" strongly indicate that
> http://validator.nu is unrelated to the concepts of "validator"
> and "existing content".

From the about page:

| No DTD-Based Validation
|
| * Validator.nu does not check for XML 1.0 validity constraints. That is, DTD
|   validation is not performed. “Validation” and “validator” in the name and
|   the user interface of the service refer to the ISO/IEC FDIS 19757-2
|   definition of “validator” (which performs validation), to the Schematron
|   “validation” function (which is performed by a validator), and to the
|   HTML 5 definition of “validator”.
|
| * Validator.nu does not perform the duties of a “validating SGML parser” as
|   defined in ISO 8879. In fact, this service does not have any SGML
|   functionality at all. In particular, the HTML 4.01 support uses the HTML5
|   parser with some additional error conditions.

http://about.validator.nu/

If you don't like the terminology of RELAX NG and Schematron
"validation" being performed by a "validator", I suggest sending
feedback to the ISO/IEC FDIS 19757 committee. As far as HTML 5
validation goes, experience showed that calling it validation rather
than conformance checking made communicating with people easier.

Validator.nu is a tool for authors. It isn't designed as a tool for
checking someone else's existing HTML 2.0 or 3.2 content as HTML 2.0 or
3.2. I'm not interested in delivering a tool to authors who try to make
a point by authoring new HTML 2.0 or 3.2 content today. Supporting HTML
2.0 or 3.2 would not be cost-effective.
> 1 - http://purl.net/xyzzy/home/test/res.htm and res.html:
> Quirky or not, HTML 2 strict and HTML i18n allowed those
> odd SGML comments. AFAIK nothing is wrong with <tt> in <p>.

Validator.nu does not support HTML 2.0 and doesn't claim to. However,
it checks the content as HTML5 for your convenience in case you are an
author seeking to upgrade an existing site template to HTML5.

> 2 - http://purl.net/xyzzy/colour.htm intentionally uses "known"
> colour names, I fear they are quite popular in "existing
> content", maybe HTML5 should accept them as "legacy".
> The validator found another issue I wasn't aware of, nice.

Note that by default, Validator.nu tries to use the XHTML 1.0
Transitional schema with that page. That spec defers to HTML 4.01 which
only allowed 16 colors:
http://www.w3.org/TR/html401/types.html#h-6.5

Currently, the HTML 5 draft doesn't permit presentational color-setting
attributes at all, so the issue of permitted value space is moot.

> 3 - http://purl.net/xyzzy/ibm850.htm has a DTD subset with some
> entity declarations, that's apparently not (yet) supported
> by http://validator.nu and FWIW also in no browser I know.

If it isn't supported in any browser, it would be less useful if the
validator didn't point out the problem, wouldn't it?

You can manually override the parser to XML with external entity
resolution if you wish to check XML documents that aren't suitable for
use with Web browsers:
http://validator.nu/?doc=http%3A%2F%2Fpurl.net%2Fxyzzy%2Fibm850.htm&parser=xmldtd&laxtype=yes

> 4 - http://hmdmhdfmhdjmzdtjmzdtzktdkztdjz.googlepages.com/IDN-IRI-test.html
> Unusable output for XHTML 1 sent as text/html for all pages,

http://hixie.ch/advocacy/xhtml

> if a validator cannot validate XHTML 1 it shouldn't try to
> do it anyway.

Previously, Validator.nu simply halted in that case. You are the first
person to suggest that the current behaviour weren't more useful. I
think I'm keeping the current behavior.
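The manual-override URL given in the reply to test 3 can also be assembled programmatically. The following is only a minimal Python sketch: the query parameter names `doc`, `parser`, and `laxtype` and their values are taken from that URL, while the helper name `build_override_url` is hypothetical and is not part of any Validator.nu client.

```python
from urllib.parse import urlencode

# Service address as it appears in this message.
VALIDATOR = "http://validator.nu/"


def build_override_url(doc_url, parser="xmldtd", laxtype="yes"):
    """Build a Validator.nu URL that forces a parser choice.

    Hypothetical helper: the parameter names and default values are
    copied from the URL quoted in the message, nothing more.
    """
    # urlencode percent-encodes ':' and '/' in the document address.
    query = urlencode({"doc": doc_url, "parser": parser, "laxtype": laxtype})
    return VALIDATOR + "?" + query


if __name__ == "__main__":
    print(build_override_url("http://purl.net/xyzzy/ibm850.htm"))
```

Called with the ibm850.htm address, the helper should reproduce the override URL quoted in the reply to test 3.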
> "Preset XHTML 1" doesn't help to get the
> corresponding parser.

That's because the XHTML 1.0 schemas are also used for HTML 4.01.

> The XML parser refuses to validate text/html.

text/html is not an RFC 3023-compliant XML media type.

> Third attempt, UTF-8 + XHTML 1 + XML + "lax"
> (whatever that means),

Lax means disrespecting RFC 3023 for the purpose of text/xml encoding
default and disrespecting the meaning of text/plain and text/html.

> and now the validator states that it
> doesn't know Content-Type: chemical/x-pdb.
> Neither do I, it's not mentioned in the document or the DTD.

You include an external entity from elsewhere.

$ telnet validator.w3.org 80
Trying 128.30.52.49...
Connected to validator.w3.org.
Escape character is '^]'.
HEAD /sgml-lib/REC-xhtml1-20020801/xhtml-lat1.ent HTTP/1.1
Accept: */*; q=0.1, application/docbook+xml, application/xhtml+xml, application/xml; q=0.5, image/svg+xml, text/xml; q=0.3
Host: validator.w3.org
Connection: close

HTTP/1.1 200 OK
Date: Sat, 02 Feb 2008 12:38:54 GMT
Server: Apache/2.2.6 (Debian)
Last-Modified: Tue, 20 Aug 2002 01:51:30 GMT
ETag: "40c881-2dff-3a89aed4fec80"
Accept-Ranges: bytes
Content-Length: 11775
Connection: close
Content-Type: chemical/x-pdb

Connection closed by foreign host.

> Admittedly Google sends the DTD as application/octet-stream
> instead of application/xml-dtd, but that's not "chemical".

Yeah, the Google server is not the misconfigured one.

> 5 - All link rev="made" are reported as errors.

The current HTML5 draft obsoletes rev, because rev is rare but when it
is used, it is most often used wrong. rev='made' is the exception, but
rel='author' is the permitted way of communicating the same thing.

> 6 - Link elements without title are reported as errors even for
> "existing content" where that's not required, and arguably
> pointless for some relations including "made" or "author".

I can't reproduce this problem. Do you have a link to a test case
demonstrating this?
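The Content-Type check from the telnet session under test 4 can be repeated offline on a saved response. This is a minimal Python sketch, not how Validator.nu does it: the response text is copied from the session in this message, the function names are hypothetical, and the set of acceptable media types is an assumption drawn from RFC 3023 rather than the validator's actual list.

```python
from email.parser import HeaderParser

# Response headers copied from the telnet session in this message.
RAW_RESPONSE = """\
HTTP/1.1 200 OK
Date: Sat, 02 Feb 2008 12:38:54 GMT
Server: Apache/2.2.6 (Debian)
Last-Modified: Tue, 20 Aug 2002 01:51:30 GMT
ETag: "40c881-2dff-3a89aed4fec80"
Accept-Ranges: bytes
Content-Length: 11775
Connection: close
Content-Type: chemical/x-pdb
"""

# Assumption: media types a checker might accept for a DTD or an
# external entity set. The names are registered by RFC 3023; this is
# not Validator.nu's real configuration.
EXPECTED_TYPES = {
    "application/xml-dtd",
    "application/xml-external-parsed-entity",
}


def content_type_of(raw_response):
    """Return the media type named in a raw HTTP response head."""
    # Drop the status line; the remainder is an RFC 822 style header block.
    _status_line, _, header_block = raw_response.partition("\n")
    headers = HeaderParser().parsestr(header_block)
    return headers.get_content_type()


def is_expected_type(raw_response):
    """Report whether the response uses one of the assumed media types."""
    return content_type_of(raw_response) in EXPECTED_TYPES


if __name__ == "__main__":
    print(content_type_of(RAW_RESPONSE), is_expected_type(RAW_RESPONSE))
```

For the response shown in the session, the extracted type is chemical/x-pdb, which is not among the assumed types; that is the mismatch the validator complained about.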
> 7 - Validator.nu test aborted before the end of my test suite.

Thank you for your feedback.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Saturday, 2 February 2008 12:45:35 UTC