[BUGFIX] Tidy supplies obsolete DOCTYPE

From: J. David Bryan <jdbryan@acm.org>
Date: Fri, 24 Mar 2000 11:47:13 -0600
To: HTML Tidy List <html-tidy@w3.org>
Message-ID: <OFDF596CCB.12DA4C88-ON8625688E.001EC63E@rfdinc.com>

This report is for the Tidy version of 13th January 2000.

When Tidy is asked to supply a DOCTYPE (e.g., with the configuration option
"doctype: strict"), it will supply one for HTML 4.0, which is obsolete.

The HTML 4.01 specification says, "This document obsoletes previous
versions of HTML 4.0...W3C recommends that user agents and authors (and in
particular, authoring tools) produce HTML 4.01 documents rather than HTML
4.0 documents."  Therefore, Tidy should generate DOCTYPEs with the 4.01
identifiers.

For example, given a "bug.html" file containing:

  <title>Bug test</title>
  <body>
  <p>Test file.</p>
  </html>

...then running:

  tidy --doctype strict bug.html

...will produce a file with the HTML 4.0 DTD.

The error is in lexer.c (lines 51-67):

  struct _vers
  {
      char *name;
      char *voyager_name;
      char *profile;
      int code;
  } W3C_Version[] =
  {
      {"HTML 2.0", "XHTML 1.0 Strict", voyager_strict, VERS_HTML20},
      {"HTML 3.2", "XHTML 1.0 Transitional", voyager_loose, VERS_HTML32},
      {"HTML 4.0", "XHTML 1.0 Strict", voyager_strict, VERS_HTML40_STRICT},
      {"HTML 4.0 Transitional", "XHTML 1.0 Transitional", voyager_loose,
       VERS_HTML40_LOOSE},
      {"HTML 4.0 Frameset", "XHTML 1.0 Frameset", voyager_frameset,
       VERS_FRAMES},
      {"HTML 4.01", "XHTML 1.0 Strict", voyager_strict,
       VERS_HTML40_STRICT},
      {"HTML 4.01 Transitional", "XHTML 1.0 Transitional", voyager_loose,
       VERS_HTML40_LOOSE},
      {"HTML 4.01 Frameset", "XHTML 1.0 Frameset", voyager_frameset,
       VERS_FRAMES}
  };

Because the HTML 4.0 and 4.01 DOCTYPE strings carry the same internal
version flags (e.g., VERS_HTML40_STRICT), Tidy uses the first string
encountered with the desired version flag when generating the requested
DOCTYPE.  As the HTML 4.0 strings are first, they are used in preference to
the 4.01 strings.  Placing the 4.01 strings ahead of the 4.0 strings solves
the problem.
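
The first-match behavior can be illustrated with a minimal sketch.  This
is not Tidy's actual lookup code: the helper name first_name_for, the
trimmed-down struct, and the numeric flag values are assumptions made
for illustration only.  It shows that a linear scan returns whichever
entry with the requested flag appears first, so table order alone
decides between the 4.0 and 4.01 strings.

```c
#include <stddef.h>

/* Illustrative flag values; assumptions for this sketch only. */
enum { VERS_HTML40_STRICT = 4, VERS_HTML40_LOOSE = 8 };

/* Trimmed-down stand-in for the table entry: name and flag only. */
struct vers
{
    const char *name;
    int code;
};

/* Original order: the 4.0 entries precede the 4.01 entries. */
static const struct vers old_order[] =
{
    {"HTML 4.0", VERS_HTML40_STRICT},
    {"HTML 4.0 Transitional", VERS_HTML40_LOOSE},
    {"HTML 4.01", VERS_HTML40_STRICT},
    {"HTML 4.01 Transitional", VERS_HTML40_LOOSE}
};

/* Proposed order: the 4.01 entries precede the 4.0 entries. */
static const struct vers new_order[] =
{
    {"HTML 4.01", VERS_HTML40_STRICT},
    {"HTML 4.01 Transitional", VERS_HTML40_LOOSE},
    {"HTML 4.0", VERS_HTML40_STRICT},
    {"HTML 4.0 Transitional", VERS_HTML40_LOOSE}
};

/* Hypothetical helper: return the name of the first entry whose flag
   matches, or NULL if no entry carries that flag. */
static const char *first_name_for(const struct vers *table, size_t count,
                                  int code)
{
    size_t i;

    for (i = 0; i < count; ++i)
        if (table[i].code == code)
            return table[i].name;

    return NULL;
}
```

With these tables, first_name_for(old_order, 4, VERS_HTML40_STRICT)
yields "HTML 4.0", while the same call on new_order yields "HTML 4.01".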

                                      -- Dave Bryan
Received on Friday, 24 March 2000 13:12:57 UTC
