- From: Jelks Cabaniss <jelks@jelks.nu>
- Date: Sat, 24 Aug 2002 00:05:55 -0400
- To: <html-tidy@w3.org>, "'Tidy Development List'" <Tidy-develop@lists.sourceforge.net>
[I discussed this briefly off-list several weeks ago with Charles Reitzel. He suggested bringing it up here for discussion.]

The 'doctype' option is flawed as currently implemented. It can take one of six values: auto, omit, strict, loose, transitional, or a user-specified FPI (i.e. a string). This does not scale well, now or in the future.

'auto' and 'omit' need not concern us here, so we are left to discuss the remaining four, three of which are an enum (and two of those are synonyms!), and the other a string. Ouch.

Plus, by only being able to choose between 'strict' and 'transitional' ('loose' means exactly the same thing as 'transitional'), we have effectively frozen ourselves in time to 1998, since those choices are only applicable to HTML 4.x and XHTML 1.0. To determine which syntax to use, Tidy also has to look at the 'output-xhtml' option, which if set to 'no' (the default), will use the HTML 4.01 declaration.

So, want Tidy to output XHTML Basic, or XHTML 1.1? Forget it. Want to use your own custom DTD? Forget it -- unless you're using some roll-your-own SGML-based HTML along with a "catalog" to look up the FPI. Why? Because the 'doctype' option offers no "user specified SYSTEM ID" like it does a "user specified FPI".

For the SGML (and thus "classic" HTML) Doctype declaration, the SYSTEM identifier is optional, the PUBLIC one required. For XML, the reverse is true: FPIs are optional, SYSTEM IDs are required. (!)

So, what to do? A proposal:

1) Two new options, for custom declarations:

    doctype-public   // FPI
    doctype-system   // SYSTEM ID (i.e., URL)

These have the same name (and purpose!) as the corresponding attributes in XSLT's <xsl:output ...> element.
2) A third new option, an enum, similar in function to the current 'doctype' option:

    html-doctype   // no name collision with 'doctype'

It could take one of the following values:

    auto
    omit
    html-32
    html-4
    html-4-transitional
    html-iso
    xhtml-10
    xhtml-10-transitional
    xhtml-basic
    xhtml-11
    xhtml-20   // we can wait on this one.

(Have I left out any other "standard" doctypes?)

Authors would choose between 2 and 1 above: "from the menu" or "build-your-own". If option 2 were specified, any of the values beginning with 'xhtml' would turn on (and override!) 'output-xhtml'. Specifying 'output-xhtml' explicitly is only needed with option #1. Also, option #1 above would work for generic XML.

[Note: Making these changes would in effect deprecate Tidy's current 'doctype' option. But leave 'doctype' in and let it continue to do as it does now, because it's no doubt in wide use and a lot of things may depend on it. Just ignore 'doctype' if any of the new options are set.]

I can provide the FPIs and SYSTEM IDs for option #2 if necessary.

IMO, FWIW, and HTH, :)

/Jelks
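
To make the precedence rules proposed above concrete, here is a minimal Python sketch of how the three new options might resolve to a declaration. It is only an illustration, not Tidy code: the names `MENU`, `format_doctype` and `resolve` are hypothetical, the menu holds just a few of the proposed values (with the standard W3C identifiers), 'auto' is not modelled, and letting the build-your-own options win when both kinds are set is an assumption the proposal does not spell out.

```python
# Hypothetical sketch of the proposed option handling; nothing here is Tidy's API.

# A few "from the menu" values mapped to (FPI, SYSTEM ID).
MENU = {
    "html-4": ("-//W3C//DTD HTML 4.01//EN",
               "http://www.w3.org/TR/html4/strict.dtd"),
    "html-4-transitional": ("-//W3C//DTD HTML 4.01 Transitional//EN",
                            "http://www.w3.org/TR/html4/loose.dtd"),
    "xhtml-10": ("-//W3C//DTD XHTML 1.0 Strict//EN",
                 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"),
    "xhtml-11": ("-//W3C//DTD XHTML 1.1//EN",
                 "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"),
}


def format_doctype(public, system, xml, root="html"):
    """Build a DOCTYPE declaration under the rules described in the post."""
    if xml:
        # XML: the SYSTEM ID is required, the FPI is optional.
        if not system:
            raise ValueError("XML output needs doctype-system")
        if public:
            return f'<!DOCTYPE {root} PUBLIC "{public}" "{system}">'
        return f'<!DOCTYPE {root} SYSTEM "{system}">'
    # Classic (SGML) HTML: the FPI is required, the SYSTEM ID is optional.
    if not public:
        raise ValueError("HTML output needs doctype-public")
    if system:
        return f'<!DOCTYPE {root} PUBLIC "{public}" "{system}">'
    return f'<!DOCTYPE {root} PUBLIC "{public}">'


def resolve(options):
    """Return (declaration or None, output_xhtml), or None if no new option is set.

    A None result means the existing 'doctype' option keeps working as it does now.
    """
    xhtml = bool(options.get("output-xhtml", False))
    public = options.get("doctype-public")
    system = options.get("doctype-system")

    # Option 1, "build-your-own": 'output-xhtml' must be given explicitly.
    # Assumption: these options win if 'html-doctype' is also set.
    if public or system:
        return format_doctype(public, system, xhtml), xhtml

    # Option 2, "from the menu": any xhtml* value turns on 'output-xhtml'.
    choice = options.get("html-doctype")
    if choice is not None:
        if choice == "omit":
            return None, xhtml
        if choice not in MENU:
            raise ValueError(f"value not covered by this sketch: {choice}")
        xhtml = choice.startswith("xhtml")
        fpi, sysid = MENU[choice]
        return format_doctype(fpi, sysid, xhtml), xhtml

    # Neither new option set: the legacy 'doctype' option is not ignored.
    return None
```

For example, `resolve({"html-doctype": "xhtml-11"})` yields the XHTML 1.1 declaration with 'output-xhtml' switched on, while `resolve({"doctype": "strict"})` returns `None` and leaves the current behaviour untouched.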
Received on Saturday, 24 August 2002 00:06:22 UTC