
Re: ISSUE-54: doctype-legacy-compat

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Fri, 16 Jan 2009 13:57:12 +0000
Message-ID: <49709238.6080102@cam.ac.uk>
To: Sam Ruby <rubys@us.ibm.com>
CC: HTML WG <public-html@w3.org>

Sam Ruby wrote:
> [...]
> From a technical perspective, here is my understanding of the problem.  The
> following string is the smallest and simplest string that will trigger
> standards compatibility mode in all browsers (there was some confusion over
> this, but that was resolved[5]) and can be produced by all known tools.
> <!DOCTYPE html "">

Which tools are "all known tools"?

There are tools which have an HTML4 or XHTML1.0 doctype hardcoded, and 
we can't do anything about them, so I assume they must be excluded.

Looking at the source code for TagSoup 
(<http://home.ccil.org/~cowan/XML/tagsoup/>), I believe its "XMLWriter" 
(actually an XML-incompatible HTML writer) can only output:

   <!DOCTYPE html SYSTEM "">
   <!DOCTYPE html PUBLIC "x" "">

(where "x" means at least one character, and "" is at least zero).
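As a rough illustration of that notation, here is a minimal Python sketch that encodes the two TagSoup forms as regular expressions. It assumes "x" means one or more characters and "" means zero or more, with no embedded double quotes. The names TAGSOUP_FORMS and tagsoup_can_output are hypothetical and are not part of TagSoup.

```python
import re

# Assumed reading of the notation above:
#   "x" -> one or more characters, "" -> zero or more characters,
#   and neither may contain a double quote.
TAGSOUP_FORMS = [
    re.compile(r'<!DOCTYPE html SYSTEM "[^"]*">'),
    re.compile(r'<!DOCTYPE html PUBLIC "[^"]+" "[^"]*">'),
]


def tagsoup_can_output(doctype: str) -> bool:
    """Return True if the doctype fits either of the two forms listed above."""
    return any(pattern.fullmatch(doctype) for pattern in TAGSOUP_FORMS)


if __name__ == "__main__":
    for candidate in ('<!DOCTYPE html "">',
                      '<!DOCTYPE html SYSTEM "">',
                      '<!DOCTYPE html PUBLIC "x" "">'):
        print(candidate, tagsoup_can_output(candidate))
```

Under this reading, the quoted proposal <!DOCTYPE html ""> fits neither form, while both listed forms are accepted.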

XSLT's HTML output method 
(<http://www.w3.org/TR/xslt-xquery-serialization/#HTML_DOCTYPE>) can output:

   <!DOCTYPE html PUBLIC "">
   <!DOCTYPE html PUBLIC "" "">
   <!DOCTYPE html SYSTEM "">

though I assume implementations will differ to some extent.

Apache's HTML serialiser 
seems to be able to generate:

   <!DOCTYPE html>
   <!DOCTYPE html PUBLIC "">
   <!DOCTYPE html PUBLIC "" "">
   <!DOCTYPE html SYSTEM "">

Genshi's HTMLSerializer 
(<http://genshi.edgewall.org/browser/trunk/genshi/output.py#L456>) does:

   <!DOCTYPE html>
   <!DOCTYPE html PUBLIC "">
   <!DOCTYPE html PUBLIC "" "">
   <!DOCTYPE html SYSTEM "">

The Perl module XML::Handler::HTMLWriter 
can do:


This list could (and probably should?) be extended further, but I don't 
know how to easily find more HTML serialiser libraries. Given the 
current list:

<!DOCTYPE html PUBLIC ""> would only help XSLT.

<!DOCTYPE html PUBLIC "(something)"> would help XSLT and 

<!DOCTYPE html SYSTEM ""> and <!DOCTYPE html PUBLIC "(something)" ""> 
would help XSLT and TagSoup.

<!DOCTYPE html SYSTEM "(something)"> and <!DOCTYPE html PUBLIC 
"(something)" "(something)"> would help XSLT and TagSoup and 

So if the goal is to work in as many tools as possible, the shortest 
option is <!DOCTYPE html SYSTEM "x">.
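The comparison above can be checked mechanically. The following Python sketch is limited to TagSoup and XSLT, using only the template lists given for them earlier in this message; the other tools are left out, so it does not reproduce the final conclusion. The helper names (template_to_regex, tools_supporting) and the TOOLS table are hypothetical, and the reading of "x" and "" is the same assumption as in the TagSoup note.

```python
import re

# Templates copied from the lists above, for TagSoup and XSLT only.
TOOLS = {
    "TagSoup": [
        '<!DOCTYPE html SYSTEM "">',
        '<!DOCTYPE html PUBLIC "x" "">',
    ],
    "XSLT": [
        '<!DOCTYPE html PUBLIC "">',
        '<!DOCTYPE html PUBLIC "" "">',
        '<!DOCTYPE html SYSTEM "">',
    ],
}


def template_to_regex(template: str) -> re.Pattern:
    """Turn a template into a regex: "x" = 1+ chars, "" = 0+ chars (assumed)."""
    pieces = []
    # The capturing group makes re.split keep the placeholders in the result.
    for part in re.split(r'("x"|"")', template):
        if part == '"x"':
            pieces.append(r'"[^"]+"')
        elif part == '""':
            pieces.append(r'"[^"]*"')
        else:
            pieces.append(re.escape(part))
    return re.compile("".join(pieces))


def tools_supporting(doctype: str) -> list:
    """Names of the tools (from TOOLS) with a template matching the doctype."""
    return sorted(
        name
        for name, templates in TOOLS.items()
        if any(template_to_regex(t).fullmatch(doctype) for t in templates)
    )


if __name__ == "__main__":
    for candidate in ('<!DOCTYPE html PUBLIC "">',
                      '<!DOCTYPE html SYSTEM "">',
                      '<!DOCTYPE html PUBLIC "x" "">',
                      '<!DOCTYPE html SYSTEM "x">'):
        print(candidate, tools_supporting(candidate))
```

For these two tools the sketch agrees with the statements above: PUBLIC "" matches only XSLT, while SYSTEM "" and PUBLIC "(something)" "" match both.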

Philip Taylor
Received on Friday, 16 January 2009 13:57:47 UTC
