- From: Philip Taylor <pjt47@cam.ac.uk>
- Date: Fri, 16 Jan 2009 13:57:12 +0000
- To: Sam Ruby <rubys@us.ibm.com>
- CC: HTML WG <public-html@w3.org>
Sam Ruby wrote:
> [...]
>
> From a technical perspective, here is my understanding of the problem. The
> following string is the smallest and simplest string that will trigger
> standards compatibility mode in all browsers (there was some confusion over
> this, but that was resolved[5]) and can be produced by all known tools.
>
>   <!DOCTYPE html "">

Which tools are "all known tools"? There are tools which have an HTML4 or XHTML1.0 doctype hardcoded, and we can't do anything about them, so I assume they must be excluded.

Looking at the source code for TagSoup (<http://home.ccil.org/~cowan/XML/tagsoup/>), I believe its "XMLWriter" (actually an XML-incompatible HTML writer) can only output:

  <!DOCTYPE html SYSTEM "">
or
  <!DOCTYPE html PUBLIC "x" "">

(where "x" means at least one character, and "" is at least zero).

XSLT's HTML output method (<http://www.w3.org/TR/xslt#section-HTML-Output-Method>, <http://www.w3.org/TR/xslt-xquery-serialization/#HTML_DOCTYPE>) can output:

  <!DOCTYPE html PUBLIC "">
  <!DOCTYPE html PUBLIC "" "">
  <!DOCTYPE html SYSTEM "">

though I assume implementations will differ to some extent.

Apache's HTML serialiser (<http://xml.apache.org/xalan-j/apidocs/org/apache/xml/serializer/ToHTMLStream.html>) seems to be able to generate:

  <!DOCTYPE html>
  <!DOCTYPE html PUBLIC "">
  <!DOCTYPE html PUBLIC "" "">
  <!DOCTYPE html SYSTEM "">

Genshi's HTMLSerializer (<http://genshi.edgewall.org/browser/trunk/genshi/output.py#L456>) does:

  <!DOCTYPE html>
  <!DOCTYPE html PUBLIC "">
  <!DOCTYPE html PUBLIC "" "">
  <!DOCTYPE html SYSTEM "">

The Perl module XML::Handler::HTMLWriter (<http://cpansearch.perl.org/src/MSERGEANT/XML-Handler-HTMLWriter-2.01/HTMLWriter.pm>) can do:

  <!DOCTYPE HTML PUBLIC "x">
  <!DOCTYPE HTML PUBLIC "x" "x">
  <!DOCTYPE HTML SYSTEM "x">

This list could (and probably should?) be extended further, but I don't know how to easily find more HTML serialiser libraries.

Given the current list:

  <!DOCTYPE html PUBLIC "">
would only help XSLT.

  <!DOCTYPE html PUBLIC "(something)">
would help XSLT and XML::Handler::HTMLWriter.

  <!DOCTYPE html SYSTEM "">
and
  <!DOCTYPE html PUBLIC "(something)" "">
would help XSLT and TagSoup.

  <!DOCTYPE html SYSTEM "(something)">
and
  <!DOCTYPE html PUBLIC "(something)" "(something)">
would help XSLT and TagSoup and XML::Handler::HTMLWriter.

So if the goal is to work in as many tools as possible, the shortest option is <!DOCTYPE html SYSTEM "x">.

-- 
Philip Taylor
pjt47@cam.ac.uk
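The tally in the message above can be checked mechanically. What follows is only a minimal Python sketch, assuming the per-tool output forms are exactly as listed in the message (they were read from source code and specs, not tested). The names TOOLS, fits, supporters and render are hypothetical, and the difference between "html" and "HTML" in the doctype name is ignored. Each form is a pair of minimum lengths for the public and system identifiers: None means the identifier is absent, 0 corresponds to "" (zero or more characters), and 1 corresponds to "x" (at least one character). Apache's serialiser and Genshi are left out because they can already emit the bare <!DOCTYPE html>.

```python
# Output forms per tool, as (public_min_len, system_min_len).
# None = identifier absent, 0 = any string (""), 1 = non-empty ("x").
# Assumption: these mirror the lists in the message and are not verified.
TOOLS = {
    "TagSoup": [(None, 0), (1, 0)],
    "XSLT": [(0, None), (0, 0), (None, 0)],
    "XML::Handler::HTMLWriter": [(1, None), (1, 1), (None, 1)],
}


def fits(min_len, value):
    """True if an identifier value is allowed by one slot of a form."""
    if min_len is None:
        return value is None
    return value is not None and len(value) >= min_len


def supporters(public, system):
    """Sorted names of the tools that can emit a doctype with these identifiers."""
    return sorted(
        name
        for name, forms in TOOLS.items()
        if any(fits(p, public) and fits(s, system) for p, s in forms)
    )


def render(public, system):
    """Serialise the candidate doctype for display."""
    if public is not None and system is not None:
        return f'<!DOCTYPE html PUBLIC "{public}" "{system}">'
    if public is not None:
        return f'<!DOCTYPE html PUBLIC "{public}">'
    if system is not None:
        return f'<!DOCTYPE html SYSTEM "{system}">'
    return "<!DOCTYPE html>"


if __name__ == "__main__":
    candidates = [("", None), ("x", None), (None, ""),
                  ("x", ""), (None, "x"), ("x", "x")]
    for public, system in candidates:
        print(render(public, system), "->", ", ".join(supporters(public, system)))
```

Run as a script, it prints one line per candidate doctype with the tools able to produce it. Only the last two candidates are supported by all three tools, and of those the SYSTEM form is the shorter string.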
Received on Friday, 16 January 2009 13:57:47 UTC