polyglot XHTML/HTML and <!DOCTYPE html about:legacy-compat>

I'm working on addressing comments on my change proposal for ISSUE-56 (versioning), and I'm puzzled about the functionality around:

<!DOCTYPE html SYSTEM "about:legacy-compat">

which my change proposal had left in as recommended; I would like some advice here. If one wanted to use a generic XML tool chain to develop and deploy polyglot documents (valid as both XHTML and text/html) that carry this DOCTYPE, wouldn't the tools hiccup if the URI isn't actually resolvable? If this was discussed before, I can't find it in the (extensive) mail archives.

I've browsed through material on XML editors and found many discussions of how the DOCTYPE is used when editing (X)HTML, e.g.:

http://www.altova.com/download/43/SpyManual43.pdf
http://www.oxygenxml.com/forum/topic2811.html
http://help.eclipse.org/galileo/index.jsp?topic=/org.eclipse.wst.xmleditor.doc.user/topics/tedtdoc.html
http://www.xmlmind.com/xmleditor/_distrib/doc/rngsupport/specifying_a_schema.html
https://collab.itc.virginia.edu/access/wiki/site/c06fa8cf-c49c-4ebc-007f-482de5382105/jedit%20xml%20editor.html
http://webdesign.about.com/od/xhtml/a/aa011507.htm
http://xmlwriter.net/xml_guide/doctype_declaration.shtml

My reading of these is that, if there is a SYSTEM identifier, many of these editors will attempt to fetch a DTD from that URI. Am I misunderstanding?
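To make the concern concrete, here is a minimal sketch in Python using the standard library's expat binding. It is not taken from any of the editors above; the function name external_dtd_requests and the sample document are my own, hypothetical choices. It only records which system identifier a DTD-loading XML parser would ask its host tool to resolve, and it assumes the underlying expat library was built with DTD support (which I believe is the usual case).

```python
import xml.parsers.expat
from urllib.parse import urlsplit

# Hypothetical polyglot sample document using the DOCTYPE in question.
POLYGLOT_DOC = b"""<!DOCTYPE html SYSTEM "about:legacy-compat">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Polyglot test</title></head>
  <body><p>Hello</p></body>
</html>"""


def external_dtd_requests(document: bytes) -> list:
    """Return the system identifiers the parser asks the host to resolve."""
    requested = []
    parser = xml.parsers.expat.ParserCreate()
    # Ask expat to process the external DTD subset, as a DTD-aware tool might.
    parser.SetParamEntityParsing(
        xml.parsers.expat.XML_PARAM_ENTITY_PARSING_ALWAYS)

    def on_external_entity(context, base, system_id, public_id):
        # A real tool would try to fetch system_id here; we only record it.
        requested.append(system_id)
        return 1  # non-zero tells expat to continue parsing

    parser.ExternalEntityRefHandler = on_external_entity
    parser.Parse(document, True)
    return requested


if __name__ == "__main__":
    for system_id in external_dtd_requests(POLYGLOT_DOC):
        # The "about" scheme has no network location to fetch from.
        print(system_id, "scheme:", urlsplit(system_id).scheme)
```

What a given editor does at the point where this sketch merely records the identifier (fetch it, consult a catalog, or ignore it) is exactly what I am unsure about.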

I know about:legacy-compat was introduced for the XSLT use case, but what about the case of an XML editor producing XHTML? Since there is a "legacy compatibility" issue, aren't these also legacy compatibility use cases?

Letting the SYSTEM identifier in valid HTML documents actually be something which _could_ be fetched seems like the simplest fix, with the XML editor use case as the rationale. There is no DTD for HTML5 and there are no plans to produce one, and there is a good explanation of why a DTD is insufficient for conformance checking of HTML5. However, is there a reason why it would be harmful to allow editors to define, use, and carry their own DTDs, and to refer to them in valid documents?

(I think the change proposal needs to reference 9.2.5.4, The "initial" insertion mode, which gives a long list of DOCTYPEs to avoid.)

Larry
--
http://larry.masinter.net

Received on Saturday, 2 January 2010 18:52:28 UTC