- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Sat, 2 Dec 2006 14:11:14 +0200
On Dec 2, 2006, at 14:02, Elliotte Harold wrote:

> Lachlan Hunt wrote:
>
>> HTML and XML have significantly different parsing requirements and
>> they absolutely must be treated as significantly different file
>> formats. Any attempt to treat them as the same format is an
>> extremely bad idea.
>
> That's only true to the extent that some people seem to insist on
> making them needlessly different. HTML is tantalizingly close to
> well-formed XML. They both derive from SGML. They both use angle
> bracketed tags. They both define a tree structure. Indeed in many
> cases an HTML document is an XML document.

But the point is that the text/html processing model has to work with the real Web where not all documents are well-formed.

> This enables the use of the very powerful XML toolchain for
> processing HTML.

You can use the toolchain, except for the XML processor itself, as I have explained before.

> What I don't understand is why some members of this working group
> is so dead set on actively preventing HTML from being XML. The
> non-draconian error handling I understand. But why are you
> disappointed that <!DOCTYPE html> is well-formed XML? Why the active
> hostility to well-formedness?

To make a conformance checker not accidentally let MIME type mistakes silently pass in some cases.

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/
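
Illustration (not part of the original message): a minimal sketch of the distinction drawn above between draconian XML error handling and a tolerant text/html processing model. It assumes Python's standard `xml.etree.ElementTree` as the XML processor and `html.parser.HTMLParser` as a stand-in for a lenient HTML tokenizer; the latter is not the HTML parsing algorithm discussed in the working group, and the helper names and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records start tags in document order, never rejecting the input."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parses_as_xml(text):
    """Return True if an XML processor accepts the text as well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Draconian handling: any well-formedness error is fatal.
        return False


def html_start_tags(text):
    """Tokenize the text leniently and return the start tags seen."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


# Hypothetical sample documents.
WELL_FORMED = "<html><head><title>t</title></head><body><p>Hi</p></body></html>"
TAG_SOUP = "<html><head><title>t</title></head><body><p>Hi<br><p>there</body></html>"

if __name__ == "__main__":
    for name, doc in (("well-formed", WELL_FORMED), ("tag soup", TAG_SOUP)):
        print(name, parses_as_xml(doc), html_start_tags(doc))
```

The first document is accepted by both parsers, which is the "in many cases an HTML document is an XML document" situation. The second is rejected outright by the XML processor but still yields a tag sequence from the lenient tokenizer, which is why the XML processor itself cannot be the entry point for text/html content from the real Web.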
Received on Saturday, 2 December 2006 04:11:14 UTC