- From: Michael Day <mikeday@yeslogic.com>
- Date: Sun, 04 Mar 2007 22:14:19 +1100
Hi Julian,

> What, except efficiency, prevents you from parsing the whole file with
> an XML parser? If it parses, it is XML. Otherwise it isn't.

This approach would suffer from the opposite problem: documents that the
author intended to be treated as XML would be treated as HTML if there
was a single well-formedness error anywhere in the document.

The resulting behaviour would be quite confusing for users, as an XHTML
file containing SVG and MathML content would suddenly stop working if a
tag was left unclosed. However, since the file would probably still
parse correctly as HTML, especially if the unclosed tag was something
like <img> or <br>, the user might not get any error messages relating
to the well-formedness error. Instead, they could get error messages
relating to the unknown SVG and MathML tags in their "HTML" document.

Our heuristics are an attempt to guess the intentions of users.
Specifying an XML declaration or other XML-specific content is an
indication that the document should be treated as XML. In the absence of
any XML-specific signs, a .html file really has to be treated like an
HTML document, even if it could be parsed successfully by an XML parser.
Any other policy would appear to lead to very confusing behaviour.

Best regards,

Michael

--
Print XML with Prince!
http://www.princexml.com
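
For illustration only, the policy described above could be sketched roughly as follows. This is not Prince's actual code: the function name `guess_html_file_syntax` is hypothetical, and treating an `xmlns` attribute as "other XML-specific content" is an assumption, since the message names only the XML declaration explicitly. The point of the sketch is that the decision depends on explicit signs of intent, never on whether the file happens to be well-formed.

```python
import re

# The XML declaration is the one sign named explicitly in the message.
# An optional byte order mark may precede it.
XML_DECLARATION = re.compile(r"^\ufeff?<\?xml\s")

# ASSUMPTION: an xmlns attribute on some start tag is used here as one
# example of "other XML-specific content". The real heuristics may differ.
XMLNS_ATTRIBUTE = re.compile(r"<[A-Za-z][^<>]*\sxmlns(?::[\w.-]+)?\s*=")


def guess_html_file_syntax(head: str) -> str:
    """Guess how a .html file should be parsed, given the start of its text.

    Returns "xml" only when the author gave an explicit XML-specific sign.
    Otherwise returns "html", even if the text would also parse as XML.
    """
    if XML_DECLARATION.search(head):
        return "xml"
    if XMLNS_ATTRIBUTE.search(head):
        return "xml"
    return "html"
```

Under this sketch, a well-formed file with no XML-specific signs is still treated as HTML, and a file that starts with an XML declaration is treated as XML even if a later unclosed tag makes it ill-formed, so the user gets a well-formedness error instead of complaints about unknown SVG or MathML tags.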
Received on Sunday, 4 March 2007 03:14:19 UTC