- From: Elliotte Harold <elharo@metalab.unc.edu>
- Date: Tue, 18 Nov 2008 08:25:40 -0800
- To: Boris Zbarsky <bzbarsky@MIT.EDU>
- Cc: public-html <public-html@w3.org>, www-tag@w3.org
Boris Zbarsky wrote:

> XML does define what callbacks happen up to the point where the error is
> detected. It does not define what should happen with the data after
> that point. We seem to agree on this, right?

Not the first statement, no. The XML specification does not define any callbacks. It in fact says very little about what a conforming XML parser must report to a client application. There are a few pieces of information it must report, but these are almost accidents in the spec. For instance, "An XML processor MUST always pass all characters in a document that are not markup through to the application." However there's no requirement on such a processor to report element names or boundaries.

Now that I think about it in this context, I suspect the XML spec could have been much better written in two parts: part 1 to define the pure syntax of what constitutes a legal (well-formed) XML document, and part 2 to define processor behavior on such a document. As is, the processor behavior is surprisingly underdefined compared to what most people expect.

Years later the infoset was created to attempt to address this deficiency, but unfortunately it was designed as a data model rather than a set of minimum requirements on XML processors. Furthermore it was incompatible with existing data models such as XPath and DOM, and did not properly align with XML 1.0. :-(

However one thing the XML 1.0 spec did get right is that it almost completely separated syntax from semantics (with the exception of just a couple of attributes). It is a far more modular, extensible design and one HTML 5 would do well to adopt.

--
Elliotte Rusty Harold elharo@metalab.unc.edu
Refactoring HTML Just Published!
http://www.amazon.com/exec/obidos/ISBN=0321503635/ref=nosim/cafeaulaitA
Received on Tuesday, 18 November 2008 16:26:18 UTC