Re: Comments on HTML WG face to face meetings in France Oct 08

Boris Zbarsky wrote:

> XML does define what callbacks happen up to the point where the error is 
> detected.  It does not define what should happen with the data after 
> that point.  We seem to agree on this, right?

Not the first statement, no. The XML specification does not define any 
callbacks. It in fact says very little about what a conforming XML 
parser must report to a client application. There are a few pieces of 
information it must report, but these are almost accidents in the spec. 
For instance, "An XML processor MUST always pass all characters in a 
document that are not markup through to the application." However 
there's no requirement on such a processor to report element names or 
boundaries.
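
To make the distinction concrete, here is a minimal sketch using Python's 
xml.sax module as a stand-in for "an XML parser with callbacks". The 
callbacks belong to the SAX API, not to XML 1.0, and which events arrive 
before a well-formedness error is a property of this particular parser 
(expat underneath), not something the XML spec promises. The Recorder 
class and parse_events function are names made up for this illustration.

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Collects the SAX events the parser chooses to report."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # A parser may deliver text in several chunks; skip whitespace-only ones.
        if content.strip():
            self.events.append(("text", content))


def parse_events(data: bytes):
    """Return (events seen, fatal error or None) for the given document."""
    handler = Recorder()
    error = None
    try:
        xml.sax.parseString(data, handler)
    except xml.sax.SAXParseException as exc:
        # The default error handler raises on a fatal (well-formedness) error.
        error = exc
    return handler.events, error


# <c> is never closed, so </a> is a well-formedness error.
events, error = parse_events(b"<a><b>hi</b><c></a>")
print(events)
print("fatal error reported:", error is not None)
```

With this parser the element events for a, b and c show up before the 
error is raised, but that ordering comes from SAX and expat, not from 
anything the XML 1.0 text requires.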

Now that I think about it in this context, I suspect the XML spec could 
have been much better written in two parts: part 1 to define the pure 
syntax of what constitutes a legal (well-formed) XML document, and part 
2 to define processor behavior on such a document. As is, the processor 
behavior is surprisingly underdefined compared to what most people 
expect. Years later the Infoset was created to attempt to address this 
deficiency, but unfortunately it was designed as a data model rather 
than a set of minimum requirements on XML processors. Furthermore, it was 
incompatible with existing data models such as XPath and DOM, and did 
not properly align with XML 1.0. :-(

However, one thing the XML 1.0 spec did get right is that it almost 
completely separated syntax from semantics (with the exception of just a 
couple of attributes). That is a far more modular, extensible design, and 
one HTML 5 would do well to adopt.

-- 
Elliotte Rusty Harold  elharo@metalab.unc.edu
Refactoring HTML Just Published!
http://www.amazon.com/exec/obidos/ISBN=0321503635/ref=nosim/cafeaulaitA

Received on Tuesday, 18 November 2008 16:26:18 UTC