Re: Comments on HTML WG face to face meetings in France Oct 08

On Dec 5, 2008, at 13:28, Henry S. Thompson wrote:

> Net-net: In the example you offer, we have a data object which is not
> well-formed, in terms of the XML specification.  This is a 'fatal
> error', and MUST be signalled as such to the application.
> Furthermore, as the above makes clear, the data object you offer in
> your example is _not_ an XML document.  Neither the CSS spec. nor the
> XML stylesheet spec. specify the behaviour of conforming processors
> when given data objects to process which are not XML (or HTML)
> documents.
>
> So it appears to me that if you (or Boris) have a grievance, it is
> with the CSS and/or the XML Stylesheet specs, not the XML spec.

Are you suggesting that it's more reasonable to put the burden of  
dealing with ill-formed streams labeled as XML onto each and every  
specification that references the XML spec than on the XML spec itself?

> But I
> have to say I don't think you have _much_ of a grievance -- it's not
> unusual, or particularly unreasonable, for specs to express
> conformance in positive terms: that is, to say that conformance means
> "if you get [some kind of input], you behave in [some kind of way]."
> What you do in other circumstances is not constrained _for conformance
> to the spec. in question_.

That's indeed not unusual, but it's a serious problem that Web  
platform specs should avoid. Vendor lock-in happens when complementary  
products and services rely on the undocumented behaviors of a product  
that are difficult for substitute products to replicate.

> The XML spec. also leaves it open to application specs. to go further.
> Because the XML spec. says that after a fatal error
>
>  "the [XML] processor MAY continue processing the data to search for
>   further errors and MAY report such errors to the application. In
>   order to support correction of errors, the processor MAY make
>   unprocessed data from the document (with intermingled character
>   data and markup) available to the application",
>
> an application spec. may _require_ conformant implementations to use
> only XML processors which _do_ all the things listed above which a
> processor MAY do, and go on to specify what kind of behaviour then
> ensues.
>
> Other applications may prefer to say "Well-formed XML documents are
> handled [like this], anything else is an error and will not be
> processed."
>
> Surely both these approaches are reasonable, and therefore the XML
> spec. is correct to leave both options open to application designers.
> Given its nature as a meta-language, and its consequent place in
> implementation stacks, surely to do anything else would have been a
> mistake.

I believe this is a latter-day interpretation that has sprung up now
that Draconian failure has become unpopular, but it is supported
neither by the record of drafting the XML spec nor by the
understanding of XML processor developers, as evidenced by their actions.

The way the issue was formulated for the vote makes it clear that the
intent of the XML spec is to prohibit parse events (other than error
reports) after the first fatal error:
http://lists.w3.org/Archives/Public/w3c-sgml-wg/1997May/0079.html

This is also how the spec has been interpreted by people who were  
involved in the vote and went on to write parsers--regardless of which  
way they voted.
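
To illustrate that behavior with one widely deployed parser, here is a
minimal sketch, assuming Python's bundled expat bindings
(xml.parsers.expat). The helper name and the sample document are made
up for illustration and are not from this thread. It records the parse
events the application sees before the fatal error is reported.

```python
import xml.parsers.expat


def events_until_fatal_error(data: bytes):
    """Parse data and return (events, error).

    events holds the element events delivered to the application;
    error is the ExpatError raised at the first fatal error, or None.
    The function name is hypothetical, chosen for this sketch.
    """
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    error = None
    try:
        # True marks this chunk as the final one.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        error = exc
    return events, error


# Ill-formed sample: </b> does not match the open <a>.
doc = b"<root><a></b><c/></root>"
events, error = events_until_fatal_error(doc)
print(events)
print(error)
```

In this sketch the <c/> element that follows the mismatched end tag is
never reported to the application; only the error is.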

Even though the spec allows the XML processor to make the remaining
unparsed data available to the application, I don't think it is
reasonable to read the spec as intending that the application also
contain another, unspecified parser for parsing that data further
instead of just dumping it for inspection. Certainly, such an
implementation wouldn't make sense; if one wanted the functionality,
one would want it in the XML processor itself--not in a separate module.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Friday, 5 December 2008 14:06:45 UTC