- From: James Graham <jgraham@opera.com>
- Date: Tue, 17 Nov 2009 17:13:46 +0000
- To: Krzysztof Maczyński <1981km@gmail.com>
- CC: Aryeh Gregor <Simetrical+w3c@gmail.com>, public-html@w3.org, public-xml-core-wg@w3.org
Krzysztof Maczyński wrote:
> Dear WGs, (CCing public-xml-core-wg@w3.org.)
>
>>> Moreover something what is appropriate for web -- non-draconian
>>> error handling and error recovery -- is not necessary appropriate
>>> for other domains -- if you use XML for business data interchange
>>> draconian error handling makes much more sense.
>> Parsers could be permitted to use draconian error handling at user
>> request. Then groups that don't want it don't have to have it,
>> while groups that want it can have it. The current situation gives
>> us no standardized XML-like data format without draconian
>> error-handling. This is a problem unless HTML is really the only
>> use-case for non-draconian error-handling, which I think is very
>> unlikely. For instance, I've been told some widely-used RSS
>> readers have seen fit to implement error recovery -- which must
>> currently be completely non-interoperable because of the lack of
>> standardization here.
> Both on the Web and elsewhere there are circumstances warranting
> strict or lax parsing. This was already a highly debated point when
> XML was designed. Already then we knew that for dissenting opinions
> usually a good solution is to include both ways things can work and a
> switch. Having the experience of over 10 years, it's clear that the
> needs of both sides are valid and not going away [1]. Therefore I'd
> like to propose XML 1.2 with a pseudo-attribute parse accepting
> values strict and lax added to the XML and text declaration. strict
> would do what parsers currently do (unifying XML 1.0 Fifth Edition
> with XML 1.1 Second Edition in some sensible way) and lax would use
> an algorithm based on Anne van Kesteren's draft, but returning an
> Infoset.

One of the purposes of error recovery (for web content) is to not punish end users for a class of easy-to-miss (and often harmless) bugs in the system producing the content.
Making that error recovery depend on the author specifically opting in misses the point somewhat. It makes much more sense to follow the HTML5 model, where the client gets to determine the parsing mode. Systems where a syntax-level error should be fatal would switch the parser to the strict mode, whereas systems such as web browsers would switch the parser to the graceful recovery mode. If web authors are interested in QAing their content using a strict parser, they could do that either with a tool such as a validator or with a UA-specific setting that changes the parser mode.
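
To make the client-side switch concrete, here is a rough Python sketch. It is only an illustration: the names ParseMode, parse and _RecoveringBuilder are made up for this example, and the recovery rules (auto-close open elements, drop stray end tags) are a toy stand-in, not the algorithm from Anne van Kesteren's draft or from HTML5. It also borrows the standard library's HTML tokenizer, so it lowercases tag names, which a real XML recovery parser would not do.

```python
import enum
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class ParseMode(enum.Enum):
    STRICT = "strict"  # any well-formedness error is fatal
    LAX = "lax"        # recover from errors where possible


class _RecoveringBuilder(HTMLParser):
    """Toy recovery: auto-close open elements, ignore stray end tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        # Attributes without a value are given an empty string.
        self.builder.start(tag, {k: v or "" for k, v in attrs})
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag: drop it
        # Close everything up to and including the matching element.
        while self.open_tags:
            top = self.open_tags.pop()
            self.builder.end(top)
            if top == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # ignore text outside the root element
            self.builder.data(data)

    def finish(self):
        self.close()
        # Close whatever is still open at end of input.
        while self.open_tags:
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def parse(text, mode):
    """Parse text in the mode chosen by the caller, not by the document."""
    if mode is ParseMode.STRICT:
        return ET.fromstring(text)  # raises ET.ParseError on malformed input
    builder = _RecoveringBuilder()
    builder.feed(text)
    return builder.finish()
```

A data-interchange system would call parse(doc, ParseMode.STRICT) and treat ET.ParseError as fatal, while a browser-like client would call parse(doc, ParseMode.LAX) on the same bytes. The point is that nothing in the document itself selects the mode.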
Received on Wednesday, 18 November 2009 12:43:46 UTC