Re: XML namespaces on the Web from James Graham on 2009-11-17 (public-xml-core-wg@w3.org from November 2009)

From: James Graham <jgraham@opera.com>
Date: Tue, 17 Nov 2009 17:13:46 +0000
To: Krzysztof Maczyński <1981km@gmail.com>
CC: Aryeh Gregor <Simetrical+w3c@gmail.com>, public-html@w3.org, public-xml-core-wg@w3.org
Message-ID: <4B02D9EB.4040707@opera.com>

Krzysztof Maczyński wrote:
> Dear WGs, (CCing public-xml-core-wg@w3.org.)
>
>>> Moreover, what is appropriate for the web -- non-draconian
>>> error handling and error recovery -- is not necessarily appropriate
>>> for other domains -- if you use XML for business data interchange,
>>> draconian error handling makes much more sense.
>> Parsers could be permitted to use draconian error handling at user
>> request.  Then groups that don't want it don't have to have it,
>> while groups that want it can have it.  The current situation gives
>> us no standardized XML-like data format without draconian
>> error-handling. This is a problem unless HTML is really the only
>> use-case for non-draconian error-handling, which I think is very
>> unlikely.  For instance, I've been told some widely-used RSS
>> readers have seen fit to implement error recovery -- which must
>> currently be completely non-interoperable because of the lack of
>> standardization here.
> Both on the Web and elsewhere there are circumstances warranting
> strict or lax parsing. This was already a highly debated point when
> XML was designed. Already then we knew that for dissenting opinions
> usually a good solution is to include both ways things can work and a
> switch. Having the experience of over 10 years, it's clear that the
> needs of both sides are valid and not going away [1]. Therefore I'd
> like to propose XML 1.2 with a pseudo-attribute parse accepting
> values strict and lax added to the XML and text declaration. strict
> would do what parsers currently do (unifying XML 1.0 Fifth Edition
> with XML 1.1 Second Edition in some sensible way) and lax would use
> an algorithm based on Anne van Kesteren's draft, but returning an
> Infoset.

One of the purposes of error recovery (for web content) is to not
punish end users for a class of easy-to-miss (and often harmless) bugs
in the system producing the content. Making that error recovery depend
on the author specifically opting in misses the point somewhat.

It makes much more sense to follow the HTML5 model where the client
gets to determine the parsing mode. Systems where a syntax-level error
should be fatal would switch the parser to the strict mode whereas
systems such as web browsers would switch the parser to the graceful
recovery mode.
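
To make the distinction concrete, here is a rough sketch of a parser
whose mode is picked by the calling client rather than by anything in
the document. The function name parse and the mode names "strict" and
"recover" are made up for illustration, and the recovery shown (keep
whatever tree was built before the first well-formedness error) is an
assumption chosen for brevity; a real recovery algorithm would be far
more involved.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


def parse(text, mode="strict"):
    """Parse XML text; the caller, not the document, chooses the mode.

    Hypothetical API: "strict" makes any well-formedness error fatal,
    "recover" returns the tree built up to the first error.
    """
    if mode not in ("strict", "recover"):
        raise ValueError(f"unknown mode: {mode!r}")

    stack = []  # elements that are currently open
    roots = []  # holds the root element once it has been seen

    def start(name, attrs):
        elem = ET.Element(name, attrs)
        if stack:
            stack[-1].append(elem)
        else:
            roots.append(elem)
        stack.append(elem)

    def end(name):
        stack.pop()

    def chars(data):
        if not stack:
            return
        elem = stack[-1]
        if len(elem):
            # Text after a child element belongs to that child's tail.
            last = elem[-1]
            last.tail = (last.tail or "") + data
        else:
            elem.text = (elem.text or "") + data

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars

    try:
        parser.Parse(text, True)
    except expat.ExpatError:
        # Strict clients treat the error as fatal; so does a recovering
        # client if nothing at all was parsed before the error.
        if mode == "strict" or not roots:
            raise
    return roots[0]
```

A feed validator or a business-data pipeline would call
parse(doc, mode="strict") and let the exception propagate, while a
browser-like client would call parse(doc, mode="recover") on the very
same bytes and render what it got.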

If web authors are interested in QAing their content with a strict
parser, they could do that either with a tool such as a validator or
with a UA-specific setting that changes the parser mode.

Received on Wednesday, 18 November 2009 12:43:46 UTC