Re: Unserializable documents

From: Alex Milowski <alex@milowski.org>
Date: Thu, 30 Aug 2007 13:49:27 -0700
Message-ID: <28d56ece0708301349s652377c7x7f82fe343664e4b6@mail.gmail.com>
To: public-xml-processing-model-wg@w3.org

On 8/30/07, Richard Tobin <richard@inf.ed.ac.uk> wrote:
>
> I think we agree that we should allow implementations that don't check
> serializability at every step (because it would be expensive) and
> implementations that do, and generate an error (because they really do
> serialize at every step).
>
> If so, we can say that an implementation MAY generate an error in such
> cases, or that it is implementation-defined whether an error is
> generated.  The latter implies that implementations have to document
> which they do.

I think there is a difference between an infoset that can be directly
serialized, one that requires a bit of fixup for declaring namespaces,
and one that violates the well-formedness rules in some other way (e.g.
two attributes of the same name).

In some cases the XML Infoset specification captures the well-formedness
rules and in others it does not.  For example, a document can have only
one element child, but the [attributes] property of an element doesn't
say that you can't repeat attribute names.
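
To make the repeated-attribute case concrete, here is a minimal sketch
of the kind of check a processor would have to do itself, since the
data model does not do it.  The Attribute and Element classes and the
duplicate_attribute_names function are hypothetical stand-ins for
infoset items, not part of any specification or library, and
[attributes] is modelled as a plain list so that a repeat can be
represented at all.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Attribute:
    # Hypothetical stand-in for an attribute information item.
    namespace_name: Optional[str]
    local_name: str
    value: str


@dataclass
class Element:
    # Hypothetical stand-in for an element information item.
    local_name: str
    attributes: List[Attribute] = field(default_factory=list)


def duplicate_attribute_names(element: Element) -> List[Tuple[Optional[str], str]]:
    """Return each (namespace name, local name) pair that occurs more than once."""
    seen = set()
    duplicates = []
    for attr in element.attributes:
        key = (attr.namespace_name, attr.local_name)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Two unqualified "id" attributes: cannot be serialized as well-formed XML.
bad = Element("doc", [Attribute(None, "id", "a"), Attribute(None, "id", "b")])

# Same local name in different namespaces: not a duplicate.
good = Element("doc", [Attribute(None, "id", "a"),
                       Attribute("http://example.com/ns", "id", "b")])
```

Attributes are compared by expanded name (namespace name plus local
name), so the second element above is fine once prefixes are assigned.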

The issue of "fixups" is well understood and handled by the serialization
specification.  I think we should try to make well-formedness violations
an error.
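
As a rough illustration of what namespace fixup looks like in practice,
the sketch below uses Python's xml.etree.ElementTree, chosen only as a
convenient analogue and not as an implementation of the serialization
specification.  The tree is built with no namespace declarations at
all, and the serializer supplies them.  The namespace URI is a
made-up example.

```python
import xml.etree.ElementTree as ET

# Build a tree whose elements are in a namespace, without ever
# creating a namespace declaration or choosing a prefix.
root = ET.Element("{http://example.com/ns}doc")
ET.SubElement(root, "{http://example.com/ns}item")

# The serializer invents a prefix and emits the xmlns declaration.
serialized = ET.tostring(root, encoding="unicode")

# Parsing the result gives back the same expanded names.
reparsed = ET.fromstring(serialized)
```

The in-memory tree was never "wrong" here; it just lacked information
that only matters once you write characters out.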

The problem here is that there isn't anything to point at that defines
what well-formed means for an infoset.  XML 1.0/1.1 only define what it
means for a sequence of Unicode characters.

It would be very unfortunate for standard technologies like XSLT 1.0 to
be prevented from working because they can't guarantee that they'll
produce elements with the correct namespace declarations.

> That leaves the question of whether we call such a program (or rather
> program+data) legal or not.  We can either say "the program's legal,
> but implementations may reject it", or "the program's illegal, but
> implementations may accept it".  Or, I suppose, we can not say
> anything about whether the program is legal.

Apart from namespace fixup, I think well-formedness errors should be
errors in our language as well.  Implementations should be required to
detect and report them.

-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
Received on Thursday, 30 August 2007 20:49:33 GMT