- From: <noah_mendelsohn@us.ibm.com>
- Date: Sat, 28 Aug 2004 21:07:21 -0400
- To: Norman Walsh <Norman.Walsh@Sun.COM>
- Cc: www-tag@w3.org
I think Norm and Henry are onto a very important issue regarding Infosets:
the Infoset Recommendation should do a better job of clarifying its
consistency constraints, or the lack thereof. For example, Henry and others
seem to have inferred that an Infoset with a Doc Info Item indicating
version 1.0 should contain only content serializable as XML 1.0. Norm
suggests otherwise. Except insofar as silence indicates lack of a
constraint, I think the Infoset Rec. can reasonably be read either way.
Indeed, there are other similar and perhaps more insidious points of
confusion. I and other members of the SOAP WG were somewhat surprised to
be shown that the Infoset Rec. nowhere restricts character [children] to
be those allowed in some version of XML. Thus, NULs are allowed in an
Infoset by this interpretation, even though no published version of XML
allows NUL characters. We had to make a late clarification in some of our
SOAP work to handle this (I don't think it made the original Rec., but I
believe it is in the works as an erratum).
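
To make the NUL point concrete, here is a minimal sketch of the kind of
check a Recommendation would currently have to state for itself. It tests
code points against the Char production of XML 1.0. The function names
is_xml10_char and first_non_xml10_char are hypothetical, chosen only for
illustration; nothing like them is defined by the Infoset Rec.

```python
def is_xml10_char(ch: str) -> bool:
    """Return True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # excludes the surrogate block
        or 0xE000 <= cp <= 0xFFFD      # excludes U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def first_non_xml10_char(text: str):
    """Return the index of the first character outside Char, or None."""
    for index, ch in enumerate(text):
        if not is_xml10_char(ch):
            return index
    return None
```

Under this check, character [children] containing U+0000 would be
rejected, although the Infoset Rec. itself, on the reading described
above, does not forbid them.
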
Yet another question is whether [parent]s must be present. Some have
inferred that an [attribute] is necessarily associated with a parent
element, and that both can eventually trace their ancestry to a [document
information item], which might in turn provide a constraining XML version.
My own reading is that no such constraint is present regarding parents,
and that a Rec such as Schema 1.0 that refers only to [Element Information
Item]s would need an explicit clarification if all elements to be
validated were required to have a doc info ancestor.
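
The two readings can be illustrated with a small sketch. It assumes a
simplified, hypothetical model in which every item has one optional
upward link called parent (standing in for whatever upward property the
item really has) and only a document item carries a version. The InfoItem
class and governing_version function are inventions for this example and
not part of any Recommendation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InfoItem:
    """Hypothetical, simplified information item."""
    kind: str                            # "document", "element" or "attribute"
    parent: Optional["InfoItem"] = None  # upward link; may be absent
    version: Optional[str] = None        # meaningful only on a document item


def governing_version(item: InfoItem) -> Optional[str]:
    """Walk upward and return the version of the document ancestor, if any."""
    node: Optional[InfoItem] = item
    while node is not None:
        if node.kind == "document":
            return node.version
        node = node.parent
    return None  # no document ancestor, so nothing constrains the version


doc = InfoItem("document", version="1.0")
root = InfoItem("element", parent=doc)
attr = InfoItem("attribute", parent=root)
orphan = InfoItem("element")  # a synthetic element with no ancestors
```

Here governing_version(attr) yields "1.0", while governing_version(orphan)
yields None. If parents are optional, the second case is a legitimate
Infoset fragment with no constraining XML version at all.
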
The point of this note is not to suggest what the constraints, if any,
should be, but that the Infoset Recommendation should be clarified. If, as
Norm suggests, the intention is indeed to avoid constraints, we should
make that clearer. I wonder whether it would then be worth giving a name
to Infosets that do after all meet certain common constraints. For
example, one might list a set of rules for Infosets "serializable as XML
1.0 documents", "full document infosets" (i.e., those with a [Document
Information Item]), or some such. We have a number of Recommendations that
either create or by implication are capable of using synthetic Infosets
that are intended to represent either entire well-formed XML documents or
fragments thereof. As it stands, it's a bit tricky to write such
Recommendations, and constraints such as "[children] must consist of
characters matching the {char} production of XML 1.0" potentially require
restatement in each Recommendation. That seems unfortunate.
--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Sunday, 29 August 2004 01:08:51 UTC