[to be sent to www-html-editor@w3.org]
Notes on 2003-10-03 draft of Modularization of XHTML in XML Schema
Dear colleagues,
The undersigned are a task force assigned by the XML Schema WG to
review the 3 October 2003 Last-call draft of Modularization of XHTML
in XML Schema. We regret that the press of other work has caused
these comments to be both late and incomplete; we hope they are
useful nonetheless.
We performed this review on behalf of that WG, but in the interests of
full disclosure we note that the following observations have not been
reviewed formally by the XML Schema WG; such a review would have
delayed them even more and we transmit them to you without that
review, in order to enable you to move forward more quickly.
-Mary Holstege
C. M. Sperberg-McQueen
................................................................
* Editorial note
We found the wording of the introductory sections mildly
confusing. One cause may be the frequent use of the word "this"
without it being wholly clear what the antecedent of the term was
supposed to be. But it's not clear whether that is the only reason
our readers found the following paragraph (and ones like it)
confusing:
    This document is the XML Schema implemantaion of abstract modules
    defined in XHTML Modularization [XHTMLMOD] and defines conformance
    requirements as defined in Conformance Definition section of XHTML
    Modularization [XHTMLMOD]
(Note the typographic error in the word "implementation", which also
occurs elsewhere. Also the missing final period.)
Perhaps the paragraph would be clearer if it read

    This document is the XML Schema implementation of abstract modules
    defined in XHTML Modularization [XHTMLMOD]. The conformance
    requirements of this specification are the same as those defined
    in the Conformance Definition section of XHTML Modularization
    [XHTMLMOD].

but we are not entirely certain whether this says the same thing or not.
* Status of earlier comments
We are happy to see that the current draft makes better use of the
built-in simple types than did the previous one. We believe better use
of the pattern facet can and should be made to capture the rules
governing datatypes like multi-lengths or RFC 2045 encoding or media
types.
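By way of illustration only, a multi-length datatype might be
constrained along the following lines. This is a minimal sketch,
assuming the three lexical forms described for multi-lengths in HTML 4
(a pixel count, a percentage, or a relative length such as "3*"); the
type name MultiLength and the exact pattern are our assumptions, not
text taken from the draft.

    <xs:simpleType name="MultiLength">
      <xs:restriction base="xs:string">
        <!-- nn (pixels), nn% (percentage), or nn* / * (relative) -->
        <xs:pattern value="[0-9]+%?|[0-9]*\*"/>
      </xs:restriction>
    </xs:simpleType>

A pattern of this kind lets a schema processor reject values such as
"10px" or "%50", which a plain xs:string would accept.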
On the other hand, we note with some disappointment that, as far as we
can tell, the October draft of the document does not respond to several
of our other comments on the previous Last-call draft
(http://www.w3.org/XML/Group/2003/01/xmlschema-notes-on-xhtml-modularization.html).
In particular, the current draft still doesn't exploit substitution
groups at all, doesn't use XHTML for schema-internal documentation,
and doesn't use the 'source' attribute on the xsd:documentation
element to point to normative documentation on the definition of
element types etc.
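To make the first and last of these points concrete, here is a hedged
sketch of the kind of thing we have in mind. The head element name
Inline.class, the type name Inline.type (assumed to be defined
elsewhere in the schema), and the URI in the 'source' attribute are
placeholders of our own invention, not names taken from the draft or
from XHTML Modularization.

    <!-- abstract head element; members may substitute for it
         wherever a content model refers to it -->
    <xs:element name="Inline.class" abstract="true" type="Inline.type"/>

    <xs:element name="em" substitutionGroup="Inline.class"
                type="Inline.type">
      <xs:annotation>
        <!-- 'source' points at the normative prose (placeholder URI) -->
        <xs:documentation source="http://www.example.org/normative-spec#em"/>
      </xs:annotation>
    </xs:element>

With a declaration of this shape, a module could add a new inline
element by declaring it a member of the substitution group, without
redefining the content models that refer to the head element.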
The current schema does change finalDefault from "#all" to the default
(empty set), but it leaves blockDefault at "#all". Why? This seems
to us a design error.
We also note that it's still deplorably difficult to find one's way
around in the schema, even for those reasonably comfortable with
XML Schema notation.
* Schema errors
We attach two error reports showing the response of the Xerces J and
XSV schema processors when we used them to validate a simple document
(also attached) against your schema. In summary:
- There are several places where you include the same schema
document more than once. While processors are not required to
reject this as an error, the response of Xerces J illustrates
that they are also not required to accept it with equanimity.
We suggest eliminating the double inclusions.
- The schema document xhtml-charent-1.xsd includes several entity
sets for special characters. There is nothing illegal about this
that our reviewers can determine, but (as the response of XSV
illustrates), it does seem to hit a weak spot in some XML
libraries' handling of UTF-8 and numeric character references.
Since the entities are not in fact used by the schema documents,
but are intended to be available in document instances, there is
actually no need to include the entity sets here, and if they
are commented out the schema becomes more usable with XSV.
- Several schema documents refer to attributes in the XML namespace
without importing it: import elements need to be added to
xhtml-attribs-1.xsd, xhtml-blkphras-1.xsd, xhtml-bdo-1.xsd,
xhtml-script-1.xsd, and xhtml-style-1.xsd.
Fixed versions of these files are attached.
- The schema document xhtml11-module-redefines-1.xsd includes an
ambiguous (and therefore non-deterministic) definition of
the model group head.content. When the 'head' element contains
a 'title' element before any 'base' element, the definition given
makes it impossible to know whether an element in HeadOpts.mix
matches the first HeadOpts.mix (before the optional 'base' element)
or the second (after it). An unambiguous form of what we believe
to be intended here would be:
    <xs:group name="head.content">
      <xs:annotation><xs:documentation>
        Redefinition by Base module
      </xs:documentation></xs:annotation>
      <xs:sequence>
        <xs:group ref="HeadOpts.mix" minOccurs="0"
                  maxOccurs="unbounded"/>
        <xs:choice>
          <xs:sequence>
            <xs:element ref="title"/>
            <xs:group ref="HeadOpts.mix" minOccurs="0"
                      maxOccurs="unbounded"/>
            <xs:sequence minOccurs="0">
              <xs:element ref="base"/>
              <xs:group ref="HeadOpts.mix" minOccurs="0"
                        maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:sequence>
          <xs:sequence>
            <xs:element ref="base"/>
            <xs:group ref="HeadOpts.mix" minOccurs="0"
                      maxOccurs="unbounded"/>
            <xs:element ref="title"/>
            <xs:group ref="HeadOpts.mix" minOccurs="0"
                      maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:choice>
      </xs:sequence>
    </xs:group>
The unambiguous form makes clear, however, that the redefinition
of 'head.content' is not a restriction of the 'head.content' in
the schema document being redefined: the model in
xhtml-struct-1.xsd does not allow any 'base' elements to occur,
whereas the model given in the redefinition does.
This particular part of the schema seems to need some rethinking.
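For the missing imports of the XML namespace noted in the list above,
the element to be added to each of the schema documents named there
would look roughly like the following. This is a sketch only; the
schemaLocation shown is the W3C-hosted schema document for the XML
namespace, and pointing at a local copy instead may well be preferable.

    <xs:import namespace="http://www.w3.org/XML/1998/namespace"
               schemaLocation="http://www.w3.org/2001/xml.xsd"/>

The import has to appear near the top of the schema document, before
any element, attribute, type, or group declarations.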
At this point, we regret to say, the time available for our review of
the draft ran out, and we have not yet achieved any confidence that we
have eliminated all the formal problems in the schema given in the
draft of 3 October.
We believe you should not go forward until the schema errors, at
least, have been corrected.