[Bug 3573] Validation and invalid schemas from bugzilla@wiggum.w3.org on 2006-08-01 (www-xml-schema-comments@w3.org from July to September 2006)

From: <bugzilla@wiggum.w3.org>
Date: Tue, 01 Aug 2006 23:50:14 +0000
To: www-xml-schema-comments@w3.org
CC:
Message-Id: <E1G8406-0000I8-FJ@wiggum.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=3573

           Summary: Validation and invalid schemas
           Product: XML Schema
           Version: 1.0/1.1 both
          Platform: Macintosh
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Structures: XSD Part 1
        AssignedTo: cmsmcq@w3.org
        ReportedBy: cmsmcq@w3.org
         QAContact: www-xml-schema-comments@w3.org


Section 5.1 of Structures concludes:

    With respect to the processes of the checking of schema structure
    and the construction of schemas corresponding to schema documents,
    this specification imposes no restrictions on processors after an
    error is detected. However assessment with respect to
    schema-like entities which do not satisfy all the above conditions
    is incoherent. Accordingly, conformant processors must not attempt
    to undertake assessment using such non-schemas.

Guided by the New Oxford American Dictionary that came with my
machine, I take 'incoherent' here to mean 'internally inconsistent;
illogical'.

I propose to delete the last two sentences, because I believe they are
in conflict with the rest of the spec, factually inaccurate, and
mistaken in their intent.

At some points (e.g. section 3.8.4, Validation Rule Element Sequence
Valid) the spec makes a point of observing that validation is defined
in such a way as to be possible even with content models which do not
obey the Unique Particle Attribution constraint.  But if validation is
well defined even for content models which do not obey UPA, then there
are at least some invalid schemas with which validation can be
performed in a way which appears to me not internally inconsistent or
illogical, and the spec is at pains to point out the fact.  (To avoid
confusion: by 'invalid schemas' I mean the same things mentioned by
the current text as 'schema-like entities which do not satisfy all the
above conditions'.)
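
As an illustration of this point (a minimal sketch, not taken from the
spec or from any processor), consider a hypothetical content model
(a, b) | (a, c).  It violates UPA, because a leading 'a' cannot be
attributed to one particle without lookahead.  The sketch assumes the
model can be written as a regular expression over child element names;
the names AMBIGUOUS_MODEL and children_match are invented for the example.

```python
import re

# Hypothetical content model (a, b) | (a, c), written as a regular
# expression over space-separated child element names. It violates
# Unique Particle Attribution: the first 'a' could belong to either branch.
AMBIGUOUS_MODEL = re.compile(r"(?:a b|a c)")


def children_match(child_names):
    """Return True if the sequence of child element names is accepted."""
    # The regex engine backtracks across branches, so acceptance is
    # decided without needing to know in advance which 'a' particle applies.
    return AMBIGUOUS_MODEL.fullmatch(" ".join(child_names)) is not None


if __name__ == "__main__":
    for names in (["a", "b"], ["a", "c"], ["a"], ["a", "b", "c"]):
        print(names, children_match(names))
```

Whether a given sequence is accepted is a definite yes or no here, even
though the model is ambiguous about which particle matches the first
'a'.  This is only an analogy for the Element Sequence Valid rule, not
an implementation of it.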

The text quoted above seems to make the claim that if a schema
document has an invalid HTML element inside an xsd:documentation
element (let us say it's an element which is made legal by XHTML 2,
although we are now validating with an XHTML 1 schema), then
attempting to validate documents using components constructed from
that schema will lead to an inconsistency or is illogical.  I don't
see any inconsistency, and it doesn't seem illogical to me to want to
validate with such components, especially in view of the well known
rules of HTML and XHTML regarding behavior of software in the presence
of undeclared elements.

Requiring processors to fail when a schema document is invalid does
not now seem to me the right thing to do here.  In developing 1.0, the
Working Group was indeed unwilling to require them to soldier on
ignoring what they didn't understand, but I don't think it is wise to
*forbid* them to do so. (For that matter, I no longer think it was
wise not to *require* them to do so.)  I find I cannot remember the WG
actively deciding to forbid such behavior, so I can't remember any
arguments brought forward for such a rule.

I believe we should delete the two sentences without replacement, in
1.1 as a change to the status quo and in 1.0 as a bug fix.

If WG members feel that some replacement is required, I would propose
that the passage be amended to read:

    With respect to the processes of the checking of schema structure
    and the construction of schemas corresponding to schema documents,
    this specification imposes no restrictions on processors after an
    error is detected. However, any operation performed using
    schema-like entities which do not satisfy all the above conditions
    is outside the scope of this specification.

Optionally, add before the final full stop:

    and is not schema-validity assessment as that term is defined
    here

but I'm inclined to include that final bit.
Received on Tuesday, 1 August 2006 23:50:53 UTC