Conformance vs. validity (was Re: XHTML module for addresses)

[ personal opinion, not representing the HTML WG's opinion ]

Karl Dubost <karl@w3.org> wrote:

> At 17:09 +0900 2003-05-16, Masayasu Ishikawa wrote:
> >Here's my personal experiment to include ContactXML inside XHTML2's
> >address element (needs Mozilla 1.0 and later but not 1.4b):
> >
>      http://www.w3.org/People/mimasa/test/xhtml2/hybrid#contactxml
> 
> Great masayasu!
> 
> We still have to deal about multi-namespaces documents. What is the 
> notion of Conformance or validity in this case. Something to dig.

As for document conformance, at the moment the XHTML 2.0 spec only talks
about "strictly conforming XHTML 2.0 document", which is "a document
that requires only the facilities described as mandatory in this
specification" [1].  In that sense the above document is certainly
not "strictly conforming".  That doesn't necessarily mean that it is,
or should be, the only type of conforming XHTML 2.0 document.

For better or worse, previous versions of XHTML required DTD validity
as part of "strict" conformance.  As such, strictly conforming XHTML
documents are always DTD-valid, but they cannot include anything from
foreign namespaces.  On the other hand, that's not the case in e.g. SVG,
which "allows inclusion of elements from foreign namespaces anywhere
with the SVG content" and also "allows inclusion of attributes from
foreign namespaces on any SVG element" [2].  As such, conforming SVG
document fragments are not always DTD-valid; you have to do some
pre-processing before validating the document fragment, as described
in Appendix G.2 [3]:

  * if all non-SVG namespace elements and attributes and all xmlns
    attributes which refer to non-SVG namespaces other than the XLink
    namespace are removed from the given document, and if an
    appropriate XML declaration (i.e., <?xml...?>) is included at the
    top of the document, and if an appropriate document type
    declaration (i.e., <!DOCTYPE svg ... >) which points to the SVG
    DTD is included immediately thereafter, the result is a valid XML
    document.
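
To give a feel for the pruning step quoted above, here is a minimal
sketch in Python using only the standard library.  It is an illustration
under stated assumptions, not the procedure the SVG spec mandates in
code form: the names ns_of and prune_foreign are made up for this
example, tail text following a removed element is simply dropped, and
the final DTD validation (plus adding the XML declaration and DOCTYPE)
is left to a separate validator.  ElementTree does not expose xmlns
declarations as attributes and only re-emits declarations for
namespaces still in use, so the "remove xmlns attributes" part happens
implicitly on serialization.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
KEEP = {SVG_NS, XLINK_NS}


def ns_of(name):
    """Return the namespace URI of a '{uri}local' name, or '' if it has none."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else ""


def prune_foreign(elem, keep=KEEP):
    """Remove, in place, descendants and attributes whose namespace is not in `keep`.

    Attributes without a namespace are left alone.  Simplification: text
    that follows a removed element (its tail) is discarded with it.
    """
    for name in list(elem.attrib):
        ns = ns_of(name)
        if ns and ns not in keep:
            del elem.attrib[name]
    for child in list(elem):
        if ns_of(child.tag) not in keep:
            elem.remove(child)
        else:
            prune_foreign(child, keep)
    return elem
```

The pruned tree can then be serialized with ET.tostring() and handed to
whatever DTD validator is at hand.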

As I said a long time ago, while XHTML's "strict" conformance is overly
limiting for extensibility, I didn't think we should require our user
community to go through such a complex process before validating a
document [4].  Validation should facilitate conformance, and "conforming
but not (DTD-)valid" would confuse many people.  I wanted to find a
reasonable technical solution.

In this particular case, the above document instance is NOT a valid
instance of the "strict" XHTML 2.0 schema in RELAX NG [5], but it is a
valid instance of the "loose" XHTML 2.0 schema in Modular Namespaces
(MNS) [6].  By "loose" I mean that the validity of XHTML 2.0 markup is
verified, while foreign elements and attributes (e.g. ContactXML) are
ignored by pruning them before validation.  It is also a valid
instance of the XHTML2+MathML+SVG+EGIX+ContactXML+HLink+RDF+XMLCharEnt
schema in MNS [7]; in this case the validity of ContactXML markup and
its context is also verified.

IMHO, each schema has advantages and disadvantages.  Some people may
want to make sure that their XHTML 2.0 documents won't include any
crud, e.g. some proprietary extensions specific to a particular
implementation, and for such usage the "strict" XHTML 2.0 schema
would be useful.  Other people may want to extend their XHTML 2.0
documents with additional elements and attributes from foreign
namespaces but may not want to bother with writing up a schema for
that particular combination.  In that case the "loose" XHTML 2.0
schema would be useful, and in some cases W3C XML Schema could
automate the validation of various namespaces.  Yet other folks may
want to ensure strict validity of their extended XHTML 2.0 documents,
in which case writing up a "hybrid" schema may be appropriate.
Mechanisms like MNS and ISO DSDL VCSL may be useful in this case.
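
As a rough illustration of how a tool might pick among these three
kinds of schema, the sketch below inspects which namespaces a document
uses and lists the schema types worth trying.  It only looks at
namespaces; it performs no RELAX NG or MNS validation, so "candidate"
means "not ruled out", not "valid".  The XHTML 2.0 namespace URI is my
assumption about the current draft, CONTACT_NS is a hypothetical
placeholder rather than the real ContactXML namespace, and the function
names are invented for this example.

```python
import xml.etree.ElementTree as ET

XHTML2_NS = "http://www.w3.org/2002/06/xhtml2"    # assumed draft namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"   # xml:lang, xml:space, ...
CONTACT_NS = "urn:example:contactxml"             # hypothetical placeholder


def namespaces_used(root):
    """Collect the namespace URIs of all element and attribute names."""
    used = set()
    for elem in root.iter():
        for name in [elem.tag, *elem.attrib]:
            if name.startswith("{"):
                used.add(name[1:].split("}", 1)[0])
    return used


def candidate_schemas(root, hybrid_namespaces):
    """List the schema types not ruled out by the namespaces in use.

    `hybrid_namespaces` is the set of foreign namespaces a given
    "hybrid" schema knows about.
    """
    foreign = namespaces_used(root) - {XHTML2_NS, XML_NS}
    candidates = ["loose"]            # foreign markup would be pruned anyway
    if not foreign:
        candidates.insert(0, "strict")
    if foreign <= set(hybrid_namespaces):
        candidates.append("hybrid")
    return candidates
```

For a document that embeds only ContactXML, candidate_schemas(root,
{CONTACT_NS}) returns ["loose", "hybrid"]; a document that uses an
extension unknown to the hybrid schema is left with ["loose"] only.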

The conformance definition of XHTML 2.0 should acknowledge these
diverse use cases and should make sure that appropriate conformance
criteria are defined for each case, and those criteria should be
machine-testable as much as possible.  That's why I've been
investigating various approaches.

[1] http://www.w3.org/TR/2003/WD-xhtml2-20030506/conformance.html#strict
[2] http://www.w3.org/TR/SVG11/extend.html#PrivateData
[3] http://www.w3.org/TR/SVG11/conform.html#ConformingSVGDocuments
[4] http://lists.w3.org/Archives/Public/www-html/2002Aug/0211
[5] http://www.w3.org/TR/2003/WD-xhtml2-20030506/xhtml20_relax.html#a_xhtml20_relaxng
[6] http://www.w3.org/People/mimasa/test/schemas/rng/xhtml2.mns
[7] http://www.w3.org/People/mimasa/test/schemas/rng/hybrid.mns

Regards,
-- 
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium

Received on Monday, 26 May 2003 04:13:49 UTC