WEK: Comments on XHTML 1.1 5 Jan 2000 Draft (and related specs) from W. Eliot Kimber on 2000-01-25 (www-html@w3.org from January 2000)

From: W. Eliot Kimber <eliot@isogen.com>
Date: Mon, 24 Jan 2000 21:48:19 -0500 (EST)
To: www-html@w3.org
Message-id: <388D0EE6.CF12A750@isogen.com>
[My apologies if this gets posted twice: I sent it on Sat 22 Jan but
did not find it in the archives.]

Here are my comments on the 5 Jan draft:

3.1.1 Strictly Conforming Documents

Per the recent discussion on DOCTYPE declarations, constraint number 4
is unnecessary and ill advised.

It is unnecessary because constraint 3 (required namespace declaration)
is sufficient to unambiguously assert that the document is an XHTML
document.

As written in the 1.1 spec, the wording does not appear to require an
external identifier at all. If it is intended to require an external
identifier, it should be a normative URI, with all mention of a PUBLIC
identifier removed. PUBLIC identifiers are fundamentally bogus and
redundant with system identifiers (cf. Tim B-L's paper arguing that
there is no fundamental difference between URLs and URNs).  If an
external identifier is not required, then the DOCTYPE declaration should
not be required either (it is implicitly required for any document that
wants to use ID attributes as there is no other way to declare them).
[See notes below on modularization spec.]

In any case, I stand by my argument that requiring the use of *external*
declaration subsets is unnecessary and inappropriate.
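
As a rough illustration of the point about ID attributes, here is a
minimal Python sketch (standard library only). It parses a document that
carries the XHTML namespace declaration and declares an ID attribute in
an internal declaration subset, with no external identifier at all. The
sample document and the helper name inspect_doctype are made up for this
example; they are not taken from the XHTML drafts.

```python
import xml.parsers.expat

# Hypothetical sample: namespace declaration plus an internal subset only.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE html [
  <!ATTLIST p id ID #IMPLIED>
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p id="intro">Hello</p></body>
</html>"""


def inspect_doctype(text):
    """Report external identifiers, declared ID attributes, and root namespace."""
    info = {"public_id": None, "system_id": None,
            "id_attrs": [], "root_ns": None}
    # Passing a namespace separator turns on namespace processing in expat.
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["public_id"] = public_id
        info["system_id"] = system_id

    def on_attlist(elname, attname, attr_type, default, required):
        if attr_type == "ID":
            info["id_attrs"].append((elname, attname))

    def on_start(name, attrs):
        # Only the first (root) element is of interest here.
        if info["root_ns"] is None:
            uri, _, _local = name.rpartition(" ")
            info["root_ns"] = uri

    parser.StartDoctypeDeclHandler = on_doctype
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(text, True)
    return info


result = inspect_doctype(SAMPLE)
```

The parser sees the ID declaration and the namespace without ever being
given a PUBLIC or SYSTEM identifier to resolve.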

3. The XHTML 1.1 Document Type

This statement: 

"An implementation of this document type as an XML DTD is defined in
Appendix C [p.19] ."

Is exactly correct: the DTD declarations provided in appendix C
represent one of an infinite number of possible implementations of the
abstract concept that is the XHTML document type. Given that there will
certainly be other implementations (e.g., XML Schema), XHTML cannot
require the use of the Appendix C declarations.

C.1 SGML Open Catalog Entry

Given that the use of PUBLIC identifiers is bogus per the comment above,
the inclusion of the catalog is unnecessary and inappropriate, as it
doesn't mean anything except in the context of a particular system, so
it cannot be normative under any circumstances.

At best, it should be an informative annex or an informative note.

Also, since it is defining SGML-only stuff, it seems inappropriate for
an *XML* specification. It's fine as an informative annex for users of
SGML-only systems who want to be able to process XHTML documents.

-------------------------
XHTML Modularization Spec

3.3 XHTML Family User Agent Conformance

Constraint 4 needs to clarify whether the term "render" as used here
means "enable processing by the controlling style sheet" or "make
visible to the reader". A controlling style sheet may choose to hide an
unrecognized element that the XHTML processor is required to pass
through to the rendition component of the browser.  This constraint
could be read as "must always make visible".
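
The two readings can be told apart with a small sketch. This is a toy
model in Python, not an actual user agent: the names KNOWN, pass_through
and is_visible are hypothetical, and the style sheet is reduced to a
dictionary mapping element names to a display value.

```python
# Toy model: element types the (hypothetical) XHTML processor recognizes.
KNOWN = {"html", "body", "p"}


def pass_through(elements):
    """Reading 1: hand every element, recognized or not, to the renderer."""
    return list(elements)


def is_visible(name, stylesheet):
    """The controlling style sheet, not the processor, decides visibility."""
    return stylesheet.get(name, "inline") != "none"


elements = ["p", "x-widget"]         # "x-widget" is unrecognized
stylesheet = {"x-widget": "none"}    # the author chooses to hide it

delivered = pass_through(elements)
unrecognized = [e for e in delivered if e not in KNOWN]
shown = [e for e in delivered if is_visible(e, stylesheet)]
```

Under the first reading this toy agent conforms, because the
unrecognized element reached the rendition component. Under the "must
always make visible" reading it would not, because the style sheet hid
that element.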

3.4 Naming Rules

This is only compounding the bogosity of FPIs by imparting semantics to
the individual components of identifiers. I understand the intent of the
XHTML designers and it is admirable--they are trying to solve a very
difficult problem, but unfortunately, the tools they are trying to use are
simply not up to the task.

The requirement this section is trying to satisfy is the ability to
state unambiguously what set of rules a specialized document type is
derived from (and, presumably, conforms to). That is what architectures
do.

An architecture (in the ISO/IEC 10744 sense) does just what XHTML is
doing: it defines a core set of abstract rules, including element types,
and policies for how those rules can be specified. The architecture is
documented in just the way that XHTML is documented. It provides
normative names for the architecture and direct ways of identifying
optional facilities of the architecture (i.e., the different XHTML
modules).

The architecture use declaration then binds a document to the
architecture *as an abstract set of rules*, to the XML syntax
constraints for the architecture (the "architectural DTD", i.e., the
XHTML DTDs published in the various XHTML specs), and specifies the
names of optional features used by the document.

This is a simple syntactic mechanism that makes it 100% clear what the
intent of a particular document is without having to do the sorts of
things the current clause 3.4 tries to do with FPIs (things that are
completely unjustified and unsupported by ISO 9070).

Architectures can also be built into layers so that one architecture can
specialize another. That is, I might define my own profile of XHTML by
defining an architecture that itself then defines a combination of
specific XHTML modules.

NOTE: I am not necessarily suggesting that XHTML use the architecture
use declaration syntax defined in ISO/IEC 10744 (as amended) (although
it is a perfectly fine syntax, use of PIs notwithstanding). I am
suggesting that XHTML could define its own *element-based* syntax for
such declarations (and thus set a precedent for the XML community
generally) or the W3C could define a general mechanism as a separate
document.

Such a declaration requires three parts, so it should not be difficult
to get consensus on a syntax:

1. A reference to the overall architecture as an abstract set of rules
and policies. This is normally done by reference to the governing
normative documentation, e.g. XHTML 1.1 in this case. Any standard
should always have a normative, persistent identifier, either because a
URN has been assigned by the governing body or because the spec is
maintained in electronic form in a responsible way (i.e., the W3C's
maintenance of its specs as well-managed URLs).

2. A reference to any necessary machine-readable rules needed for
validation (i.e.,  the XHTML DTDs)

3. A specification of the optional features to turn on or off (if any).

This could be defined as a natural extension of the existing namespace
use declaration syntax, which already provides item 1 (or at least can
be viewed as providing item 1--it doesn't actually say what the semantic
purpose of the thing the URI points to is).
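
To show how little syntax the three parts need, here is a minimal sketch
in Python (standard library only). Everything in it is hypothetical: the
namespace URI http://example.org/ns/arch-use, the element names use and
feature, the attribute names spec and rules, the module names, and the
example.org addresses are invented for illustration and are not defined
by XHTML, the W3C, or ISO/IEC 10744.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

ARCH_NS = "http://example.org/ns/arch-use"   # hypothetical namespace

# Hypothetical element-based declaration carrying the three parts.
DECLARATION = """
<arch:use xmlns:arch="http://example.org/ns/arch-use"
          spec="http://example.org/specs/xhtml11"
          rules="http://example.org/dtds/xhtml11.dtd">
  <arch:feature name="forms" enabled="true"/>
  <arch:feature name="tables" enabled="false"/>
</arch:use>
"""


@dataclass
class ArchitectureUse:
    spec: str      # part 1: the abstract rules and policies
    rules: str     # part 2: machine-readable validation rules
    features: dict = field(default_factory=dict)   # part 3: optional features


def read_declaration(text):
    """Parse the hypothetical declaration into its three parts."""
    root = ET.fromstring(text)
    if root.tag != f"{{{ARCH_NS}}}use":
        raise ValueError("not an architecture use declaration")
    features = {
        f.get("name"): f.get("enabled") == "true"
        for f in root.findall(f"{{{ARCH_NS}}}feature")
    }
    return ArchitectureUse(root.get("spec"), root.get("rules"), features)


use = read_declaration(DECLARATION)
```

A processor that reads such a declaration knows which specification the
document claims to follow, where the validation rules live, and which
optional modules are switched on, without decoding anything from the
components of an FPI.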

I feel very strongly that without a mechanism of this type, XHTML will
provide a completely unmanageable mess of options that no-one will ever
be able to use or validate with confidence.  This is not the fault of
the XHTML designers, but an inherent weakness in the existing XML
mechanisms for document type specification, modularization, and
specialization.

I would feel much better if, for example, XHTML were normatively
specified using something like UML, where modularization is well defined
and up to the task and where the mapping from a UML model to a specific
syntax can be clearly documented. [Please see my draft paper on using
UML to define document types,
<http://www.drmacro.com/hyprlink/uml-dtds.pdf>.] Without a foundation of
that kind, you will be building a fragile house of cards.

The fact that there would be no tools that implement this syntax
initially is irrelevant (there are no existing tools that would
implement the currently-stated rules for understanding XHTML-specific
FPIs either): the issue is assertion of type membership that can be
verified and recognized by processors. If the syntax is provided, the
support for use and validation will follow quickly.

Clause 4. Abstract Models

This is a nice definition, but it is missing prose documentation of the
*semantics* of the element types. I think that the document should not
rely on an assumption of prior knowledge. This document is defining a
new document type (even if it is based on existing document types) and
should provide the normative documentation of the element types so that
there is no ambiguity about meaning (to the degree that can be done in
prose, of course).

Cheers,

Eliot
Received on Tuesday, 25 January 2000 03:59:48 UTC