Re: Modules, Modularization, and the XHTML Family

One way to approach this issue is to ask what the fundamental
functional requirements for a next-generation X(HT)ML modularization
framework are. XHTML M12N has always allowed extensions. The option to
define subsetting restrictions on imported modules is so far absent.
From my point of view as a language designer, this appears (for lack
of a better phrase) mildly asymmetric, and in my humble opinion
subsetting is a feature worthy of consideration for inclusion in XHTML
M12N 2.0.

As Shane points out, we are not talking about allowing arbitrary
subsetting here: a mechanism would need to be put in place that allows
the module provider to express where subsetting is allowed (or the
inverse; that's a spec design choice). The table example is a good
one: in this scenario, the provider of the table module would be able
to express that while a user of the module is not allowed to remove
table, tr and td, it is possible to have an impact on the content
model of td.
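
To illustrate the table scenario, here is a rough sketch of what such a
provider-side declaration could look like. This is not syntax from any
M12N draft: ModulePolicy, check_profile and the field names are all
hypothetical, invented only to show the "provider says where subsetting
is allowed" idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModulePolicy:
    """Hypothetical provider-side declaration for one module."""
    elements: frozenset                    # elements the module defines
    removable: frozenset = frozenset()     # elements a profile may drop
    restrictable: frozenset = frozenset()  # content models a profile may narrow


def check_profile(policy, removed, narrowed):
    """Return the violations in a proposed subsetting profile."""
    problems = []
    for name in sorted(removed):
        if name not in policy.removable:
            problems.append(f"may not remove <{name}>")
    for name in sorted(narrowed):
        if name not in policy.restrictable:
            problems.append(f"may not narrow content model of <{name}>")
    return problems


# Assumed policy for the table example: table, tr and td are locked in
# place, but a profile may tighten what td is allowed to contain.
TABLE_POLICY = ModulePolicy(
    elements=frozenset({"table", "tr", "td"}),
    restrictable=frozenset({"td"}),
)

# A profile that narrows td is accepted; one that drops tr is rejected.
print(check_profile(TABLE_POLICY, removed=set(), narrowed={"td"}))
print(check_profile(TABLE_POLICY, removed={"tr"}, narrowed={"td"}))
```

The inverse design (everything may be subset unless the provider locks
it) would just flip the default of the two sets.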

To make the use case for this feature more concrete, let me give a few
real-world examples of subsetting desires that have surfaced within
the ANSI/NISO Z39.86 standard context whilst evaluating the viability
of adopting XHTML M12N 2.0. Some of these are examples from profiles
that pertain to quite specialized document type domains, but some of
them I would say are quite generic too, and can as such be said to
demonstrate the subsetting use case for language designers that strive
to produce highly structured/predictable grammars based on XHTML M12N
2.0.

(Further, these examples incidentally relate to the XHTML2 modules,
but of course a generic subsetting mechanism would be equally relevant
to any module created under the aegis of M12N 2.0, regardless of
whether it is using the XHTML namespace.)

- Allowing only one h element within section
- Allowing only one h element within section, and requiring it to be
the first child of section
- Allowing ul, ol, and dl but not nl [1]
- Allowing h but not h1-h6 (or vice versa)
- Disallowing recursive inlines (abbr inside abbr for example)
- Disallowing Structure class and Text class members to be mixed in a
sibling list (see: current Flow model)
- ... and many examples from attribute collections (such as: allowing
only @xml:id, not @id document-wide, or vice versa)
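
To make the first two items concrete, here is a minimal sketch of the
rule as a standalone check. It assumes "only one h" means "at most
one", ignores namespaces for brevity, and uses a made-up function name
(heading_violations). In a real profile the restriction would live in
the schema itself, not in a script.

```python
import xml.etree.ElementTree as ET


def heading_violations(root):
    """Check the assumed rule: each section has at most one h child,
    and if present it is the first child element."""
    problems = []
    for section in root.iter("section"):
        children = list(section)  # direct child elements only
        positions = [i for i, c in enumerate(children) if c.tag == "h"]
        if len(positions) > 1:
            problems.append(
                f"section has {len(positions)} h children, at most 1 allowed"
            )
        elif positions and positions[0] != 0:
            problems.append("h is not the first child of section")
    return problems


ok = ET.fromstring("<section><h>Title</h><p>Text</p></section>")
late = ET.fromstring("<section><p>Text</p><h>Title</h></section>")
twice = ET.fromstring("<section><h>One</h><p>Text</p><h>Two</h></section>")

for doc in (ok, late, twice):
    print(heading_violations(doc))
```

All three sample documents use nothing but section, h and p, so the
unrestricted grammar would accept each of them; only the subsetting
profile tells them apart.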

As far as I can see, these examples favorably match Shane's rule of
thumb "will a document that conforms to this language work correctly
in an xhtml family user agent". Another way to put it is: an XHTML
Family compliant UA would not be able to tell from the document
infoset alone that it was authored against a subsetting schema.

(For clarity, it should also be noted that from a language designer's
point of view, the expression of additional restrictions isn't always
about subsetting, but sometimes about plain dis-optionalization
(ouch). Examples: requiring xml:lang on root, requiring di in dl.
Perhaps M12N-current already allows this, in which case (and unless I
have missed the obvious spec fragment where this is spelled out) I
would suggest a light editorial pass to clarify for the average reader
that this is so.)

>What are the risks?
Indeed, finding the sweet spot in terms of dynamicity is not the
simplest thing to do. But, in light of the upside items Shane
mentions, I would argue that it's worth a try. Obviously, careful
evaluation and risk assessment would need to be performed on candidate
solutions.

hth, /markus

[1] Note: this is not at all related to the discussion whether nl is
appropriate in the XHTML2 document type, but related only to XHTML
M12N 2.0 as an example of module subsetting to fit a given specialized
document type context.

Received on Monday, 9 February 2009 11:36:24 UTC