Re: A8: INCLUDE/IGNORE marked sections?

On Fri, 4 Oct 1996 15:20:11 -0400 Paul Prescod said:
>At 06:23 PM 10/3/96 CDT, Michael Sperberg-McQueen wrote:
>>A.8 Should XML have INCLUDE and IGNORE marked sections or not?
>>(If this question is answered YES, it leads to a separate question, how
>>to achieve conditional inclusion in XML markup declarations.  This
>>related question is to be decided separately.)
>
>If marked sections are required to be element-synchronous, couldn't
>you make <ENGLISH.IGNORE> and <FRENCH.IGNORE> elements or attributes
>and use the stylesheet to make them disappear at delivery time? I
>think that this sort of thing should fulfill the needs of most people
>without introducing this new feature for parsers to handle.

I think most of those who deal with variant text of this sort agree
that conditional elements or text-variant elements are the best way
to handle textual variation.  Unlike marked sections, such elements
place the display under style-sheet control, so that users can
display the variants of their choice.  In theory, one might do this
with marked sections, as well, but it's a rare display engine that
accepts display-time changes to parameter-entity values.

But marked sections are used in prologs, too, not just instances.

Any multi-flavored DTD (such as HTML 2 -- not just gargantuan DTDs like
the TEI main DTD) needs user control over parse-time customization, to
suppress some elements or select among multiple declarations of elements
or attributes or entities or ....  Currently the TEI and HTML both
handle this using clever sequences of parameter entity declarations,
marked sections, and the occasional eye of newt.

I think it's a hard requirement for XML to allow this kind of parse-time
customization.  If it does, XML can fulfill the goal of being a language
one can *work* in; as Dave Hollander put it early on, a language we can
author in, and edit in, and process in, and publish in without needing a
one-way transform at the publication step, into a seriously dumber
language.  Without parse-time customization of DTDs, XML will be much
less useful, just a write-only publication language, filling the same
niche that HTML currently fills for some of us -- more flexible than
HTML, but not usable as the day-to-day maintenance format for one's
documents.

As far as I can tell, there are these main possibilities for conditional
inclusions in markup declarations:

  - multiple declarations, with rules about which one wins, just like
SGML entity declarations.  I'm not sure this is quite enough, but it
might be.
  - marked sections for INCLUDE/IGNORE, controlled (as now) by
entities the user declares in the subset or in a system-dependent
way (e.g. by command-line option)
  - if the markup declarations use instance syntax (which I think is
an idea whose time is NOW), then we can use the same technique as
Paul Prescod describes:  elements which enclose the conditional
material and the contents of which are processed or not, depending
on the conditions.  For example, something like

    <!ELEMENT cond - - (element | attlist | entity | notation)* >
    <!ATTLIST cond
              entity   ENTITY    #REQUIRED
              val      CDATA     ""
              relop    (eq | gt | lt | ne | def) 'eq' >

(I omit the instance-syntax equivalent, but can supply it if anyone
is really curious)

So a TEI-ish DTD driver might read

  <cond entity='TEI.base' val='prose'>
    <entity name='base.dtd'> ...</entity>
  </cond>
  <cond entity='TEI.base' val='verse'>
    <entity name='base.dtd'> ...</entity>
  </cond>
  <cond entity='TEI.base' val='drama'>
    <entity name='base.dtd'> ...</entity>
  </cond>
  &base.dtd

and an instance might read

  <?XML dtd='tei2.dtd'>
  <?XML localdecls
    <entity name='TEI.base'><local><string val="prose"></local></entity>
  ?>

[Disclaimer -- this is a sketch of one way of handling the local
subset in instance syntax, not of THE way.  If you can think of a better
way, Tim and I want to talk to you.]
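
[Illustration only, not part of the proposal: a minimal Python sketch of
how a processor might evaluate the cond elements above.  The <dtd>
wrapper element, the placeholder entity bodies (the originals are
elided), and the function names are hypothetical.  It assumes that
relop='def' means "the named entity is declared", that gt and lt compare
strings, and that an undeclared entity fails every other test; nested
cond elements are not handled.]

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element <dtd>; the entity bodies are placeholders
# standing in for the content elided in the sketch above.
DRIVER = """
<dtd>
  <cond entity='TEI.base' val='prose'>
    <entity name='base.dtd'>PROSE-PLACEHOLDER</entity>
  </cond>
  <cond entity='TEI.base' val='verse'>
    <entity name='base.dtd'>VERSE-PLACEHOLDER</entity>
  </cond>
  <cond entity='TEI.base' val='drama'>
    <entity name='base.dtd'>DRAMA-PLACEHOLDER</entity>
  </cond>
</dtd>
"""


def cond_holds(cond, entities):
    """Decide whether the content of one <cond> element is processed."""
    name = cond.get("entity")
    val = cond.get("val", "")        # default taken from the ATTLIST
    relop = cond.get("relop", "eq")  # default taken from the ATTLIST
    if relop == "def":
        return name in entities      # assumption: 'def' tests declaration
    if name not in entities:
        return False                 # assumption: undeclared never matches
    actual = entities[name]
    if relop == "eq":
        return actual == val
    if relop == "ne":
        return actual != val
    if relop == "gt":
        return actual > val          # assumption: string comparison
    if relop == "lt":
        return actual < val          # assumption: string comparison
    raise ValueError(f"unknown relop: {relop!r}")


def active_declarations(driver_xml, entities):
    """Return the declarations that survive conditional processing."""
    kept = []
    for child in ET.fromstring(driver_xml):
        if child.tag != "cond":
            kept.append(child)       # unconditional declaration
        elif cond_holds(child, entities):
            kept.extend(child)       # keep the children of a true cond
    return kept
```

With TEI.base declared as 'verse', only the verse declaration survives;
with no declaration of TEI.base at all, none of the three does.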

I could live with any of these approaches (marked sections, conditional
elements, or multiple declarations--assuming they suffice).  But *some*
method of specifying conditional inclusion has to be there, or XML will
be only a publication medium, not a working medium.  Slightly better
than HTML, but not something we can *work* in; we'll have to work in some
other notation and translate into XML.

  - One further possibility: if we're worried about the load on browsers
and other software that wants to be lightweight, we can build a subset
of the C preprocessor into the language and say that servers should
support #ifdef and #ifndef and maybe #if, but that clients need not.
Then the TEI DTD reads

  #if &TEI.base; = 'prose'
    <entity name='base.dtd'> ...</entity>
  #elsif &TEI.base; = 'verse'
    <entity name='base.dtd'> ...</entity>
  #elsif &TEI.base; = 'drama'
    <entity name='base.dtd'> ...</entity>
  #endif
  &base.dtd

and an instance might read as above, or

  <?XML dtd='tei2.dtd'>
  <?XML localdecls
  #define TEI.base 'prose'
  ?>

(A server/client distinction in the language has been suggested before,
but not extensively discussed.  One of the design goals suggests we
want to avoid such divisions, but accepting one may be better than
having a language too heavy for lightweight processors but not strong
enough for real work.  I don't know the best answer.)
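
[Illustration only, not part of the proposal: a minimal Python sketch of
the server-side step for the preprocessor variant.  The function and
variable names are hypothetical.  It assumes conditions of the form
&name; = 'value', treats the trailing ';' as optional, handles only
#define, #if, #elsif (also accepted in C's spelling, #elif) and #endif,
lets the first #define of a name win as with SGML entity declarations,
and does not handle nested #if groups.]

```python
import re

# Assumed condition syntax: &name; = 'value' (the ';' is optional here).
_COND = re.compile(r"&([\w.\-]+);?\s*=\s*'([^']*)'")
_DEFINE = re.compile(r"#define\s+([\w.\-]+)\s+'([^']*)'")


def _test(directive, entities):
    """Evaluate the condition on an #if or #elsif line."""
    m = _COND.search(directive)
    if m is None:
        raise ValueError(f"cannot parse condition: {directive!r}")
    name, value = m.groups()
    return entities.get(name) == value   # undefined names never match


def preprocess(lines, entities=None):
    """Return the lines that survive #if / #elsif / #endif processing."""
    entities = dict(entities or {})
    out = []
    in_if = False     # currently inside an #if ... #endif group?
    emitting = True   # are ordinary lines being copied to the output?
    taken = False     # has a branch of the current group already matched?
    for line in lines:
        text = line.strip()
        word = text.split(None, 1)[0] if text.startswith("#") else ""
        if word == "#define":
            m = _DEFINE.fullmatch(text)
            if m is None:
                raise ValueError(f"bad #define: {text!r}")
            if emitting:
                # Assumption: the first definition wins.
                entities.setdefault(m.group(1), m.group(2))
        elif word == "#if":
            if in_if:
                raise ValueError("nested #if is not handled in this sketch")
            in_if = True
            taken = emitting = _test(text, entities)
        elif word in ("#elsif", "#elif"):
            if not in_if:
                raise ValueError(f"{word} without #if")
            emitting = (not taken) and _test(text, entities)
            taken = taken or emitting
        elif word == "#endif":
            if not in_if:
                raise ValueError("#endif without #if")
            in_if, emitting, taken = False, True, False
        elif emitting:
            out.append(line)             # ordinary line: copy through
    return out
```

A client that does not support the preprocessor would then receive only
the lines returned by preprocess, with every directive already resolved.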

-C. M. Sperberg-McQueen

Received on Friday, 4 October 1996 18:21:53 UTC