[Prev][Next][Index][Thread]

Re: SERIOUS concerns about implementation



At 12:51 PM 2/20/97, lee@sq.com wrote:
        *     *     *     *     *
>No-one has ever said that existing legacy SGML (or HTML) files will be
>valid XML files.  They won't.
        *     *     *     *     *
>No, none of these files will break.  They are all SGML files, and
>are not for use with XML.  I expect that a future Panorama may well
>handle both XML and SGML...
        *     *     *     *     *
>I agree.  All the changes were made for good reason.
>XML is a subset of SGML, but that doesn't mean that a subset of existing
>SGML files are XML files.

There probably are existing SGML files that are XML files.  Any XML file
that includes its DOCTYPE declaration is both.

At 2:38 PM 2/21/97, Len Bullard wrote:
        *     *     *     *     *
>Just quickly glancing, XML isn't an "application of SGML".
>I think it is a proper subset.  HTML is an application of
>SGML, and could be an application of XML.   That is, perhaps,
>one way to get the point across.

We're losing sight of the subset/application spectrum.

HTML (at least at any point in time) is intended to be a single, nonextendable
markup language.  Nonextendable in the sense that making any additions forms
a new markup language, which is only (hopefully closely) related to the old.

SGML at any point in time is intended to be extensible in certain ways.
There are specific areas where choices may or must be made.  These choices
partition naturally into three groups:  "Semantic" choices, which are
generally of interest only to application designers and users; "structure"
choices that are specified via a document type declaration, and "lexical"
choices that are specified via an SGML declaration.

At least by tradition you must make the lexical choices before you make the
structure choices and semantic choices.  Making all three is in SGML terms
"defining an application".  Additional restrictions that cannot be specified
via the document type declaration or the SGML declaration are "application
conventions".  HTML (at least in some incarnations) is an application of
SGML by this definition.

XML is in the in-between grey area.  We plan to lock in the SGML declaration;
we plan to lock in some semantic roles (e.g., for linking); we plan to add
additional restrictions that cannot be specified via the document type
declaration or the SGML declaration.  But we allow users to complete the
process of making an SGML application.  It seems to me that "subset" is
an appropriate term for such a set of some-but-not-all of the decisions
necessary to define an "application".

According to this definition of "subset", an application is a subset in
which all of the available decisions have been made.  (Or almost all--one
could presumably still add more more-restrictive "application conventions".)

I pose three questions:  Is this "subset" a useful concept?  Is "subset"
the right term?  And is this what you all have meant by "subset" when
you've used the term?

Dave Peterson
SGMLWorks!

davep@acm.org



Follow-Ups: