HTML 4.01 as an SGML application

The following message is a courtesy copy of an article
that has been posted to comp.text.sgml as well.

(Copy to the W3C HTML specification discussion)

roconnor@math.berkeley.edu (Russell O'Connor) writes in
comp.text.sgml, citing Peter Flynn, concerning the nature of the
relationship between HTML 4 and SGML:

> >Elsewhere, "application" can refer to executable software (as in "nsgmls
> >is a conforming application of SGML").
> 
> Isn't nsgmls an SGML system as in ``An SGML System Conforming to
> International Standard ISO 8879 -- Standard Generalized
> Markup Language'' (from <http://www.jclark.com/sp/index.htm>).

I think Peter's use of "elsewhere" means "outside of the world of
SGML terminology".

But, yes, SP/nsgmls is an SGML system, not an SGML application.

> I intended to use the word function in the mathematical sense.
> Actually perhaps using the word ``set'' would be better.  Here I was
> identifying the set with its characteristic function.  Then I would
> want to say that this set has certain closure properties.

Does this refer to the question of whether the formal definition of an
SGML application in the language of SGML may be supplemented with
other means of semantic specification or with restrictions stated
outside of formal SGML language that pertain to application handling?

The answer to that is YES; see the formal definition of "SGML
application" in ISO 8879 at 4.279.

> This is in reference to the beginning of my post.  There are several
> choices (alternatives) one has when making a Document Type
> Declaration.  One can use a PUBLIC FPI, one can use a SYSTEM
> identifier, one can paste your DTD right into the declaration, etc.

The HTML 4 specification does not permit this range of choice.  It
would be very costly in bandwidth for an author to ship a fully
assembled "HTML 4.01" document -- in the sense of its associated SGML
application (including "HTML4.decl", "strict.dtd", "HTMLlat1.ent",
"HTMLsymbol.ent", and "HTMLspecial.ent" in addition to the content of
its "HTML" element) -- over either HTTP or SMTP, quite apart from the
question of whether popular user agents would grok it.
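
To make the quoted alternatives concrete, here is a minimal Python
sketch (mine, not taken from either specification) that reports which
form a document type declaration takes: a PUBLIC identifier, a SYSTEM
identifier, and/or an internal declaration subset.  The name
classify_doctype and the regular expression are illustrative
assumptions; a regex is only a rough stand-in for a real SGML parser.

```python
import re

# Illustrative pattern, not a real SGML parser: it ignores comments and
# single-quoted literals, and assumes "]" closes the internal subset.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+[A-Za-z][\w.-]*'          # document type name
    r'(?:\s+(?P<kind>PUBLIC|SYSTEM))?'      # external identifier keyword
    r'(?:\s+"[^"]*"){0,2}'                  # up to two quoted identifiers
    r'\s*(?P<subset>\[.*?\])?\s*>',         # optional internal subset
    re.IGNORECASE | re.DOTALL,
)


def classify_doctype(text):
    """Return (external_kind, has_internal_subset), or None if no DOCTYPE."""
    m = DOCTYPE_RE.search(text)
    if m is None:
        return None
    kind = m.group("kind")
    return (kind.upper() if kind else None, m.group("subset") is not None)
```

Only the first form, with one of the public identifiers named by the
HTML 4.01 specification and no internal subset, is among the choices
that specification offers.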

> Should I not expect HTML tools to accept all documents belonging to
> the HTML 4.0 application?

Not if the SGML application definition is understood to involve only
that which is expressed in SGML language, ignoring ISO 8879 at 4.279.

>                        And if they don't then the software is broken
> and doesn't conform to the HTML 4.0 specification.

No, it is not broken.  I think, however, there is confusion in the
community because an HTML document can pass successfully through a
general-purpose SGML validator configured for HTML 4.01 and yet fail
to meet formal technical requirements of the HTML 4.01 specification.
(As Terje Bless said earlier in www-html, the SGML declaration portion
of the SGML language specification could be tightened.)
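
One such requirement that sits outside the DTD itself is the choice of
document type declaration: the HTML 4.01 specification lists three
declarations and asks authors to use one of them.  The following
Python sketch shows the kind of extra check a tool would need on top of
SGML validation.  The function name names_html401_dtd is hypothetical,
and treating an internal subset as falling outside the three listed
forms is my reading, not a quotation from the specification.

```python
import re

# Public identifiers of the three DTDs defined by HTML 4.01.
HTML401_PUBLIC_IDS = {
    "-//W3C//DTD HTML 4.01//EN",
    "-//W3C//DTD HTML 4.01 Transitional//EN",
    "-//W3C//DTD HTML 4.01 Frameset//EN",
}

# Matches only a PUBLIC declaration with an optional system identifier
# and nothing else before ">", so an internal subset will not match.
_PUBLIC_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]*)"(?:\s+"[^"]*")?\s*>',
    re.IGNORECASE,
)


def names_html401_dtd(document_text):
    """True if the DOCTYPE names one of the three HTML 4.01 DTDs."""
    m = _PUBLIC_DOCTYPE.search(document_text)
    return m is not None and m.group(1) in HTML401_PUBLIC_IDS
```

A document that declares its DTD by SYSTEM identifier, or that adds an
internal subset, could still parse cleanly against the same DTD and
yet be rejected by this check.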

> >I think the W3C was on acid when it specified that it's only a HTML
> >application if you use one of their document type declarations, but
> >strictly speaking they are at liberty to say so if they wish. It
> >doesn't stop the above being valid HTML (although non-functional in
> >a browser), but it stops you calling the above "W3C HTML application
> >documents" (for want of a better phrase).
> 
> *l* now I'm totally confused.  I guess I'd like to know what you
> believe is and isn't valid HTML 4.0, and what you think is and isn't
> valid W3C HTML 4.0 application documents, and how on earth it is
> possible they don't mean the same thing.

It might be clearer if the HTML 4.01 specification, rather than simply
saying that HTML 4.01 is an SGML application (using the recipe of ISO
8879 at 15.5.1 and relying heavily on the definition of SGML
application at 4.279), said something like the following in section 4,
"Conformance":

---

  HTML 4.01 defines three SGML applications conforming to
  International Standard ISO 8879:1986 (WWW) -- Standard Generalized
  Markup Language.  Under this specification HTML 4.01 documents must
  comply with certain requirements, not stated in formal SGML language
  as provided in the said ISO standard, that serve to ensure the
  suitability for sharing of an HTML document (as only a part of the
  corresponding formal SGML document instance).

---

                                    -- Bill

P.S.  Shouldn't section 4.3 of the HTML 4.01 specification cover
SMTP as well as HTTP transmission?

Received on Tuesday, 19 November 2002 13:57:40 UTC