<insert> and external entity references

C. M. Sperberg-McQueen (cmsmcq@uic.edu)
Tue, 19 Mar 96 11:23:06 CST

Message-Id: <9603191923.AA17113@www10.w3.org>
Date:         Tue, 19 Mar 96 11:23:06 CST
From: "C. M. Sperberg-McQueen" <cmsmcq@uic.edu>
Subject:      <insert> and external entity references
To: www-html@w3.org

Excuse me if someone explained this while I was not paying attention,
but why are we talking about adding an INSERT tag with the semantics 'go
find this file or document, and insert it here', when SGML already has
the mechanisms needed for this, in the form of entity references?  Why
not just start writing, requesting, or demanding HTTP servers that
actually understand and process references to external entities
as defined by ISO 8879?

To illustrate, for those not yet conversant with all of SGML:  for
this ...

> Now, my serious question is, at this time could one simply use
> <insert
>         data="http://www.mysite.com/path/file.html"
> >
> </insert>

the normal SGML syntax would be this:

1 in the document type definition,

  <!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN" [
  <!ENTITY myfile SYSTEM "http://www.mysite.com/path/file.html" >
  ]>

2 in the text itself,

  &myfile;

As has been pointed out, this depends on having an SGML parser
which supports URLs as system identifiers.  For local documents,
it is equally easy to use a normal file id:

  <!ENTITY myfile SYSTEM "/usr/me/public_html/file.html">
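
Put together, a complete document using the local file might look
something like the sketch below.  The title, heading, and paragraph
are invented for illustration; only the entity declaration comes from
the example above, and &myfile; is the ordinary syntax for a general
entity reference.  Because the replacement text is parsed as if it
stood at the point of reference, file.html would presumably need to
hold only content that is valid there (paragraphs, lists, and so on),
without a DOCTYPE declaration or HTML, HEAD, and BODY tags of its own.

  <!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN" [
  <!-- path taken from the example above; any system identifier
       the parser understands would do -->
  <!ENTITY myfile SYSTEM "/usr/me/public_html/file.html">
  ]>
  <html>
  <head><title>Inclusion by entity reference</title></head>
  <body>
  <h1>Inclusion by entity reference</h1>
  <p>The included material follows.</p>
  &myfile;
  </body>
  </html>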

If it is desired (as some have proposed) that the external entity
be parsed as a completely independent object, the required variation
in the syntax is again already provided by SGML:  just declare
the external entity as a SUBDOC (i.e. a free-standing document, to
be parsed on its own, not as part of the current document).

  <!ENTITY myfile SYSTEM "/usr/me/public_html/file.html" SUBDOC>

N.B. not all SGML software supports the SUBDOC feature, just as
not all SGML software understands URLs, which are not after all
defined by ISO.  That shouldn't make too much difference, I think:
we are talking about a change to HTTP and/or HTML, and that means
rewriting at least some software.

Is there an advantage to inventing a new notation for inclusion
of documents and document fragments, rather than using the
existing notation?  Or is it just not widely known that notation
for such inclusion already exists and need only be adopted, instead
of being invented?

-C. M. Sperberg-McQueen
 Computer Center, Database Group
 University of Illinois at Chicago

All opinions expressed in this note (except those I have quoted with a
view to refuting them) are mine.  They are not necessarily those of the
University of Illinois, its administration or Board of Regents, nor of
the Text Encoding Initiative, its executive committee or other
participants, its sponsors, or its funders.  Anyone who says otherwise
is wrong.