Re: In-line HTML files?

Paul Prescod (papresco@calum.csclub.uwaterloo.ca)
Thu, 20 Jul 1995 08:52:11 -0400


Date: Thu, 20 Jul 1995 08:52:11 -0400
Message-Id: <199507201252.IAA21729@calum.csclub.uwaterloo.ca>
To: www-html@www10.w3.org
From: papresco@calum.csclub.uwaterloo.ca (Paul Prescod)
Subject: Re: In-line HTML files?

At 12:50 AM 7/20/95 -0400, Ka-Ping Yee wrote:
>To reference files on other servers would not be impossible -- it's just that
>if we're going to do it, we might as well properly hash out a way to
>reference it in HTML instead of doing hacks to be parsed by http daemons.

The important thing to recognize is that SGML has already worked out all of
the semantics for "including" text from a document fragment into another
document.  SGML external entities are well defined and well documented.  At
first glance it seems that the simple answer would be:

<!ENTITY foo SYSTEM "http://www.foo.com/fragment.html" >

...

&foo;
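
To make the intended behaviour concrete, here is a minimal sketch in Python
of what expanding such a reference amounts to.  It is not an SGML parser: it
assumes declarations shaped exactly like the one above, does no recursive
expansion, and ignores where in the document a declaration is legal.  The
function name and the fetch callable are hypothetical, introduced only for
illustration; fetch stands in for whatever retrieves the system identifier.

```python
import re

# Matches declarations such as:
#   <!ENTITY foo SYSTEM "http://www.foo.com/fragment.html" >
ENTITY_DECL = re.compile(
    r'<!ENTITY\s+([A-Za-z][\w.-]*)\s+SYSTEM\s+"([^"]*)"\s*>'
)
# Matches references such as: &foo;
ENTITY_REF = re.compile(r'&([A-Za-z][\w.-]*);')


def expand_external_entities(text, fetch):
    """Replace each declared &name; with the text fetch(system_id) returns.

    fetch is a hypothetical callable taking a system identifier (a URL here)
    and returning the fragment as a string.
    """
    # Collect name -> system identifier from the declarations.
    declared = dict(ENTITY_DECL.findall(text))
    # Drop the declarations themselves from the output.
    body = ENTITY_DECL.sub("", text)

    def substitute(match):
        name = match.group(1)
        if name not in declared:
            # Leave undeclared references such as &amp; untouched.
            return match.group(0)
        return fetch(declared[name])

    return ENTITY_REF.sub(substitute, body)
```

Passing a dictionary lookup as fetch is enough to try it without a network;
a real browser would fetch the URL and hand the result back to its parser.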

Of course this would bring the browser vendors one step closer to using real
SGML-based parsers.  This goes against what seems to me to be one of the
unspoken mandates of the HTML WG: to allow browser vendors to get away with
writing ad-hoc parsers, and to force SGML tools to conform to the resultant
chaos.

>The problem with including arbitrary bits of HTML is that you have no
>guarantee other documents will be conforming documents.  In fact, it would
>not be possible to directly include another conforming document (with
>another <head> and <body>) within a conforming document.  Even two
>fragments of correct HTML can cause havoc when one is included within
>the other (to wit, imagine nested <A>s or <FIG>s).

How is this a problem?  How is it any different if I use a server-side
include, run the files through CPP, or write a Perl script that combines the
documents?  The only difference is that someone else may control the linked
data.

Whenever you link to something elsewhere you run the risk of the linked data
changing.  But if someone changes the markup at their site and it renders my
document invalid, I will be at worst embarrassed.  A much worse situation can
occur with a simple IMG.  If you link to a picture, the maintainer can
change it to something illegal or obscene.  If you choose to include someone
else's text or image in your home page, incorrect markup is the _least_ of
your worries. =)

HTML 3.0 handles this with the MD attribute, and perhaps the entity reference
syntax should be extended to allow a checksum as well.
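
A rough sketch of the check a browser could make, again in Python.  The
"md5:" prefix followed by a base64-encoded digest is my reading of the MD
attribute in the HTML 3.0 draft and should be treated as an assumption, as
should the function names, which are hypothetical.

```python
import base64
import hashlib


def md_value(data):
    """Build an 'md5:<base64 digest>' string for the given bytes.

    The exact string format is an assumption based on the HTML 3.0 draft.
    """
    digest = hashlib.md5(data).digest()
    return "md5:" + base64.b64encode(digest).decode("ascii")


def fragment_unchanged(data, expected_md):
    """Return True if the fetched bytes still match the author's checksum."""
    return md_value(data) == expected_md
```

The author would record md_value(fragment) when writing the document, and
the browser would refuse to include the fragment, or at least warn, when
fragment_unchanged returns False.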

>This is why i believe included documents should be treated in a similar
>fashion as figures: they are set apart in their own box (that is,
>logically speaking -- the box doesn't have to be physically visible)
>and continue to be treated as separate documents.

You are suggesting a mechanism for implementing full subdocuments.  I am
suggesting a mechanism for implementing document fragments.  Both are
necessary.  

 Paul Prescod

----------------------------------------------------------
HTML Myths Page: http://www.incontext.ca/~papresco/htmlmyth