Re: Control Text-file Embedding in HTML-docs from David Woolley on 2007-03-31 (www-html@w3.org from March 2007)

From: David Woolley <david@djwhome.demon.co.uk>
Date: Sat, 31 Mar 2007 12:23:17 +0100 (BST)
To: www-html@w3.org
Message-Id: <200703311123.l2VBNHL00633@djwhome.demon.co.uk>
Tina wrote:
> On 31 Mar, sunil vanmullem wrote:
> 
> > 	<div src="foo.html"/> would pull in the HTML "fragment" from
> > foo.html
> 
>   Yes. And to make absolutely certain you also reach users with UAs or
>   settings or physical realities which do /not/ support the above
>   pulling in, you'll need to include the HTML fragment in the main
>   document.

To clarify this point: this syntax is already defined in the XHTML 2 drafts,
and the intended use is for replaceable content, i.e. <div src="foo.html" /> is
wrong because it has no content and therefore implies that what it pulls in
is not essential to the document.  Replaceable content should have a richer
(normally presentationally richer) replacement, but should carry the
document meaning even without replacement.
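
As a rough illustration of what "replaceable" means here, the sketch below
models a user agent's choice in Python.  The names render_replaceable,
fetch and unsupported are invented for the example and are not taken from
the XHTML 2 drafts.

```python
from typing import Callable, Optional


def render_replaceable(fallback: str, src: Optional[str],
                       fetch: Callable[[str], str]) -> str:
    """Return what a user agent would show for an element with a src attribute.

    The element's own content (fallback) is used whenever the replacement
    cannot be fetched, so it has to carry the document's meaning by itself.
    """
    if src is None:
        return fallback
    try:
        return fetch(src)
    except OSError:
        # No support, no network, or a missing resource: keep the content.
        return fallback


def unsupported(url: str) -> str:
    """Stand-in for a user agent that cannot pull in the resource."""
    raise OSError(f"cannot fetch {url}")


# <div src="foo.html">Opening hours: 9 to 5</div> still says something.
print(render_replaceable("Opening hours: 9 to 5", "foo.html", unsupported))
# <div src="foo.html" /> has nothing to fall back on.
print(repr(render_replaceable("", "foo.html", unsupported)))
```

With an empty element, a user agent that cannot do the replacement is left
with nothing to render.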

> 
>   We are back to square one: build the /entire/ document on the server,
>   and send it to the client; don't rely on mechanisms on the /client/
>   for pulling in additional resources to make up a document.

The original concept of HTML was, I believe, that you used links to keep the
size of each page down, but commercial use has resulted in severe bloat.  As
a result, there is a benefit in assembling the page in the browser, where
shared components could be cached.  I doubt that many authors would bother,
though, or they would make all the components uncacheable, as that seems to
be what people already do with things like setting cookies with every image
(one of the most often asked questions amongst designers seems to be how to
frustrate caching).

SGML, of which HTML is an application, has the concept of external entities,
which would allow a document to be composed in the viewer, but HTML was
supposed to be simple, so HTML browsers didn't implement this.  I believe the
same concept exists in XML.  It also has conditional sections, although I've
only seen these used in DTDs, so I'm not sure if they can be used in the
main content.
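
For what the XML form of an external entity looks like, here is a minimal
sketch using Python's xml.parsers.expat.  The document, the file name
footer.xml, the FRAGMENTS table (standing in for a real fetch) and the
function expand_text are all made up for the example, and it only handles
one level of inclusion.

```python
import xml.parsers.expat

# The internal DTD subset declares an external entity; &footer; uses it.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE page [
  <!ENTITY footer SYSTEM "footer.xml">
]>
<page>Body text. &footer;</page>"""

# Hypothetical stand-in for fetching a resource by its system identifier.
FRAGMENTS = {"footer.xml": "<p>Shared footer.</p>"}


def expand_text(document, fragments):
    """Parse document, resolve external entities from fragments, return its text."""
    pieces = []
    parser = xml.parsers.expat.ParserCreate()

    def on_text(data):
        pieces.append(data)

    def on_external_entity(context, base, system_id, public_id):
        # The child parser inherits the handlers already set on its parent.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(fragments[system_id], True)
        return 1  # tell expat the entity was handled

    parser.CharacterDataHandler = on_text
    parser.ExternalEntityRefHandler = on_external_entity
    parser.Parse(document, True)
    return "".join(pieces)


print(expand_text(DOCUMENT, FRAGMENTS))
```

The inclusion happens while the parser is reading the document, before any
element-level processing sees the result.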

HTML is no longer simple, and one of the first "enhancements" was the img
element, which is logically a link, but physically causes the image to be
included into the rendered page.

The big problems with server side includes are:

- as a server feature, the ability to use them is often an added-cost option,
  so not available on cheap hosting services such as those bundled by ISPs;
- most implementations don't set a sensible Last-Modified header, so the page
  is never cached;
- even where an option to set Last-Modified exists, most authors will ignore
  it as an unnecessary technicality.
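
One plausible way to get a sensible value is to take the newest modification
time among the page and everything it includes.  A minimal sketch in Python,
assuming the server knows the list of included files; the function
last_modified, the file names and the timestamps are invented for the
example, and this is not a description of any particular server's option.

```python
import os
import tempfile
from email.utils import formatdate


def last_modified(paths):
    """Return an HTTP Last-Modified value for a page assembled from paths."""
    newest = max(os.path.getmtime(path) for path in paths)
    # usegmt=True gives the "GMT" date form that HTTP headers use.
    return formatdate(newest, usegmt=True)


with tempfile.TemporaryDirectory() as tmp:
    page = os.path.join(tmp, "page.shtml")
    footer = os.path.join(tmp, "footer.html")
    for path in (page, footer):
        with open(path, "w") as handle:
            handle.write("placeholder")
    # Arbitrary timestamps: the included footer is newer than the page.
    os.utime(page, (1_000_000_000, 1_000_000_000))
    os.utime(footer, (1_175_340_197, 1_175_340_197))
    header = last_modified([page, footer])
    print("Last-Modified:", header)
```

The header follows the footer's timestamp, so editing an included file
changes the date reported for the page that includes it.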

The problems with client side includes seem to be:

- originally HTML would not have created large enough documents to need them,
  and simple implementation was desirable, so SGML entities were dropped;
- there seems no commercial perception of a need for them by browser
  developers;
- people keep re-inventing the concept in different forms, usually trying
  to do it with syntactic-level items, when it is really a lexical-level
  process and a mechanism is already defined - this just causes confusion;
- caching can leave the parts at mutually incompatible versions.

>   This is a server task, not a client one, arguments so far
>   non-withstanding. We already have a problem with representing
>   graphical content with text; let's not make it any more difficult than
>   it is.

object, for the purists, is a generalisation of img, and <div src= is a
generalisation of object.  All of them are, conceptually, links, and
in non-visual browsers, e.g. Lynx, are physically rendered as links.
Received on Saturday, 31 March 2007 11:23:35 UTC