Re: Missing Functionality: Include from Aryeh Gregor on 2009-05-11 (www-html@w3.org from May 2009)

From: Aryeh Gregor <Simetrical+w3c@gmail.com>
Date: Sun, 10 May 2009 21:36:51 -0400
To: Elliot Jenner <void2258@gmail.com>
Cc: www-html@w3.org
Message-ID: <7c2a12e20905101836q17bba31fyc5347a883af5371a@mail.gmail.com>
On Sun, May 10, 2009 at 7:37 PM, Elliot Jenner <void2258@gmail.com> wrote:
> Virtually the sole purpose of the link element is to include CSS. Since it
> is almost never used for anything else, I consider it part of CSS for all
> intents and purposes.

Have you read the HTML spec?  <link> has a great many uses.  For
instance, in the HTML for the Gmail page I'm on right now, I find:

<link rel="shortcut icon" href="/mail/images/favicon.ico" type="image/x-icon">
<link rel="alternate" type="application/atom+xml" title="Gmail Atom Feed" href="feed/atom" />

I'm not sure if stylesheets are even the single most common use of
<link>.  <style> is probably used as often, and feeds and favicons are
very common.  The more esoteric links like "next" and whatnot are used
by some pretty major websites too (e.g., Wikipedia).

> What I am suggesting for a future version (and hopefully a backport) is a
> simple tag command, say <include href="file" /> that takes the contents of
> the file in question and places it into the position it occupies, similar to
> what a <link href="file" type="text/css" /> does now for CSS, but without
> any type of formatting. In other words the parser reaches it, jumps to the
> other file, reads the HTML as is, then jumps back when the file ends and
> continues. This could also be used to import scripts that are duplicated as
> well. The type requirement is not as practical in this instance and could be
> omitted entirely if it is problematic; users advanced enough to be inserting
> scripts should be able to put enough in the include files to make it work on
> just a raw copy anyway.

Obviously the page would have to block while retrieving the <link>, so
latency would be higher for the first request.  Other than that, I
think this serves a possibly important use-case: compactness of
markup.  Minimizing the length of markup has the potential to reduce
both bandwidth costs (on both ends) and latency on requests other than
the first.  This would serve a similar purpose to Google's SDCH:

http://sdch.googlegroups.com/web/Shared_Dictionary_Compression_over_HTTP.pdf
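
To make the trade-off concrete, here is a back-of-the-envelope sketch in
Python.  The function name and every number in it are made up for
illustration; it assumes the shared fragment is fetched exactly once and
then stays cached, and it ignores HTTP header overhead and compression.

```python
def bytes_transferred(views, page_bytes, shared_bytes, include_tag_bytes):
    """Return (inline_total, include_total) for `views` page loads.

    page_bytes        -- size of the part of each page that is unique
    shared_bytes      -- size of the repeated header/footer markup
    include_tag_bytes -- size of the tag that replaces the shared markup
    """
    # Today: the shared markup is sent inline with every single page.
    inline_total = views * (page_bytes + shared_bytes)
    # With an include: each page carries only the tag, and the shared
    # markup is downloaded once (assumed to stay cached afterwards).
    include_total = views * (page_bytes + include_tag_bytes) + shared_bytes
    return inline_total, include_total


# Hypothetical sizes: 20 kB unique content, 5 kB shared header, 40-byte tag.
first_view = bytes_transferred(1, 20000, 5000, 40)
ten_views = bytes_transferred(10, 20000, 5000, 40)
```

With those invented sizes the first view is slightly heavier with the
include (plus an extra round trip), and the saving only shows up on
later views.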

Most pages have very repetitive headers and so on.  If these could all
be compressed into a single <link> tag that's only served once, that
might be good for performance.  Of course, the HTML document that's
inserted to replace the <link> tag would follow normal caching rules
-- for this to be very useful, it would have to be served with lengthy
Expires headers.  Otherwise you would increase latency on every
request just to get a 304 response.
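
The distinction between "fresh, so no request at all" and "stale, so
ask the server and hope for a 304" can be sketched as below.  This is
only an illustration in Python: the function name is hypothetical, it
looks at the Expires header alone (real browsers also honor
Cache-Control and apply heuristics), and it treats a missing or
unparseable header as already expired.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def can_reuse_without_request(expires_header: Optional[str],
                              now: datetime) -> bool:
    """Return True if a cached include is still fresh per Expires.

    False means the browser would have to contact the server first,
    which costs a round trip even if the answer is only a 304.
    """
    if not expires_header:
        return False
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return False  # unparseable date: assume it has expired
    if expires.tzinfo is None:
        # A date without zone information is interpreted as UTC here.
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires


now = datetime(2009, 5, 11, 1, 37, 30, tzinfo=timezone.utc)
long_lived = can_reuse_without_request("Tue, 11 May 2010 01:37:30 GMT", now)
stale = can_reuse_without_request("Sun, 10 May 2009 00:00:00 GMT", now)
```

Only the long-lived case avoids the network entirely; the stale one
pays the latency on every page view.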

I don't think this would be a great idea solely for the stated reason,
by the way, because of that latency.  It's all very well to include another
file just for tidiness when that file resides on the local disk, or
hopefully even in RAM, where retrieving it is maybe even (in the case
of compiled languages) a one-time cost.  It's quite another thing to
go and fetch lots of includes when they're all the way on the other
end of the Internet.  Includes in HTML would need to be used with
care, and only if there's a clear benefit.

I also don't think <link> is appropriate for this.  <link> indicates
relationships between documents; it's not a processing instruction.  A
name like <include> might be better.

A few further thoughts:

1) The included content (which is probably critical to the page's
meaning) would completely fail to render for legacy user-agents.
Graceful fallback would be difficult, since the whole point is to
avoid including extra code.  Of course, authors could just wait for
all important browsers to support the feature, or do some
autodetection magic and serve different pages.

2) The security implications would be more severe than seamless
iframes, etc.  It would have to be restricted to same-origin.

3) On the first request, all parsing would have to stop when the
include is encountered.  This might not be much worse than <script>
tags, in practice -- but <script> tags would likely be among the
things commonly put *in* these includes, so it could be compounded.
(By contrast, <script>s usually don't attempt to include other
<script>s, at least if well-written.)

4) Fast loading of a page that's almost the same as the current one
is generally implemented with AJAX today.  Allowing sites to avoid
JavaScript for this would be a good thing.

5) Presumably JavaScript could add includes when the user clicks a
button to open up a reply form, view more comments, etc.  Currently
there are only two ways to do this: a) Code the needed HTML into the
page source or a script include.  This increases the weight of every
page load, or at least the first, even though the overwhelming
majority of viewers might not be interested in the functionality.  b)
Fetch the needed HTML dynamically when needed.  But in this case, it's
not cached, unless it's specifically added to localStorage.  Including
content with a proper Expires header would allow the content to be
cached transparently by the browser as space permits, simplifying work
for authors and making more efficient use of disk space (since the
browser can clear its own cache freely, but not localStorage).
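
For what it's worth, points 2 and 3 can be sketched in a few lines of
Python.  Everything here is an assumption for illustration: the
<include href="..." /> syntax is just the one floated in the proposal,
the regex is not a real HTML parser, `fetch` is a stand-in for whatever
retrieves a URL, and the origin check is simplified (it compares
scheme, host and port literally, without normalizing default ports).

```python
import re
from typing import Callable
from urllib.parse import urljoin, urlsplit

# Hypothetical syntax from the proposal: <include href="file" />
INCLUDE_RE = re.compile(r'<include\s+href="([^"]*)"\s*/?>')


def origin(url: str) -> tuple:
    """Simplified origin: (scheme, host, port) exactly as written."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def expand_includes(html: str, base_url: str,
                    fetch: Callable[[str], str],
                    max_depth: int = 5) -> str:
    """Replace each include tag with the fetched markup, same-origin only."""
    def replace(match):
        target = urljoin(base_url, match.group(1))
        if origin(target) != origin(base_url):
            return ""  # cross-origin include: dropped, never fetched
        if max_depth <= 0:
            return ""  # guard against includes that include themselves
        # Each fetch blocks further processing; nested includes compound
        # this, and resolve relative to the included file's own URL.
        return expand_includes(fetch(target), target, fetch, max_depth - 1)

    return INCLUDE_RE.sub(replace, html)


files = {
    "http://example.org/header.html":
        '<h1>Site</h1><include href="nav.html" />',
    "http://example.org/nav.html": "<ul><li>Home</li></ul>",
}
page = ('<body><include href="/header.html" />'
        '<include href="http://evil.example/x.html" /><p>Hi</p></body>')
result = expand_includes(page, "http://example.org/index.html",
                         files.__getitem__)
```

The cross-origin include vanishes without a request, and the nested
include in the header costs a second blocking fetch before the rest of
the page can continue.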


This seems like an interesting proposal.  Perhaps it should be
considered for inclusion into HTML5, or at least HTML6.  It must have
been suggested before at some point, though . . .
Received on Monday, 11 May 2009 01:37:30 UTC