RE: SD1 - Short End Tags from Michael Sperberg-McQueen on 1997-05-21 (w3c-sgml-wg@w3.org from May 1997)

From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
Date: Wed, 21 May 97 07:37:57 CDT
To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Message-Id: <199705211250.IAA15083@www10.w3.org>

On Mon, 19 May 1997 13:20:43 -0400 (EDT) Andrew Layman said:
>Several people have suggested that we could save as much space as short
>end tags offer by putting data into attributes rather than elements.
>One could turn this around and say that if we had short end tags, we
>would be much less motivated to put data into attributes!

Yes; some of us would regard that as a real loss, of course ...

>What this points out is the urgency of settling the question of whether
>there is a real distinction between contents and attributes. Once we
>know the answer to that, a great deal follows. I hope to post a large
>essay on that point later today.

One critical difference is simple:  the sequence of attribute-value
specifications in a start-tag carries (by definition, I believe; if not,
then by convention) no information, and the values of attributes are
(by definition, as far as 8879 and XML-lang can manage it) atomic
in the usual dbms sense.  Subelements, by contrast, have a sequence
which is normally significant (you have to take special steps, outside
the boundaries covered by 8879, to indicate when the sequence of
subelements is *not* information), and elements are -- as Andrew
Layman well knows -- potentially structured.

The analogy between the attributes of an SGML element and the
attributes / columns / fields of a relational table (atomic values,
with their sequence non-significant by definition) seems to me
very suggestive, and seems to argue that dumping the columns of
a relational table as attributes of an empty element would be
a very natural method of representing them.

On the general question - we now have modems that do data compression
on the fly (we have had for almost ten years now), and Bert Bos is
assuring all of us that the network bandwidth problems of today
will be only a hazy memory in a very short time.  The byte savings
of empty end-tags seem to me negligible, and I can't believe the
DB people are actually motivated by them, even if they say and think
they are.  They may have an unspoken sense that these bytes just
aren't carrying their weight and could be dropped without harm.

If byte count is the issue, use short names and attributes, and
compression.

If an aversion to redundancy is the issue, then I'm with Jon on
this one:  this is a redundancy tradeoff that makes casual
XML processing so much simpler that it's well worth while, even
if it isn't to everyone's taste.

Just Say No to empty end-tags.

-C. M. Sperberg-McQueen

Received on Wednesday, 21 May 1997 08:50:37 UTC