RE: SD1 - Short End Tags from Jon Bosak on 1997-05-17 (w3c-sgml-wg@w3.org from May 1997)

From: Jon Bosak <bosak@atlantic-83.Eng.Sun.COM>
Date: Sat, 17 May 1997 02:05:44 -0700
To: w3c-sgml-wg@w3.org
Message-Id: <199705170905.CAA16101@boethius.eng.sun.com>

[Andrew Layman:]

| I'm getting a lot of pushback from database people regarding this
| point. They are very concerned that we make it possible for them to be
| more economical in their encoding. Accomodating their needs means
| opening up a whole additional category of XML user.

I forgot one important category of existing user who would be
inconvenienced by short tags if that were the only form allowed (this
is relevant because I think that there shouldn't be options in this
case).  This is every current large-scale SGML user whose tool (like
Adept) normalizes files when saving them by providing explicit end
tags.  There are a few hundred million pages of such stuff out there,
and I don't want to make this the illegal form.  (Yes, they have to
filter for the new empty-tag syntax anyway, but this is an additional
annoyance for tools vendors.)  Note that no one ever complains about
the extra space this normalization requires; on the contrary, most
people consider this a feature.  I do.

The larger point is what gets traded for what is gained.  I'm just not
seeing the big savings in byte count that motivate this proposal.
Simply adopting Unicode roughly doubles file sizes, and adopting UTF-8
as the default encoding adds roughly another 50 percent for the Asian
users; I'm not hearing anyone complaining about these tradeoffs.  Drop
one extra GIF image onto a page and you've wiped out any conceivable
reduction in size gained by short tags.  And in the pure database
examples, more space would have been saved by a smarter choice of GIs
than by leaving them out of the end tags.

I'm still trying to maintain an open mind on this one, but it's
reminding me so much of all the articles written in the 1970s on how
to save space in BASIC programs by leaving out the spaces between
tokens that I'm having a difficult time shaking off the feeling that
I've seen this particular movie already.

Jon

Received on Saturday, 17 May 1997 05:06:14 UTC