Re: C.7 inclusions and exclusions? from lee@sq.com on 1996-10-18 (w3c-sgml-wg@w3.org from October 1996)

From: <lee@sq.com>
Date: Thu, 17 Oct 96 23:47:54 EDT
To: dgd@cs.bu.edu, w3c-sgml-wg@w3.org
Message-Id: <9610180347.AA25783@sqrex.sq.com>
> What I'd really like is a "one-level" inclusion so that I could declare an
> element and say that it should be included anywhere in the immediate
> content of another element.

I agree strongly.  David, I am not picking on you in what follows -- I
agree with all of your remarks, I think....

> I _know_ we're not going to do that one.

Yeah, it would be too useful :-)  But the way things are going, XML isn't
going to be used very widely anyway.  It's accumulating too many of the
things that, in practice, have made SGML unpopular with those people whose
favour we now seek to curry.  Or so it seems to me.

We have multiple syntaxes for representing structure, "justified" by
nonsense about what is or isn't a document; if your language isn't good
enough to represent non-document objects, fix it, don't switch to another
language mid-stream.

We have arcane whitespace rules (well, it comes and goes).

We have content models that are not standard regular expressions --
if we end up with only "or" groups, though, that's not so bad.
Especially if we lose inclusions, exclusions and "&".

We have comments that "don't work properly".

We have (or am I wrong here?) marked sections and verbatim elements,
in which the parsing rules suddenly change, and which are delimited
by a strange syntax.

It's idiosyncratic.  It's irregular.  It's ugly.  It needs more work.
Well, that's OK, that's what we're here for.
But let's not lose sight of the goal of wide deployment.

Someone very close to the development of SGML once told me that a reason
not to simplify SGML by allowing end tags on EMPTY elements was that it
would allow "amateurs" to write SGML parsers.  But the SGML language
itself was designed by people who were experts in their own fields,
but amateur language designers, as it seems to me.  (I know I don't know
them all, so if there were some domain experts in parser theory on the
committee who weren't sufficiently influential, sorry.)

Let's not repeat that mistake.  Listen to the experts when they speak.

There are good reasons for terms like LALR and LR(1).... :-)

Few drivers want a car with two brake pedals, one for speeds of less than
45 mph and one for greater speeds, nor a separate steering wheel for
turning left when in third gear going downhill.

Different road conditions do not merit different kinds of controls.

Perhaps this is just my frustration, in which case I apologise for
wasting your time.  Maybe other people don't feel the way I do.

Imagine if the Unix C compiler output its parse tree in SGML, and so did
the Pascal compiler, sharing the same code generation algorithm.  Imagine
if they output SGML to the assembler, so that you could insert an
SGML-based code optimiser.  Imagine if you could use the SGML-awk to write
a little peephole optimiser (in the sense of Thompson & Ritchie).
Or monitor the stream going through the compiler to generate a
cross-reference code listing.

What if database schemas were written in SGML?  Spreadsheets?

But right now, even XML is dangerously arcane.

Take a big step back and imagine (if you will) that you had never
seen SGML, and approached the XML specification.  Would it seem easier
to you to adopt it unchanged, to adopt a subset of it that suited you,
or to go off and do your own thing untrammelled by its complexities?

Compared to a full SGML parser, maybe it's a breeze.  Compared to
learning to recite the entire Vedic literature in Sanskrit, writing
an XML parser is also going to be easy, I suppose :-)

Sorry for a long rant.  But I really do think we need to keep reminding
ourselves of this stuff.

Lee
Received on Thursday, 17 October 1996 23:48:20 UTC