Re: RS/RE quoted data proposal

> An SGML-editor friendly one like this:
> <P><C>abcdefgh</C><EM><C>ijklmnopq</C></EM></P>
> and a typist-friendly alias like this:
> <P>"abcdefgh"<EM>"ijklmnopq"</EM></P>

The latter syntax is certainly not typist friendly.
What happens if you omit a quote?

If you're going to do that, use
@p{abcdefgh@em{ijklmnopq}} instead -- this can also be done with SGML.
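
To show that the brace notation above maps onto tags mechanically, here
is a rough Python sketch. The function name braces_to_tags is made up
for illustration, and the sketch assumes there is no need to escape a
literal "}" inside text; a real translator would have to settle that.

```python
import re

# Matches an element opener such as "@p{" or "@em{".
OPENER = re.compile(r"@([A-Za-z][A-Za-z0-9]*)\{")

def braces_to_tags(src: str) -> str:
    """Translate @name{...} notation into <NAME>...</NAME> tags."""
    out = []
    stack = []  # names of the elements that are currently open
    i = 0
    while i < len(src):
        m = OPENER.match(src, i)
        if m:
            name = m.group(1).upper()
            stack.append(name)
            out.append(f"<{name}>")
            i = m.end()
        elif src[i] == "}" and stack:
            # A closing brace ends the most recently opened element.
            out.append(f"</{stack.pop()}>")
            i += 1
        else:
            out.append(src[i])
            i += 1
    if stack:
        raise ValueError(f"unclosed element(s): {stack}")
    return "".join(out)

print(braces_to_tags("@p{abcdefgh@em{ijklmnopq}}"))
# prints <P>abcdefgh<EM>ijklmnopq</EM></P>
```

Note that an omitted closing brace is detected as an error here rather
than silently swallowing the rest of the document.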

But I think this is all beside the point.
If you allow mixed content, and say (SGML notwithstanding) that
any amount of whitespace is equivalent to a single space, in all
circumstances, you get something that can be handled with no DTD,
with no need to distinguish element & text contexts, and that is
easy to explain.
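
As a minimal sketch of that rule (in Python, with a function name I have
invented for the purpose), the whole of the whitespace handling comes
down to one substitution, applied the same way everywhere and without
reference to any DTD:

```python
import re

def normalize_whitespace(text: str) -> str:
    """Replace every run of spaces, tabs and newlines with one space."""
    return re.sub(r"\s+", " ", text)

sample = "<P>this is a\n   <em>simpler</em>\tapproach.</P>"
print(normalize_whitespace(sample))
# prints <P>this is a <em>simpler</em> approach.</P>
```

The function never needs to know whether it is inside element content
or text, which is the point.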

Although the "..." proposal is a very clever use of relatively obscure
SGML, I don't fancy explaining it: when you can include newlines in
quoted ``strings'', or why the emphasised text in the example is
conceptually embedded in the text that surrounds it, even though the
syntax makes this impossibly obscure.

A language needs to be
* consistent & regular
* easy to explain
* clearly and unambiguously defined.

It is a fact of history that SGML does not meet these goals, but any new
system for which wide deployment is desired must do so.

Change the whitespace rules.  Accept the idea of a mechanical translation
between HTML and XML, and between SGML and XML, and that if you go from a
richer format to a weaker one you may lose information.  Then you'll get
something that can actually be used.

If you have to accept all currently legal HTML syntax (as some have opined),
or if you are encumbered by needless complexities that are in the more
powerful language because of features not used in XML, you'll end up with
something irregular and complex.


<P>this is a <em>simpler</em> approach.</P>