Re: XML and required DTDs

On Tue, 17 Sep 1996 10:11:54 -0400, "Steven J. DeRose" <sjd@ebt.com> wrote:

>At 01:02 AM 09/17/96 GMT, Charles F. Goldfarb wrote:
>>On Mon, 16 Sep 1996 16:52:44 -0400, "Steven J. DeRose" <sjd@ebt.com> wrote:
>>>The only things blocking you from parsing a one-entity minimal SGML document
>>>without a DTD are:
>>>   a) Declared content (CDATA/RCDATA/EMPTY elements)
>>>   b) The RE-ignoring rules.
>>Exactly! Just eliminate declared content and mixed content and you've solved
>>the problem. We don't need those constructs for XML, they are just forms of
>>markup minimization parading under other names.
>I don't see yet what mixed content has to do with it. 
As James Clark pointed out in an earlier posting, eliminating mixed content is
the only way to avoid the whitespace treatment problem.

>I do see need for keeping mixed content; it's just so messy to have to tag every
>pseudo-element as a real element. 
It doesn't have to be that way. With a few fixed shortrefs, it can be no worse
than some of the proposals for enriching tag delimiters. For example, all of
the following three equivalent alternatives can trivially be produced from
instances of arbitrary DTDs and trivially transformed without loss between SGML
and XML.

<p>/This is the 1st pseudo-element /
/and this is another./</p> 

<p>"This is the 1st pseudo-element "
"and this is another."</p>

<p>[This is the 1st pseudo-element ]
[and this is another.]</p>
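
To make the "trivially transformed" claim concrete, here is a minimal sketch in
Python of turning each delimited form into explicitly tagged pseudo-elements.
The element name "pe" and the function name are hypothetical, chosen only for
illustration, and the sketch assumes the delimiter character never occurs inside
the data itself or inside a tag.

```python
import re

def tag_pseudo_elements(markup, open_delim, close_delim, name="pe"):
    """Rewrite delimited runs of text as explicit <name>...</name> elements.

    Assumption: the delimiter characters never appear inside the data
    or inside a tag. "pe" is a made-up element name for this sketch.
    """
    # Match open_delim, then any run of non-closing characters, then close_delim.
    pattern = re.compile(
        re.escape(open_delim)
        + "([^" + re.escape(close_delim) + "]*)"
        + re.escape(close_delim)
    )
    # Split so that tags land at odd indices and text at even indices;
    # only text is rewritten, so the "/" in "</p>" is never touched.
    parts = re.split(r"(<[^>]+>)", markup)
    for i in range(0, len(parts), 2):
        parts[i] = pattern.sub(r"<%s>\1</%s>" % (name, name), parts[i])
    return "".join(parts)

slash_form = "<p>/This is the 1st pseudo-element /\n/and this is another./</p>"
quote_form = '<p>"This is the 1st pseudo-element "\n"and this is another."</p>'
bracket_form = "<p>[This is the 1st pseudo-element ]\n[and this is another.]</p>"

tagged = [
    tag_pseudo_elements(slash_form, "/", "/"),
    tag_pseudo_elements(quote_form, '"', '"'),
    tag_pseudo_elements(bracket_form, "[", "]"),
]
```

All three inputs come out as the same tagged text, with the record end between
the two pseudo-elements left outside both of them.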

>The only difficulties with mixed content
>are the whitespace treatment we've already decided has problems, 
>and a few cases of OMITTAG that will likely also go away in XML, right? Is there some
>problem of mixed content that we aren't going to address another way already?

Unfortunately, the existing proposals for handling whitespace don't work (see
earlier postings from myself and James Clark). There are only two solutions that
work:

1. RE/RS handling a la SGML (or, preferably, a la the proposals for SGML97,
which simplify things considerably); or

2. Explicit tagging of pseudo-elements (as you so nicely put it; this phrase
conveys the idea much better since we are not actually eliminating mixed content
structurally, we are just tagging the pseudo-elements).
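
As a rough illustration of why option 2 sidesteps the whitespace problem, the
following Python sketch applies one possible DTD-less rule: if an element has
any child elements, whitespace-only text directly inside it is treated as
ignorable, while text in an element with no children is kept as data. That rule
is an assumption made for this sketch, not something taken from the proposals,
and the element name "pe" is again hypothetical.

```python
import xml.etree.ElementTree as ET

def strip_ignorable_whitespace(elem):
    """Drop whitespace-only text that sits between child elements.

    Assumed rule (not from any spec): an element that has child elements
    holds element content only, so whitespace-only text directly inside
    it is ignorable. Text in a childless element is data and is kept.
    """
    if len(elem) > 0:
        # Whitespace before the first child.
        if elem.text is not None and elem.text.strip() == "":
            elem.text = None
        for child in elem:
            # Whitespace after each child, up to the next tag.
            if child.tail is not None and child.tail.strip() == "":
                child.tail = None
            strip_ignorable_whitespace(child)
    return elem

doc = ET.fromstring(
    "<p>\n<pe>This is the 1st pseudo-element </pe>\n"
    "<pe>and this is another.</pe>\n</p>"
)
strip_ignorable_whitespace(doc)
result = ET.tostring(doc, encoding="unicode")
```

The record ends between the tags disappear, while the trailing space inside the
first pseudo-element survives because it is data.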

Best regards,


Charles F. Goldfarb * Information Management Consulting * +1(408)867-5553
           13075 Paramount Drive * Saratoga CA 95070 * USA
  International Standards Editor * ISO 8879 SGML * ISO/IEC 10744 HyTime
 Prentice-Hall Series Editor * CFG Series on Open Information Management