[Prev][Next][Index][Thread]

Re: equivalent power in SGML and XML



On Mon, 16 Sep 1996 21:13:26 -0400, "Eve L. Maler" <elm@arbortext.com> wrote:

>[Thanks, Charles, for correcting me on the ESIS/EMPTY issue...]
My pleasure ... now I'll try to show why that means that EMPTY is unnecessary.
>
>Michael -- compared to my poor attempt, your mail seemed crystal clear
>to me!  You've explained why EMPTY isn't just another random kind of 
>declared content.
He's tried to, but I think he is wrong.
<snip>
>>
>>Dropping EMPTY, on the other hand, would seriously affect the expressive
>>power of XML DTDs, since we could not guarantee the second
>>DTD-equivalence clause above.
I don't think so. The effective normalized (i.e., ESIS or semantic grove form)
of all empty elements, whether declared or spontaneously empty, is:
"<tag></tag>"
Therefore, a content model of (#PCDATA) is equivalent to a declared value of
EMPTY.
(There is a difference in validatability by an SGML parser, but that is not an
issue here ... and isn't much of an issue anywhere.)
>>
>>
>>III What is needed?
>>
>>I think we should shoot for the following:
>>
>> - any existing SGML document can be translated into an EE-ESIS
>>equivalent document (i.e. same ESIS, same gross entity structure, but
>>there may be more, or fewer, internal entities and references)
>> - any existing SGML DTD can be translated into an XML DTD which
>>recognizes a set of documents EE-ESIS-equivalent to the original DTD.
>> - a document translated from SGML into XML into SGML should be
>>EE-ESIS-equivalent to the original; same for one that goes X-S-X.

Eliminating declared content (including EMPTY) and mixed content is all that is
needed to accomplish this goal. Translating from the SGML instance with declared
and mixed content to the XML without those things is trivial and is reversible
without information loss. The resulting XML can be parsed (but not validated, of
course) without reference to a DTD.

This solution does not in any way dictate the syntax of XML DTDs. What it does
do  is remove the need for new markup conventions or syntax in XML instances, or
the need to require XML applications to enforce conventions for RE handling.

This solution also has the significant advantage, described by Paul Grosso, in
that existing SGML tools could easily create and receive XML documents, and that
essentially all existing document types could easily be used for XML.

And yes, Eve, I think you are right in thinking this technique would work for
RANK, and possibly even DATATAG as well.

Best regards,

Charles

--
Charles F. Goldfarb * Information Management Consulting * +1(408)867-5553
           13075 Paramount Drive * Saratoga CA 95070 * USA
  International Standards Editor * ISO 8879 SGML * ISO/IEC 10744 HyTime
 Prentice-Hall Series Editor * CFG Series on Open Information Management
--


References: