Re: XML vrs SGML tools [was Re: Capitalizing on HTML (was ...)]

On Fri, 20 Sep 1996 09:16:50 -0400, Paul Prescod
<papresco@calum.csclub.uwaterloo.ca> wrote:

>At 02:17 AM 9/20/96 GMT, Charles F. Goldfarb wrote:
>>It seems like the two Pauls are agreeing that 
>I can only speak for me. =)
>>1. XML has to be easy for SGML editors (and word-processor add-ons) to generate
>>and for arbitrary browsers to read.
>Right. That's why I'm worried about this RS/RE correction proposal. I don't
>think today's SGML editors will support it. Otherwise, the proposal is quite
>elegant and simple.
They don't actually have to. Any SP-based hacking tool (and probably Omnimark
too) could perform the SGML->XML transform and its reverse.

>>We needn't worry about XML-only editors or
>>dedicated XML browsers because there won't be any. 
>I would word this a little differently: "We can't _depend_ on XML-only
>editors or dedicated XML browsers because there MAY NOT be any." (especially
>in the short term)
I stand corrected. But I think the implication for the XML design is the same. 

For myself (not paraphrasing two Pauls, as before), I am troubled by the idea of
no XML browsers because I thought that the purpose of XML was to encourage tool
development. If we don't expect any XML tools, why design XML? My personal
belief is that if we do XML well, there will be XML browsers. I don't believe
we'll see any dedicated XML editors any time soon, however; just XML modes in
SGML editors and trivial transform tools for SGML<->XML. But that is not
necessarily a bad thing, given the objectives of XML.

>>2. The browsers need to be able to work without accessing a DTD.
To summarize my proposal, then, insofar as it affects the document instance:

1. Delimit pseudo-elements.
2. Supply end-tags for empty elements.
3. Allow CDATA and RCDATA marked sections.

An XML browser can handle all this with just a simple stack to match ends of
elements and marked sections with their starts. We can decide whether to allow
comment declarations, PIs, SHORTTAG, ignored marked sections, inclusions, etc.
on their own merits, as they won't affect the basic parse.
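
As an illustration of the stack-based parse described above, here is a minimal
sketch in Python. It is not part of the proposal. It assumes simplified
delimiters: start-tags of the form <name ...>, end-tags </name>, and marked
sections opened with <![CDATA[ or <![RCDATA[ and closed with ]]>. The names
TOKEN and check_nesting are hypothetical. Comment declarations, PIs, and entity
references are not handled. Since CDATA and RCDATA content is not scanned for
tags, the sketch skips to the section's close instead of pushing it on the
stack.

```python
import re

# Hypothetical token pattern: marked-section start, end-tag, or start-tag.
TOKEN = re.compile(
    r"<!\[\s*(?P<ms>CDATA|RCDATA)\s*\["      # marked-section start
    r"|</(?P<end>[A-Za-z][\w.-]*)\s*>"        # end-tag
    r"|<(?P<start>[A-Za-z][\w.-]*)[^<>]*>"    # start-tag (attributes skipped)
)


def check_nesting(text):
    """Return True if every element and marked section is properly closed."""
    stack = []  # names of currently open elements
    pos = 0
    while True:
        m = TOKEN.search(text, pos)
        if m is None:
            break
        if m.group("ms"):
            # Marked-section content is not scanned for tags:
            # jump directly to the closing delimiter.
            close = text.find("]]>", m.end())
            if close < 0:
                return False  # unterminated marked section
            pos = close + 3
        elif m.group("start"):
            stack.append(m.group("start").lower())
            pos = m.end()
        else:
            # An end-tag must match the most recently opened element.
            if not stack or stack.pop() != m.group("end").lower():
                return False
            pos = m.end()
    return not stack  # anything left open is an error
```

With this sketch, check_nesting("<p><br></br></p>") accepts the input, while
check_nesting("<p><br></p>") rejects it: without a DTD the checker cannot know
that br is empty, which is why item 2 supplies end-tags for empty elements.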

And this solution is valid 8879 because an arbitrary DTD (including one with
mixed content and declared content) can trivially (and reversibly) be
transformed to an equivalent DTD that supports the above XML instance.

Charles F. Goldfarb * Information Management Consulting * +1(408)867-5553
           13075 Paramount Drive * Saratoga CA 95070 * USA
  International Standards Editor * ISO 8879 SGML * ISO/IEC 10744 HyTime
 Prentice-Hall Series Editor * CFG Series on Open Information Management