Re: Recent ERB votes from Jon Bosak on 1996-11-07 (w3c-sgml-wg@w3.org from November 1996)

From: Jon Bosak <bosak@atlantic-83.Eng.Sun.COM>
Date: Thu, 7 Nov 1996 11:37:45 -0800
To: W3C-SGML-WG@w3.org
CC: bosak@atlantic-83.Eng
Message-Id: <199611071937.LAA08443@boethius.eng.sun.com>

[Paul Grosso:]

| > From: bosak@atlantic-83.Eng.Sun.COM (Jon Bosak)
| > 
| > Note also that this strategy does not discriminate against the
| > existing SGML document base.  There are probably as many existing SGML
| > documents that will work unchanged in an XML environment as there are
| > HTML documents.  My Shakespeare and Religious Works collections are
| > valid XML just as they stand
| 
| How is that?  How are empty elements represented in your existing SGML
| document base for your Shakespeare and Religious Works collections?  

There are none.

| Irrespective of the answer to the above, what SGML authoring tools 
| would produce valid XML given document instances in your Shakespeare 
| and Religious Works collections?

The SGML authoring tools that I actually did use: emacs and perl.  The
fact that these are probably not examples of what you have in mind
when you say "SGML authoring tools" says something important about the
preconceptions you may have about XML.

| Assuming neither Author/Editor nor Adept Editor (whose output, I
| believe, is very similar) is an answer to the above question, what
| would be the list of modifications necessary to convert what they would
| produce given document instances in your Shakespeare and Religious
| Works collections into valid XML?

I think that Lee has addressed this.  However, the question misses the
point.  The primary use of XML is to convey structured information
from SGML databases to Web applications.  The batch processes that I
used to generate the Shakespeare and Religion collections are actually
much closer in spirit to the problem domain that XML is primarily
designed to address than anything having to do with native authoring.
The point I was trying to make (but complicated by using the word
"valid") is that if you strip off the current doctype header from the
Shakespeare and Religion files, you have well-formed XML that could (if
such things existed) be fed directly to XML browsers.  The fact that I
was writing well-formed XML in 1992 says to me that XML is "natural
SGML" -- it is what I intuitively thought that SGML was when I
started.  And consequently I believe that a lot of existing "monastic
SGML" can be made into XML much more easily than the great majority of
existing HTML documents.
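
The "strip off the doctype header" step can be illustrated with a minimal
sketch. Everything in it is an assumption made for illustration: the
sample document is invented rather than taken from the Shakespeare or
Religion files, the helper names strip_doctype and is_well_formed are
hypothetical, and a present-day Python XML parser stands in for the XML
browsers that did not yet exist.

```python
import re
import xml.etree.ElementTree as ET

# Matches one DOCTYPE declaration, with or without an internal subset in [...].
# This is a simplification: it does not handle ']' or '>' inside quoted
# literals within the internal subset.
DOCTYPE_RE = re.compile(
    r"<!DOCTYPE\s[^>\[]*(\[.*?\])?\s*>", re.DOTALL | re.IGNORECASE
)


def strip_doctype(text):
    """Remove the first DOCTYPE declaration, leaving the document instance."""
    return DOCTYPE_RE.sub("", text, count=1)


def is_well_formed(text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample: every element is explicitly closed and there are
# no empty elements, as described for the collections above.
sample = (
    '<!DOCTYPE play SYSTEM "play.dtd">\n'
    "<play><title>Hamlet</title><act><scene>Elsinore.</scene></act></play>"
)

instance = strip_doctype(sample)
print(is_well_formed(instance))

# An SGML-style empty element with no end tag is not well-formed XML.
print(is_well_formed("<doc>line<br>line</doc>"))
```

The second check shows why the absence of empty elements matters: a
document that relies on SGML-style empty elements or omitted end tags
would need conversion before it passed the same test.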

I will repeat the point that I want to make sure doesn't get lost
here: the XML spec does not favor HTML legacy data over SGML legacy
data; quite the contrary.

Jon

Received on Thursday, 7 November 1996 14:39:54 UTC