Will XML go beyond SGML Users?

At 01:15 PM 9/30/96 GMT, Charles F. Goldfarb wrote:
>[Paul also said:]
>>For instance, let's say Jane Author is working in Word for Windows. The
>>document she is creating does not conform to any DTD I know of. When its
>>done, however, she wants to deliver it as XML. Why? 
>
>Not a chance! Because:
>
>>
>> * Because XML is widely supported (we hope). 
>
>RTF is available on every non-Unix system (and maybe on Unix systems too.)

RTF is VERY poorly standardized and RTF import/export is a risky proposition.

>> * Because XML is compact (we hope). 
>
>She has no idea of the size of her files.

If she's delivering them on the Web she does. If she's a half-decent
web-master she knows exactly how long it takes to download her site on a
14.4K modem. Let's call her "Jane Webmeister" if it makes her seem more
web-knowledgable.

>> * Because XML will preserve whatever structure exists in the document (and
>>Word allows quite a bit). 
>
>She doesn't know what structure is, nor does she care. If it looks right, it is
>right.

But it DOESN'T LOOK RIGHT because HTML can't allow it to look right. I have
never seen a document exported from a word processor to HTML that "looked
right." It's just too hard to map wordprocessor semantics into HTML
primitives. WordProcessor users want to be able to export their documents
and get something recognizable (as their document) out on the other end.
(the [generated HTML] seldom is)

>> * Because XML is more portable, more device-independent and more widely
>>supported than Word for Windows format.
>
>We might create an XML that is more portable and device-independent, but more
>widely supported than the flagship application of the world's largest computer
>company ... pure fantasy!

As I understood it, it was our goal that all browsers should support XML
just as they do HTML. Since even Internet Explorer cannot open either RTF or
Word files, I think that accomplishing our goal will have accomplished my
"fantasy"! HTML is already more widely supported than Word and I don't see
why XML shouldn't follow in its footsteps (though perhaps not displace it).

>> * Because XML is "open" and standardized.
>
>Not important ... she'd rather have the latest vendor goodies. (Consider the
>experience with Netscape and HTML.)

But XML _is_ the latest vendor goodies. It would play very well in the press
release war for Microsoft to announce that they support not only HTML, but
it's extensible cousin, XML which will allow intranet application deliverers
to build fine-grained object model according to their own schemas and
systems (I believe that this is Len's unburgled house), and for authors to
create their own tags to dramatically reduce the amount of markup they have
to do for a particular task.

Think of it another way: XML could be the obvious delivery platform FOR the
latest vendor goodies. XML gives Netscape the opportunity to claim that
<BLINK> is a "standard" (since an XML document could certainly include
<BLINK>). In other words, XML is a standard that gives vendors flexibility
to play their games, whereas HTML does not. For better or for worse, XML
will be a W3C carte blanche to start the tag escalation war again. Luckily
we will have a rich enough language that we can ignore that stuff and use
what we want.

>I think the sooner we abandon the fantasy that anyone but SGML users will want
>XML the sooner we can focus on designing a viable language. 

I think that if we design a language that is meant to be used by the
"traditional SGML users" it will get exactly as much vendor support as SGML.
What will have been the point? XML must go beyond the SGML niche or it will
be the last in line for vendor support behind CorelDraw graphics and RTF files.

But I think you are wrong that other people don't want it.

HTML users already want reliable fragment trasclusion and the ability to
make their own tags.

Traditional database users already want the ability to define their own
schemas. . Microsoft and W3C are trying to [move processing "smarts"] into
the document to work with the data in the document on the client side. The
simpler the document format, the simpler that is. Working with crud like the
typical HTML page would be VERY PAINFUL.

Microsoft would love to use a [single language] for the delivery of all of
their online documentation. Since they have already embraced SGML for their
CD-ROM products, and the gap between Windows Help and HTML is large, XML
seems the obvious choice.

>Here is my
>reasoning:
>
>1. No vendor with a dominant market share will voluntarily adopt a standard
that
>will open his market to competitors. 

Luckily, at this point Microsoft does _not_ have a dominant market share,
and they have proved themselves willing to adopt a "standards pose" in order
to play the good guy. They have even mentioned the acronym SGML in a [major
press release] in that vein. Since they are already strolling down the
[stylesheet street], I don't see why they wouldn't take an XML detour once
we explain its benefits.

>2. No users will press a vendor for XML support except those who already
>understand the value of a structured,  renditionless information representation
>-- in other words, SGML users.

Not true. SGML has always been easy to sell. Almost everyone who hears about
it understands its value. It's the cost of using it that traditionally
scares them off. Delivering SGML documents to Web clients is _expensive_ and
_cumbersome_. If we can make it cheap and easy, I don't know why anyone
would voluntarily limit themselves to a subset.

>3. They won't get that support unless we make XML absurdly easy to implement.
>That means a delivery form of an SGML instance with no tricky parsing,no SGML
>declaration, and no DTD.

Agreed.

 Paul Prescod

[move processing "smarts"] into the client
    [W3C Activity Page] http://www.w3.org/pub/WWW/OOP/Activity 
    [More Juice For Web Pages] http://www.news.com/News/Item/0,4,3562,00.html
    [Microsoft HTML Obj. Model]
http://www.microsoft.com/corpinfo/press/1996/sept96/htmlpr.htm

[generated HTML]
    [Stroud Reviews HTML editors]http://www.icorp.net/stroud/shtml.html
      (note that word processors as HTML editors consistently score at the
bottom)
 
[major press release] http://www.microsoft.com/internet/html.htm

[stylesheet street] http://www.microsoft.com/ie/most/howto/htmlenh.htm

[single language]
    http://www.microsoft.com/corpinfo/press/1996/sept96/JUMPSTpr.htm

Received on Monday, 30 September 1996 11:36:01 UTC