Re: Recent ERB votes from Jon Bosak on 1996-11-07 (w3c-sgml-wg@w3.org from November 1996)

From: Jon Bosak <bosak@atlantic-83.Eng.Sun.COM>
Date: Thu, 7 Nov 1996 08:17:00 -0800
To: W3C-SGML-WG@w3.org
CC: bosak@atlantic-83.Eng
Message-Id: <199611071617.IAA08349@boethius.eng.sun.com>

[Paul Prescod:]

| > Paul [Grosso] asked:
| > Frankly, this HTML compatibility thing is a total waste of time.
| > To be at all HTML comptibile, you have to cope with
| > <UL COMPACT>
| >     <li>first item
| >     <li>second item
| >     <b><li>bold item</b></li>
| >     <hr>did you know hr isn't allowd here?</hr>
| >     <li><img src=http://bet/you/need/quotes/in/sgml.gif>
| > </UL>
| > 
| > Yes, this "works" in Netscape.  Say it isn't calid all you like.
| > shout until you're blue in the face.  But reading this is what
| > HTML compatibility is about today.

No.  That's not what we're trying to accomplish.

| I think that the ERB is looking at it from the opposite point of view. They
| are trying to define XML so that an HTML-like subset of XML can be 
| "understood" by existing Web browsers.

Right, though the emphasis is really more the other way around.

The idea is *not* to grandfather in existing HTML; in fact, the spec
bars virtually all existing HTML documents just because of the end-tag
requirement.  Rather, the idea is to make it *possible* to create new
documents that are both valid HTML (according to the HTML 3.2 spec)
and valid XML.  Such documents would look a lot different from the
typical HTML page today, but they would have the advantage of working
in both the HTML and XML application environments.  So HTML users can
run a normalizer on their existing HTML document bases and get
documents that will continue to be viewable by current HTML browsers
but will also work in new XML browsers.  Without making the one
extremely limited concession to existing HTML empty elements that came
out of our final round of deliberations, this would not be possible.

Please note that there is a huge difference between a mechanism for
defining "unqualified" empty elements on an ad-hoc basis and a
mechanism that simply recognizes a small, closed set of specific
strings ("<BR>", "<IMG", etc.).  The first approach is not only far
more difficult to implement if you're starting from scratch (which is
what the implementation requirement for XML is all about), but it
allows people to go on forever using a syntax we want desperately to
get away from.  The second approach recognizes a logistical reality by
providing an absolutely minimal bridge across the gap from HTML to
SGML without putting in place anything that would encourage a practice
that we are trying to eliminate.

Note also that this strategy does not discriminate against the
existing SGML document base.  There are probably as many existing SGML
documents that will work unchanged in an XML environment as there are
HTML documents.  My Shakespeare and Religious Works collections are
valid XML just as they stand; it would be hard to point to existing
HTML collections of similar size that could make that claim.

Jon

Received on Thursday, 7 November 1996 11:19:03 UTC