Re: XML, SGML & the Web (was: Shorthand for default attributes)

Paul Prescod writes:
 > Bert Bos wrote:
 > >....
 > > If you look at a
 > > graph of the number of documents versus their size, you'll see a curve
 > > that falls off exponentially with increasing document size. This is
 > > not (only) due to the computer; it is the way people function.
 > 
 > The critical point is that we agree that there are large documents and
 > always will be large documents. I think we should be able to further
 > agree that people who edit these large documents should have the right
 > to have them be XML documents in the fullest sense: a single namespace,
 > shared entities, one DTD, one root, one hierarchy, one logical element
 > stream. We must support these documents. Thus we should not introduce
 > features that require linear scanning of documents for proper
 > processing. 

Agreed 100%.

So: we should either require all elements to be exactly 256 bytes
long, or require start tags to have a LENGTH attribute that contains
the byte-offset to the next element :-)

No, the fact is that XML is designed around variable-length elements,
arranged into a tree with a variable number of children at each
node. If large documents are to be processed in random-access mode,
with deterministic response time, they have to be indexed, or stored
in a different format than XML. I'm not making it any worse by
attaching extra information to each node.
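
To make the indexing point concrete, here is a minimal sketch in Python,
using the standard expat binding. The function name build_offset_index
and the sample document are made up for illustration; nothing like this
is in the XML draft. It makes one linear pass to record the byte offset
of every start tag, after which a tool can jump straight to an element
without rescanning.

```python
import xml.parsers.expat


def build_offset_index(data: bytes) -> list[tuple[str, int]]:
    """Scan the document once; return (element name, byte offset) pairs.

    The offset is where the element's start tag begins in `data`.
    """
    index = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        # CurrentByteIndex is the position of the event being reported,
        # i.e. the "<" of this start tag.
        index.append((name, parser.CurrentByteIndex))

    parser.StartElementHandler = on_start
    parser.Parse(data, True)  # True: this is the final chunk of input
    return index


# Hypothetical sample document, for illustration only.
doc = b"<doc><sec id='a'>one</sec><sec id='b'>two</sec></doc>"
index = build_offset_index(doc)

# Random access: jump to the third element without rescanning.
name, offset = index[2]
fragment = doc[offset:]
```

The one linear scan is the price of variable-length elements. The index
lives outside the document, so the XML itself is unchanged.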

 [...]
 > >     3. XML shall be compatible with SGML.
 > > 
 > >        1. Existing SGML tools will be able to read and write XML data.
 > > 
 > >        2. XML instances are SGML documents as they are, without changes to
 > >           the instance.
 > > 
 > >        3. For any XML document, a DTD can be generated such that SGML will
 > >           produce "the same parse" as would an XML processor.
 > > 
 > >        4. XML should have essentially the same expressive power as SGML.
 > > 
 > >     Note: #1 and #2 describe our goal in its ideal form. If this goal is
 > >     not achievable in its fullest form, then we may back out to a weaker
 > >     form: it shall be simple to transform XML documents into equivalent
 > >     SGML documents, and vice versa. Our intention, however, is to bite the
 > >     bullet and ensure if we can that no transformation is needed to allow
 > >     SGML tools to read and write XML document instances.
 > > 
 > >     #3 and #4 indicate our intentions accurately, but it is not yet clear
 > >     how best to formalize and explain the phrase "the same parse", or the
 > >     phrase "essentially the same expressive power". These remain open
 > >     questions; see point 8 also.
 > > 
 > > Clearly points 1 and 2 are not met, so, according to the note, the
 > > spec should instead have a section on the recommended way to translate
 > > back and forth, with minimal loss of information.
 > 
 > That is not true. Point 2 has been met fully. Point 1 was half-met.
 > Existing SGML tools *can* read XML documents. They just cannot
 > (typically) write them without some small tweaks.

What about the "/>" delimiter? What about the "encoding="? What about
the keep-all-whitespace rule? What about the absence of !doctype?
"Almost met" still means it failed.


Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/pub/WWW/People/Bos/                      INRIA/W3C
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 4 93 65 77 71               06902 Sophia Antipolis Cedex, France

Received on Thursday, 15 May 1997 15:56:48 UTC