XML Information Set Requirements, W3C Note 18-February-1999 from Clark Evans on 1999-02-19 (www-xml-infoset-comments@w3.org from January to March 1999)

From: Clark Evans <clark.evans@manhattanproject.com>
Date: Fri, 19 Feb 1999 03:08:49 +0000
To: www-xml-infoset-comments@w3.org
CC: david@megginson.com, xml-dev@ic.ac.uk
Message-ID: <36CCD5C1.33353EA6@manhattanproject.com>
> Editors: David Megginson(david@megginson.com) 

Thank you, David.  It looks like you did a great job
making the document clear.  I have only one (long)
comment, which I have made several times on xml-dev.
Apologies to those who have heard my soapbox before.

:) Clark Evans

> Abstract:
>
> This document lists the design principles and requirements for
> the XML Information Set, a meta-model for XML documents

Stating that XML is a *document* standard effectively
precludes this XML standard from a very useful
application as stream markup.  This is a very subtle
bias which, in my opinion, will severely damage the XML
standard if it continues.

Now, I *do* like the word "infoset" because
it is much more illustrative of what I see the
goal as being: a way to represent document subsets
or stream fragments.  This view of an "infoset"
will reap huge rewards, because it removes the implicit
prerequisite that the entire document be present
before the infoset has value.

Suggestion:  Strike "document" and replace 
with "document subset".  Where "subset" includes
not only proper subsets, but also the possibility 
of the infoset representing the entire document.

In this way, you reserve the much more powerful
ability to focus on the XML as a stream and
break it into manageable chunks based upon
the needs of the processing tool.  Picture a
stack-based mechanism, where the "smallest"
document fragment which can satisfy the
needs of the transformation or process in
question is kept in multi-pass storage,
allowing the remainder of the information
to be handled using a single-pass mechanism.
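
A minimal sketch of that idea, using Python's standard xml.sax
module.  The element names ("orders", "order", "item"), the
"price" attribute, and the names FragmentBuffer and order_totals
are all hypothetical, chosen only for illustration.  Events outside
the fragment of interest are seen once and dropped; each fragment
is built on a stack, handed to a callback that may traverse it as
often as it likes, and then released.

```python
import xml.sax
import xml.etree.ElementTree as ET


class FragmentBuffer(xml.sax.ContentHandler):
    """Buffer only elements named `fragment_tag`; stream everything else."""

    def __init__(self, fragment_tag, on_fragment):
        super().__init__()
        self.fragment_tag = fragment_tag
        self.on_fragment = on_fragment
        self.stack = []  # open elements of the fragment being buffered

    def startElement(self, name, attrs):
        # Start buffering at the fragment root, or continue inside one.
        if self.stack or name == self.fragment_tag:
            elem = ET.Element(name, dict(attrs.items()))
            if self.stack:
                self.stack[-1].append(elem)
            self.stack.append(elem)
        # Otherwise: single-pass territory, nothing is retained.

    def characters(self, content):
        if not self.stack:
            return
        top = self.stack[-1]
        if len(top):  # text that follows a child element
            last = top[-1]
            last.tail = (last.tail or "") + content
        else:
            top.text = (top.text or "") + content

    def endElement(self, name):
        if self.stack:
            elem = self.stack.pop()
            if not self.stack:  # fragment complete: process, then forget
                self.on_fragment(elem)


def order_totals(doc):
    """Return (id, total price) per <order>, holding one order at a time."""
    totals = []

    def on_order(order):
        prices = [float(item.get("price")) for item in order.iter("item")]
        totals.append((order.get("id"), sum(prices)))

    xml.sax.parseString(doc, FragmentBuffer("order", on_order))
    return totals


DOC = (b"<orders>"
       b"<order id='1'><item price='2'/><item price='3'/></order>"
       b"<order id='2'><item price='5'/></order>"
       b"</orders>")
print(order_totals(DOC))
```

Memory use here is bounded by the largest single fragment rather
than by the size of the whole stream.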

It is the balance between the two styles
that generates power.  Too much one way or
the other will lead to inefficient systems.

> The XML Information Set will be purely descriptive: it will
> identify a common set of abstract XML information without
> mandating a single type of processing behaviour or a specific                           
> API for XML-based software.

Good.  So it will *not* require the entire document
to be available in a multi-pass storage mechanism?

> 2. Design Principles
> 
>  1.The XML Information Set shall provide an abstract model
>      for describing the logical structure of a well-formed XML
>      1.0 document (note that all valid XML 1.0 documents are
>      also, by definition, well-formed).

Does this provide an abstract model for describing the logical
structure of a well-formed document SUBSET?

>  5.The XML Information Set shall be designed to be
>      interoperable with the W3C's DOM Level 1
>      Recommendation [DOM] and, as far as possible, with the
>      XPointer Working Draft [XPointer], and with the XSL
>      Working Draft [XSL].

Why not SAX?  Clearly the event-driven nature of an XML
stream is important.  Will the standard support "push"
event-driven systems as well as "pull" object-oriented systems?
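
To make the contrast concrete, here is a rough sketch that reads
the same element names both ways with Python's standard library:
xml.sax stands in for the "push" style (the parser calls the
application), and xml.dom.minidom stands in for the "pull" style
(the application queries a fully built object tree).  The sample
document and the names NameCollector, names_push, and names_pull
are made up for illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = "<log><entry/><entry/><note/></log>"


class NameCollector(xml.sax.ContentHandler):
    """Push style: the parser drives and calls us for each start tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def names_push(doc):
    handler = NameCollector()
    xml.sax.parseString(doc, handler)
    return handler.names


def names_pull(doc):
    """Pull style: build the whole tree first, then ask it questions."""
    tree = minidom.parseString(doc)
    return [node.nodeName for node in tree.getElementsByTagName("*")]


print(names_push(DOC))
print(names_pull(DOC))
```

Both functions see the same information; the difference is only
in who holds the document and who drives the traversal.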

>  3.The XML Information Set shall contain sufficient
>      information for the creation of a well-formed XML
>      document.

Or stream fragment?  Or sub-document?  I'm only harping
because not recognizing the other way to do things
will severely limit the usefulness of the resulting product.

>  4.The XML Information Set shall contain sufficient
>      information to define equivalence for XML documents
>      based on their logical structure.

This I look forward to seeing.  Isomorphism could be
very powerful.  I hope that a multi-pass mechanism
is not required for this feature.  Or, if it is,
I hope the multi-pass requirement is limited to
certain cases of isomorphic forms.
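
As a rough illustration that at least a simple notion of
equivalence needs only one pass, the sketch below compares two
documents event by event using Python's xml.dom.pulldom.  The
definition of "equivalent" used here is purely an assumption made
for the example (same element names, same attributes regardless
of order or quoting, same character data, with comments and
processing instructions ignored); it is not taken from the Note.
The names events and same_structure are likewise hypothetical.

```python
import itertools
from xml.dom import pulldom


def events(doc):
    """Yield normalized events, merging adjacent character chunks."""
    text = []
    for event, node in pulldom.parseString(doc):
        if event == pulldom.CHARACTERS:
            text.append(node.data)
            continue
        if event not in (pulldom.START_ELEMENT, pulldom.END_ELEMENT):
            continue  # comments, PIs, etc. are ignored in this sketch
        if text:
            yield ("text", "".join(text))
            text = []
        if event == pulldom.START_ELEMENT:
            # Sort attributes so their written order does not matter.
            yield ("start", node.nodeName, sorted(node.attributes.items()))
        else:
            yield ("end", node.nodeName)


def same_structure(a, b):
    """Compare two documents in lockstep; stops at the first difference."""
    pairs = itertools.zip_longest(events(a), events(b))
    return all(x == y for x, y in pairs)


print(same_structure('<a x="1" y="2">hi<b/></a>',
                     "<a y='2' x='1'>hi<b></b></a>"))
```

Neither document is ever held in full; only the current event
from each side is in memory at any moment.
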
> 
> 4. References
> 
> DOM
>      W3C (World Wide Web Consortium). Document Object
>      Model (DOM) Level 1 Specification Recommendation.
>      Version 1.0. [Cambridge, MA].
>      http://www.w3.org/TR/REC-DOM-Level-1
> XML
>      W3C (World Wide Web Consortium). Extensible Markup
>      Language (XML) Recommendation. Version 1.0.
>      [Cambridge, MA]. http://www.w3.org/TR/REC-xml
> XSL
>      W3C (World Wide Web Consortium). Extensible Stylesheet
>      Language (XSL) Working Draft. Version 1.0. [Cambridge,
>      MA]. http://www.w3.org/TR/WD-xsl
> XPointer
>      W3C (World Wide Web Consortium). XML Pointer
>      Language (XPointer) Working Draft. [Cambridge, MA].
>      http://www.w3.org/TR/WD-xptr


You forgot SAX and SAXON.  I find it rude not to
take into account this non-W3C standard.  It severely
undervalues David's huge contribution to the XML
community.

Clark Evans
Received on Thursday, 18 February 1999 22:12:52 UTC