RE: Implications of using XPointer for XInclude from Jonathan Marsh on 2002-01-07 (www-xml-xinclude-comments@w3.org from January 2002)

From: Jonathan Marsh <jmarsh@microsoft.com>
Date: Mon, 7 Jan 2002 10:58:06 -0800
To: "Simon St.Laurent" <simonstl@simonstl.com>
Cc: <www-xml-xinclude-comments@w3.org>
Message-ID: <330564469BFEC046B84E591EB3D4D59C0454FF4A@red-msg-08.redmond.corp.microsoft.com>
Sorry for the long delay - we're at last preparing XInclude for CR, and
I found that we had not responded to your comment.

XPointer dependency is a significant matter of debate within the WG as
well as in the industry.  Many agreed that XPointer added significant
processing costs.  But many (often the same people) felt that dropping
XPointers altogether was a loss of crucial functionality.  Ideally,
these issues could be addressed in XPointer but that doesn't seem likely
to happen.

Faced with these choices, the Core WG decided to keep XPointer in the CR
draft, but make XPointer support part of the exit criteria.  At the same
time we will solicit implementer feedback on a useful subset of XPointer.
If the worries about XPointer are unfounded, we can keep it.  If they
prove to be well founded, we will have to revisit this issue.

Although your comment does not appear to be strictly implementer
feedback, we will consider it as part of our CR phase.

Thank you,
Jonathan Marsh

> -----Original Message-----
> From: Simon St.Laurent [mailto:simonstl@simonstl.com]
> Sent: Wednesday, August 22, 2001 5:51 AM
> To: xml-dev@lists.xml.org
> Cc: www-xml-linking-comments@w3.org; www-xml-xinclude-comments@w3.org
> Subject: Implications of using XPointer for XInclude
> 
> [Apologies for the cross-post.  This discussion started on xml-dev, but
> has clear relevance to www-xml-linking-comments for XPointer and
> www-xml-xinclude-comments for XInclude.]
> 
> The use of XPointer [1] by XML Inclusions (XInclude) [2] has some
> processing implications which substantially increase the cost
> (development, CPU cycles, memory) of a conformant implementation of
> XInclude.
> 
> Section 4.2 of the XInclude spec [3] states that:
> >When parsing as XML, the fragment part of the URI reference is
> >interpreted as an XPointer [XPointer], regardless of the media type of
> >the resource. The XPointer indicates a subresource as the target for
> >inclusion.
> 
> Section 4.2 then goes on at length regarding the results, legal and
> illegal, of various kinds of XPointer processing and how they should or
> should not be included in the document.  Multiple node responses and
> ranges are explicitly legal.
> 
> While XInclude seems quite capable in a processing environment where
> full XPointer support is provided, the nature of that environment and
> some of the situations that environment will have to handle are worth
> questioning.
> 
> Because "XPointer is built on top of the XML Path Language[4],"
> XPointer includes all of XPath and then some.  Unlike the use of XPath
> in W3C XML Schema [5], there is no restriction on the XPath expressions
> or axes supported.
> 
> As a result, XPointers (and hence XInclude expressions) can include
> XPaths which reference, for instance, the preceding or
> preceding-sibling axes.  The use of these axes requires tree-building
> processing, as they cannot be reliably processed in a stream-oriented
> environment.
> 
> Stream-processing has one substantial advantage over tree-building: a
> considerably smaller memory footprint.  The 'classic' example of Jon
> Bosak's Old Testament XML file [6], which is 3.3 MB of structured text,
> no longer containing the chapter and verse information explicitly, is
> both a large document and one from which people may reasonably choose to
> cite passages.  As documents tend to grow when stored in object trees,
> having to process this document _as a tree_ in order to extract
> fragments from it could be a very substantial burden, even if the
> document is stored locally.
> 
> Processing environments which implement XInclude fully, even if they are
> themselves capable of working in a stream-based environment [7], are
> going to have to deal with this potential for tree-building.  There are
> a few possible strategies:
> 
> 1) Implement XInclude on top of complete (or "nearly complete") XPointer
> support and accept the tree-building expense. (appears to be the current
> approach of libxml [8].)
> 
> 2) Implement a subset of XPointer on something like the subset defined
> by W3C XML Schema Structures [5], supporting only the child and/or
> attribute axes and possibly though not necessarily the string-based
> capabilities.  Using that subset, apply stream-processing to XIncluded
> documents and include the portions needed without building trees.
> 
> 3) Use a mixture of strategies 1 and 2, analyzing all XPointers to
> determine which axes are used and only building the tree (or even a
> subset of the tree) if necessary.  Reduces memory impact at the cost of
> program complexity and redundancy.
> 
> 4) Subset the specification so as to ignore fragment identifiers
> (appears to be the current approach of [7]).
> 
> While a 3K wrapper including a verse from Leviticus in the Old Testament
> (via XInclude) may seem like something of an edge case, I have a
> difficult time describing it as unreasonable or unlikely.
> 
> These problems are not shared by FIXptr[9], which is effectively a
> conservative version of strategy 2.  A similar approach could be built
> on the XPath subset defined in W3C XML Schema Structures [5].
> 
> Similar issues regarding XInclude's use of XPointer appear to have been
> rejected by the XLink WG [10] ("the WG was unwilling to give up XPointer
> support"), but I would hope that processing considerations might reopen
> that discussion.
> 
> [1] - http://www.w3.org/TR/xptr
> [2] - http://www.w3.org/TR/xinclude
> [3] - http://www.w3.org/TR/xinclude/#xml-included-items
> [4] - http://www.w3.org/TR/xpath
> [5] - http://www.w3.org/TR/xmlschema-1/#coss-identity-constraint
> [6] - archived in http://metalab.unc.edu/bosak/xml/eg/rel200.zip
> [7] - http://www.ibiblio.org/xml/XInclude/
> [8] - http://xmlsoft.org/
> [9] - http://lists.w3.org/Archives/Public/www-xml-linking-comments/2001AprJun/att-0074/01-NOTE-FIXptr-20010425.htm
> [10] - http://lists.w3.org/Archives/Public/www-xml-xinclude-comments/2001Aug/0004.html
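
Strategy 3 in the quoted message (analyze each XPointer to determine
which axes it uses, and build a tree only if necessary) might be
sketched roughly as follows.  This is a minimal illustration under
stated assumptions, not taken from any implementation mentioned in this
thread: the names STREAMABLE_AXES, axes_used and needs_tree are
hypothetical, the streamable set is limited to the child and attribute
axes named in strategy 2, and the regex-based scan does not parse XPath
properly (for example, "::" or ".." inside a string literal would
mislead it).

```python
import re

# Assumption: only the child and attribute axes (the subset named in
# strategy 2) are treated as safe for stream processing.
STREAMABLE_AXES = {"child", "attribute"}

# Matches explicit axis specifiers such as "preceding-sibling::".
AXIS_RE = re.compile(r"([a-z-]+)::")


def axes_used(xpath):
    """Return the set of axes that appear in an XPath expression.

    Steps written without an axis use the child axis, which is already
    streamable, so they contribute nothing to the result.
    """
    axes = set(AXIS_RE.findall(xpath))
    # Abbreviated syntax: "//" is descendant-or-self, ".." is parent,
    # "@" is attribute.
    if "//" in xpath:
        axes.add("descendant-or-self")
    if ".." in xpath:
        axes.add("parent")
    if "@" in xpath:
        axes.add("attribute")
    return axes


def needs_tree(xpath):
    """True if the expression uses any axis outside the streamable set."""
    return not axes_used(xpath) <= STREAMABLE_AXES


if __name__ == "__main__":
    for expr in ("/book/chapter[2]/@title",
                 "chapter/preceding-sibling::verse"):
        print(expr, "->", "tree" if needs_tree(expr) else "stream")
```

The check is deliberately conservative: anything it does not recognize
as child or attribute falls back to tree building, which matches the
"cost of program complexity" trade-off described for strategy 3.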
Received on Monday, 7 January 2002 13:58:38 UTC