Re: The effect on sequences containing whitespace text nodes, was: RE: [Bug 29692] xsl:strip-space and packages

The situation I'm thinking of is where you have a document

<doc>
  <a>  </a>
  <b>  </b>
</doc>

with <xsl:strip-space elements="a"/>

and the initial match selection being the two nodes selected by /doc/*/node().

What is the result of the following stylesheet?

<xsl:template match="node()">
  <xsl:value-of select="position()"/>
</xsl:template>

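I think it is "1" (not "1 2"), because the whitespace text node in <a> is stripped and therefore removed from the initial match selection.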

It's a ridiculous edge case that will never occur in practice, but something has to happen.
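
Put together, the case above might look like the following. This is only a sketch: the xsl:stylesheet wrapper and its version attribute are assumptions added for illustration. The initial match selection cannot be written in the stylesheet itself, so it is assumed to be supplied as /doc/*/node() through whatever invocation API the processor offers.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Whitespace-only text nodes are stripped from <a> but not from <b>. -->
  <xsl:strip-space elements="a"/>

  <!-- Applied to each node in the initial match selection.
       Under the proposed rule, a stripped node is also removed from that
       selection, so position() counts only the nodes that survive. -->
  <xsl:template match="node()">
    <xsl:value-of select="position()"/>
  </xsl:template>

</xsl:stylesheet>
```

Under the proposed rule, the text node child of <a> would be stripped and dropped from the selection, leaving only the text node child of <b> to be processed, at position 1.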

Michael Kay
Saxonica


> On 24 Jun 2016, at 18:01, Abel Braaksma <abel.braaksma@xs4all.nl> wrote:
> 
> Taking this discussion offline first, to prevent cluttering the bug entry too much, and with the intent to gather some more understanding of the issue.
> 
> I'm curious about stripping spaces/annotations from input sequences when these sequences are the initial match selection. It is my understanding that stripping spaces only applies to elements. 
> 
> Then you proposed the following:
>> <quote>
>> If the stripping process strips a whitespace text node that is present in the
>> sequence provided as the initial match selection, or in the sequence that
>> forms the result of the collection function, then the relevant node is
>> removed from this sequence.
>> </quote>
> 
> This seems to suggest that "the node is removed from the sequence". I don't see how that can happen.
> 
> 1) if the sequence contains a whitespace-only text node, the NameTest will never match against this node
> 2) if the sequence contains a document node containing a whitespace-only text node child, the NameTest will never match against this node
> 3) the text seems to suggest that if any ws-only text node (however deep) is removed from a node in the supplied sequence, the whole node is removed
> 
> Proposal:
> Remove this part: I don't think it can happen in practice (or can it?), and it may confuse readers. Or replace it with something like:
> 
> <quote>
> Note:
> If the sequence provided as the initial match selection contains items that are single whitespace text nodes, these nodes are not affected by this process. Document nodes that contain a single whitespace text node child will not be stripped either. This is a consequence of the mapping process: a NameTest can never match a single text node or the document node. Furthermore, stripping space or annotations only applies to elements.
> </quote>
> 
> Unless of course we want to give users freedom to remove text-nodes in a supplied sequence if such text-nodes are whitespace-only. But then we should change the requirement of NameTest and make it a NodeTest.
> 
> Cheers,
> Abel
> 
> 
>> -----Original Message-----
>> From: bugzilla@jessica.w3.org [mailto:bugzilla@jessica.w3.org]
>> Sent: Friday, June 24, 2016 4:04 PM
>> To: public-qt-comments@w3.org
>> Subject: [Bug 29692] xsl:strip-space and packages
>> 
>> https://www.w3.org/Bugs/Public/show_bug.cgi?id=29692
>> 
>> --- Comment #5 from Michael Kay <mike@saxonica.com> --- I propose the
>> following changes:
>> 
>> 1. Merge sections 4.4 and 4.5 into a single section (4.4 Preprocessing Source
>> Documents) with the current sections as subsections, and put shared
>> material in the introduction to this new section. Specifically:
>> 
>> <quote>
>> Source documents supplied as input to a transformation may be subject to
>> preprocessing. Two kinds of preprocessing are defined: stripping of type
>> annotations (see 4.4.1), and stripping of whitespace text nodes (see 4.4.2).
>> 
>> Stripping of type annotations happens before stripping of whitespace text
>> nodes.
>> 
>> The source documents to which this applies are as follows:
>> 
>> * The document containing the global context item if it is a node
>> 
>> * Any documents containing nodes present in the initial match selection
>> 
>> * Any document containing a node that is returned by the functions
>> document, doc, or collection
>> 
>> * Any document read using xsl:stream.
>> 
>> Note: this list excludes documents passed as the values of stylesheet
>> parameters or parameters of the initial template or function, trees created
>> by functions such as parse-xml, parse-xml-fragment, analyze-string,
>> or json-to-xml, and values returned from extension functions.
>> 
>> If a node other than a document node is supplied (for example as the global
>> context item), then the preprocessing is applied to the entire document
>> containing that node. If several nodes within the same document are
>> supplied (for example as nodes in the initial match selection, or as nodes
>> returned by the collection function), then the preprocessing is only applied
>> to that document once.
>> 
>> The rules determining whether or not stripping of annotations and/or
>> whitespace happens are defined at the level of a package. Declarations
>> within a library package only affect the handling of documents loaded using a
>> call on the document, doc, or collection functions or an evaluation
>> of an xsl:stream instruction appearing lexically within the same package.
>> Declarations within the top-level package also affect the processing of the
>> global context item and the initial match selection.
>> 
>> The semantics of the doc, document, and collection functions are formally
>> defined in terms of mappings from URIs to document nodes maintained
>> within the dynamic context. The effect of the declarations that control
>> stripping of type annotations and whitespace is therefore to modify this
>> mapping (so it now maps the URI to a stripped document). The modification
>> applies to the dynamic context for calls to these functions appearing within a
>> particular package; each package therefore has a different set of mappings.
>> This means that when two calls to the doc function appear in different
>> packages, specifying the same absolute URI, then in general different
>> documents are returned. An implementation MAY return the same
>> document if it is able to determine that the effect of the annotation and
>> whitespace stripping rules in both packages is the same.
>> 
>> The effect of dynamic calls to the doc, document, and collection functions is
>> defined in the same way as for other functions with dependencies on the
>> dynamic context. As described in 5.3.4, named function references (such as
>> doc#1) and calls on function-lookup (for example,
>> function-lookup("doc", 1)) are defined to retain the XPath static and dynamic context
>> at the point of invocation as part of the closure of the resulting function item,
>> and to use this preserved context when a dynamic function call is
>> subsequently made using the function item.
>> 
>> </quote>
>> 
>> 2. In 4.4, delete this paragraph:
>> 
>> <quote>
>> The source trees to which this applies are the same as those affected by
>> xsl:strip-space and xsl:preserve-space: see 4.5 Stripping Whitespace from a
>> Source Tree. As with whitespace stripping, the rules for stripping of type
>> annotations may vary from one package to another, and have the effect of
>> modifying the mapping from URIs to document nodes defined in the XPath
>> dynamic context; this means that two calls to the doc function (for
>> example) supplying the same URI may produce different document nodes if
>> the calls appear in different packages.
>> </quote>
>> 
>> 3. In 4.5, delete the following paragraphs:
>> 
>> <quote>
>> For the purposes of this section, the term source tree means the document
>> containing the global context item if it is a node, any documents containing
>> nodes present in the initial match selection, any document returned by the
>> functions document, doc, or collection, and any document read
>> using xsl:stream. It does not include documents passed as the values of
>> stylesheet parameters or parameters of the initial template or function,
>> trees created by functions such as parse-xml, parse-xml-fragment,
>> analyze-string, or json-to-xml, nor values returned from extension
>> functions.
>> 
>> Each source tree is associated with a package: the relevant package for the
>> global context item is the top-level package; the relevant package for a call
>> on document, doc, or collection is the package in which that call
>> appears; and the relevant package for evaluation of xsl:stream is the package
>> in which that instruction appears.
>> </quote>
>> 
>> Change the following paragraph:
>> <quote>
>> Formally, the stripping process modifies the mapping from URIs to document
>> nodes defined in the XPath dynamic context. This mapping can therefore
>> vary from one package to another. The mapping that applies to a particular
>> call on document, doc, or collection, or a particular evaluation of
>> xsl:stream, is affected by the xsl:strip-space and xsl:preserve-space
>> declarations within the package in which that construct appears. This means
>> that two calls on the doc function (for example) may return different nodes
>> if the calls appear in different packages.
>> </quote>
>> 
>> to
>> <quote>
>> The stripping process that applies for a particular package is determined by
>> the xsl:strip-space and xsl:preserve-space declarations within that package.
>> </quote>
>> 
>> After
>> <quote>
>> The xml:space attributes are not removed from the tree.
>> </quote>
>> 
>> add:
>> <quote>
>> If the stripping process strips a whitespace text node that is present in the
>> sequence provided as the initial match selection, or in the sequence that
>> forms the result of the collection function, then the relevant node is
>> removed from this sequence.
>> </quote>
>> 
>> Delete the following paragraph:
>> <quote>
>> The effect of xsl:strip-space and xsl:preserve-space is local to the package in
>> which they appear. Declarations within a library package only affect the
>> handling of documents loaded using a call on the document, doc, or
>> collection functions or an evaluation of an xsl:stream instruction
>> appearing lexically within the same package. Declarations within the top-level
>> package also affect the processing of the main input document.
>> </quote>
>> 
>> --
>> You are receiving this mail because:
>> You are the QA Contact for the bug.
> 
> 

Received on Saturday, 25 June 2016 07:36:39 UTC