RE: The effect on sequences containing whitespace text nodes, was: RE: [Bug 29692] xsl:strip-space and packages

Interesting...

It honestly never occurred to me that the nodes in the initial match selection do not have to be rooted nodes. It looks like we recognized this issue earlier, as we currently say:

"For the purposes of this section, the term source tree means the document containing the global context item if it is a node, any documents containing nodes present in the initial match selection,..." 

I agree with your assessment of this (albeit rather absurd) edge case, and that the only sensible thing to do is to remove such nodes from the sequence or, for the global context item (GCI), to set it to the empty sequence.

You wrote:
> <quote>
> If the stripping process strips a whitespace text node that is present in the
> sequence provided as the initial match selection, or in the sequence that
> forms the result of the collection function, then the relevant node is
> removed from this sequence.
> </quote>

I'm still struggling with "text node that is present in the sequence". I am worried about the sentence possibly being interpreted as "at any depth, in a node, that is present in the sequence". I know we don't mean that.

New suggestion:

<proposal>
If the stripping process strips a whitespace text node that is itself an item (rather than a descendant of an item) in the initial match selection or in the sequence that forms the result of fn:collection, then that node is removed from the sequence. If such a whitespace text node is the global context item, then the global context item is set to the empty sequence.

Note:
This does not cause either the initial match selection or the global context item to become *absent*; instead, it may result in either of them being the empty sequence.
</proposal>

This would (hopefully) fix the perceived ambiguity. It also adds the GCI to the sentence, plus a Note that is meant to remove any suggestion that whitespace stripping could cause an error to be thrown. That is, an apply-templates invocation should not raise an error because of an absent initial match selection (IMS) or GCI; it may, however, result in no matches at all.
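
To make the intended reading concrete, here is a rough sketch in Python (not XSLT), using the standard xml.dom.minidom module as a stand-in for the data model. The names strip_space and apply_proposed_rule are made up for illustration; they only model the behaviour described in the proposal above, under the assumption that stripping detaches the text node from its parent, and a real processor may well work differently.

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"


def strip_space(document, names):
    """Detach whitespace-only text children of the named elements; return them."""
    stripped = []
    for name in names:
        for elem in document.getElementsByTagName(name):
            for child in list(elem.childNodes):
                is_text = child.nodeType == child.TEXT_NODE
                if is_text and child.data.strip(XML_WHITESPACE) == "":
                    elem.removeChild(child)
                    stripped.append(child)
    return stripped


def apply_proposed_rule(sequence, stripped):
    """Drop items that ARE stripped nodes; items that merely contain one stay."""
    return [item for item in sequence if not any(item is s for s in stripped)]


document = minidom.parseString("<doc><a>  </a><b>  </b></doc>")
a = document.getElementsByTagName("a")[0]
b = document.getElementsByTagName("b")[0]
a_ws, b_ws = a.firstChild, b.firstChild

# Models <xsl:strip-space elements="a"/>.
stripped = strip_space(document, ["a"])

# An item that is itself a stripped text node is removed from the sequence.
text_items = apply_proposed_rule([a_ws, b_ws], stripped)
# An element whose child was stripped is still an item of the sequence.
element_items = apply_proposed_rule([a, b], stripped)
# A stripped node supplied as the GCI leaves the empty sequence, not "absent".
context_item = apply_proposed_rule([a_ws], stripped)

print(len(text_items), len(element_items), len(context_item))  # prints: 1 2 0
```

The point of the second case is that [a, b] keeps both items even though a lost its only child, which is the reading the "itself an item" wording is trying to pin down.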

I wouldn't mind an additional Note that explains this with your example below.
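
For what it's worth, the example quoted below can be modelled outside an XSLT processor. This is only a sketch in Python with xml.dom.minidom, assuming that the initial match selection is evaluated first and that stripped nodes are then removed from it; select_child_nodes and strip_space are hypothetical helpers, and the list of positions stands in for what position() would report in the template.

```python
from xml.dom import minidom

SOURCE = "<doc>\n  <a>  </a>\n  <b>  </b>\n</doc>"


def select_child_nodes(document):
    """Model of /doc/*/node(): all child nodes of doc's element children."""
    return [
        node
        for elem in document.documentElement.childNodes
        if elem.nodeType == elem.ELEMENT_NODE
        for node in elem.childNodes
    ]


def strip_space(document, names):
    """Detach whitespace-only text children of the named elements; return them."""
    stripped = []
    for name in names:
        for elem in document.getElementsByTagName(name):
            for child in list(elem.childNodes):
                is_text = child.nodeType == child.TEXT_NODE
                if is_text and child.data.strip(" \t\r\n") == "":
                    elem.removeChild(child)
                    stripped.append(child)
    return stripped


document = minidom.parseString(SOURCE)
selection = select_child_nodes(document)   # the two whitespace text nodes
stripped = strip_space(document, ["a"])    # models xsl:strip-space elements="a"

# Stripped nodes are removed from the selection before templates are applied.
remaining = [n for n in selection if not any(n is s for s in stripped)]

# One entry per item, as position() would report it in match="node()".
positions = [index for index, _ in enumerate(remaining, start=1)]
print(positions)  # prints: [1]
```

Under these assumptions only the text node inside b survives, so the template fires once with position 1, which matches the "1" (not "1 2") answer.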

Thanks,
Abel



> -----Original Message-----
> From: Michael Kay [mailto:mike@saxonica.com]
> Sent: Saturday, June 25, 2016 9:36 AM
> To: Abel Braaksma
> Cc: Public XSLWG
> Subject: Re: The effect on sequences containing whitespace text nodes, was:
> RE: [Bug 29692] xsl:strip-space and packages
> 
> The situation I'm thinking of is where you have a document
> 
> <doc>
>   <a>  </a>
>   <b>  </b>
> </doc>
> 
> with <xsl:strip-space elements="a"/>
> 
> and initial match selection being the two nodes selected by /doc/*/node()
> 
> What is the result of the stylesheet
> 
> <xsl:template match="node()">
>   <xsl:value-of select="position()"/>
> </xsl:template>
> 
> I think it is "1" (not "1 2")
> 
> It's a ridiculous edge case that will never occur in practice, but something has
> to happen.
> 
> Michael Kay
> Saxonica
> 
> 
> > On 24 Jun 2016, at 18:01, Abel Braaksma <abel.braaksma@xs4all.nl> wrote:
> >
> > Taking this discussion first offline, to prevent cluttering the bug entry too
> much, and with the intent to gather some more understanding of the issue.
> >
> > I'm curious about stripping spaces/annotations from input sequences when
> these sequences are the initial match selection. It is my understanding that
> stripping spaces only applies to elements.
> >
> > Then you proposed the following:
> >> <quote>
> >> If the stripping process strips a whitespace text node that is present in the
> >> sequence provided as the initial match selection, or in the sequence that
> >> forms the result of the collection function, then the relevant node is
> >> removed from this sequence.
> >> </quote>
> >
> > This seems to suggest that "the node is removed from the sequence". I
> don't see how that can happen.
> >
> > 1) if the sequence contains a whitespace-only text node, the NameTest will
> never match against this node
> > 2) if the sequence contains a document node containing a whitespace-only
> text node child, the NameTest will never match against this node
> > 3) the text seems to suggest that if any ws-only text node (however deep)
> is removed from a node in the supplied sequence, the whole node is
> removed
> >
> > Proposal:
> > Remove this part, I don't think it can happen in practice (or can it?) and it
> may be confusing to readers. Or replace it with something like:
> >
> > <quote>
> > Note:
> > If the sequence provided as the initial match selection contains items that
> are single whitespace text nodes, these nodes are not affected by this
> process. Document nodes that contain a single whitespace text node child
> will not be stripped either. This is a consequence of the mapping process: a
> NameTest can never match a single text node or the document node.
> Furthermore, stripping space or annotations only applies to elements.
> > </quote>
> >
> > Unless of course we want to give users freedom to remove text-nodes in a
> supplied sequence if such text-nodes are whitespace-only. But then we
> should change the requirement of NameTest and make it a NodeTest.
> >
> > Cheers,
> > Abel
> >
> >
> >> -----Original Message-----
> >> From: bugzilla@jessica.w3.org [mailto:bugzilla@jessica.w3.org]
> >> Sent: Friday, June 24, 2016 4:04 PM
> >> To: public-qt-comments@w3.org
> >> Subject: [Bug 29692] xsl:strip-space and packages
> >>
> >> https://www.w3.org/Bugs/Public/show_bug.cgi?id=29692
> >>
> >> --- Comment #5 from Michael Kay <mike@saxonica.com> --- I propose
> the
> >> following changes:
> >>
> >> 1. Merge sections 4.4 and 4.5 into a single section (4.4 Preprocessing
> Source
> >> Documents) with the current sections as subsections, and put shared
> >> material in the introduction to this new section. Specifically:
> >>
> >> <quote>
> >> Source documents supplied as input to a transformation may be subject
> to
> >> preprocessing. Two kinds of preprocessing are defined: stripping of type
> >> annotations (see 4.4.1), and stripping of whitespace text nodes (see
> 4.4.2).
> >>
> >> Stripping of type annotations happens before stripping of whitespace text
> >> nodes.
> >>
> >> The source documents to which this applies are as follows:
> >>
> >> * The document containing the global context item if it is a node
> >>
> >> * Any documents containing nodes present in the initial match selection
> >>
> >> * Any document containing a node that is returned by the functions
> >> document, docFO30, or collectionFO30
> >>
> >> * Any document read using xsl:stream.
> >>
> >> Note: this list excludes documents passed as the values of stylesheet
> >> parameters or parameters of the initial template or function, trees
> created
> >> by functions such as parse-xmlFO30, parse-xml-fragment, analyze-
> >> stringFO30, or json-to-xml, and values returned from extension functions.
> >>
> >> If a node other than a document node is supplied (for example as the
> global
> >> context item), then the preprocessing is applied to the entire document
> >> containing that node. If several nodes within the same document are
> >> supplied (for example as nodes in the initial match selection, or as nodes
> >> returned by the collection function), then the preprocessing is only
> applied
> >> to that document once.
> >>
> >> The rules determining whether or not stripping of annotations and/or
> >> whitespace happens are defined at the level of a package. Declarations
> >> within a library package only affect the handling of documents loaded
> using a
> >> call on the document, docFO30, or collectionFO30 functions or an
> evaluation
> >> of an xsl:stream instruction appearing lexically within the same package.
> >> Declarations within the top-level package also affect the processing of the
> >> global context item and the initial match selection.
> >>
> >> The semantics of the doc, document, and collection functions are formally
> >> defined in terms of mappings from URIs to document nodes maintained
> >> within the dynamic context. The effect of the declarations that control
> >> stripping of type annotations and whitespace is therefore to modify this
> >> mapping (so it now maps the URI to a stripped document). The
> modification
> >> applies to the dynamic context for calls to these functions appearing within
> a
> >> particular package; each package therefore has a different set of
> mappings.
> >> This means that when two calls to the doc function appear in different
> >> packages, specifying the same absolute URI, then in general different
> >> documents are returned. An implementation MAY return the same
> >> document if it is able to determine that the effect of the annotation and
> >> whitespace stripping rules in both packages is the same.
> >>
> >> The effect of dynamic calls to the doc, document, and collection functions
> is
> >> defined in the same way as for other functions with dependencies on the
> >> dynamic context. As described in 5.3.4, named function references (such
> as
> >> doc#1) and calls on function-lookupFO30 (for example, function-
> >> lookup("doc", 1)) are defined to retain the XPath static and dynamic
> context
> >> at the point of invocation as part of the closure of the resulting function
> item,
> >> and to use this preserved context when a dynamic function call is
> >> subsequently made using the function item.
> >>
> >> </quote>
> >>
> >> 2. In 4.4, delete this paragraph:
> >>
> >> <quote>
> >> The source trees to which this applies are the same as those affected by
> >> xsl:strip-space and xsl:preserve-space: see 4.5 Stripping Whitespace from
> a
> >> Source Tree. As with whitespace stripping, the rules for stripping of type
> >> annotations may vary from one package to another, and have the effect
> of
> >> modifying the mapping from URIs to document nodes defined in the
> XPath
> >> dynamic context; this means that two calls to the docFO30 function (for
> >> example) supplying the same URI may produce different document
> nodes if
> >> the calls appear in different packages.
> >> </quote>
> >>
> >> 3. In 4.5, delete the following paragraphs:
> >>
> >> <quote>
> >> For the purposes of this section, the term source tree means the
> document
> >> containing the global context item if it is a node, any documents
> containing
> >> nodes present in the initial match selection, any document returned by
> the
> >> functions document, docFO30, or collectionFO30, and any document read
> >> using xsl:stream. It does not include documents passed as the values of
> >> stylesheet parameters or parameters of the initial template or function,
> >> trees created by functions such as parse-xmlFO30, parse-xml-fragment,
> >> analyze-stringFO30, or json-to-xml, nor values returned from extension
> >> functions.
> >>
> >> Each source tree is associated with a package: the relevant package for
> the
> >> global context item is the top-level package; the relevant package for a
> call
> >> on document, docFO30, or collectionFO30; is the package in which that call
> >> appears; and the relevant package for evaluation of xsl:stream is the
> package
> >> in which that instruction appears.
> >> </quote>
> >>
> >> Change the following paragraph:
> >> <quote>
> >> Formally, the stripping process modifies the mapping from URIs to
> document
> >> nodes defined in the XPath dynamic context. This mapping can therefore
> >> vary from one package to another. The mapping that applies to a
> particular
> >> call on document, docFO30, or collectionFO30, or a particular evaluation of
> >> xsl:stream, is affected by the xsl:strip-space and xsl:preserve-space
> >> declarations within the package in which that construct appears. This
> means
> >> that two calls on the
> >> docFO30 function (for example) may return different nodes if the calls
> >> appear in different packages.
> >> </quote>
> >>
> >> to
> >> <quote>
> >> The stripping process that applies for a particular package is determined
> by
> >> the xsl:strip-space and xsl:preserve-space declarations within that
> package.
> >> </quote>
> >>
> >> After
> >> <quote>
> >> The xml:space attributes are not removed from the tree.
> >> </quote>
> >>
> >> add:
> >> <quote>
> >> If the stripping process strips a whitespace text node that is present in the
> >> sequence provided as the initial match selection, or in the sequence that
> >> forms the result of the collection function, then the relevant node is
> >> removed from this sequence.
> >> </quote>
> >>
> >> Delete the following paragraph:
> >> <quote>
> >> The effect of xsl:strip-space and xsl:preserve-space is local to the package
> in
> >> which they appear. Declarations within a library package only affect the
> >> handling of documents loaded using a call on the document, docFO30, or
> >> collectionFO30 functions or an evaluation of an xsl:stream instruction
> >> appearing lexically within the same package. Declarations within the top-
> level
> >> package also affect the processing of the main input document.
> >> </quote>
> >>
> >> --
> >> You are receiving this mail because:
> >> You are the QA Contact for the bug.
> >
> >
> 

Received on Saturday, 25 June 2016 12:17:18 UTC