Re: Static streamability, regarding bug 29984 from C. M. Sperberg-McQueen on 2016-12-08 (public-xsl-wg@w3.org from December 2016)

From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
Date: Wed, 7 Dec 2016 18:51:28 -0700
To: public-xsl-wg@w3.org
Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Message-Id: <9DBBB801-EB5B-43BE-AF61-1EB1B62A42C4@blackmesatech.com>
[Apologies for the length of this mail.  I have now spent so much
time drafting it that I lack the time to cut it back down to reasonable
size.]

On 1 December 2016, Abel Braaksma wrote:

> During today's meeting we discussed Bug 29984 which is about a
> "feature at risk" in the latest public CR, namely the ability to
> enforce strict raising of XTSE3430 even in cases where a construct
> is obviously streamable, but our rules say it isn't.

It's a minor point, but it's not clear to me that any constructs at
all are "obviously" streamable.  I think what is important for your
argument is that we believe that they are streamable in fact.

A second point, not a minor one, also needs to be made. Our rules do
not say that any construct is not streamable.  They distinguish
between things that are guaranteed streamable and things that are not.
The only inference licensed about streamability is that any construct
guaranteed streamable under our rules is (we think) streamable.  The
inverse inference is unsound.

> After the discussion, MSM wrote the following thoughts in IRC:

>     [...] But if we are going to remove the interoperability rule so
>     thoroughly, why are we defining a concept of guaranteed-streamable
>     in the first place? If that definition has become pointless, as
>     well as far more complex than I wish it were, then why not forget
>     it entirely?

> We didn't address this comment and concern in depth, but I would like
> to try to respond to it and hope to remove some of the concerns
> raised.

> We currently have two sets:

> (1) the set of guaranteed-streamable constructs
> (2) the set of non-guaranteed-streamable constructs

So far, so good.  In the interests of finding common ground, let me
state my beliefe that 

  (a) sets (1) and (2) are disjoint

and that 

  (b) every construct in XSLT is in either set (1) or set (2),
      i.e. the union of sets (1) and (2) is the set of constructs in
      XSLT.

If these are not true statements, please instruct me.

> The current rules are:

> A) If any construct falls within set (1), it is *always*
>    guaranteed-streamable, regardless of processor.

I would go slightly further:  it’s not just that every processor
will agree that it falls into the class “guaranteed streamable”,
but also that every conformant streaming processor is
guaranteed to process it in a streaming way.

> B) If any construct falls within set (2), some processor *may* be able
>    to process it in a streamable way

> The issue in Bug 29984 defines a third set, by two definitions:

> 3a) the set of constructs that by static rewriting is
>     guaranteed-streamable
> 3b) the set of constructs that by static analysis never accesses a
>     streamed node

> In the bug report Michael Kay showed an example of a construct that is
> not an expression but is still trivially streamable, yet not by our
> rules unless we allowed (3a). 

I think you mean it is trivially streamable, but not guaranteed
streamable under our rules.  Unless we have changed our design
radically while I was not looking, our rules never say that something
is not streamable.

> The rule of (3b) is currently in our
> spec in some places (i.e. on axis steps). An (extreme) example is
> (foo, bar)[0], which always returns the empty sequence.

> If we were to accept either (or both) these rules we can say:

> C) If any construct falls within set (3a) or (3b) it is *always*
> guaranteed streamable, but it is processor-dependent whether this is
> detected

I think that's true; we could say that.

The problem is that in that case I no longer see the point of defining
the class of guaranteed streamable constructs.

What kind of "guarantee" is it, if processors are not guaranteed to
stream the construct in question?

One important consequence of the current rules, from my point of view
(others may have had their own reasons for favoring them), is that
they ensure that if I as a user have access to a single conformant
streaming processor, I can write stylesheets which will be streamed by
any and every conformant streaming processor.  This has two advantages
for me as a user.  

  - The minor advantage is that if my production processor has poor
    error messages (in general, or for a given problem), I can run the
    stylesheet through a different conforming streaming processor and
    get different (and perhaps more informative) error messages or
    diagnostics.  Even if the messages are not better in isolation,
    having two different messages concerning the same problem can in
    my experience make it easier to understand what the problem is.
    (I routinely use Saxon and xsltproc to debug XSLT 1.0 stylesheets
    which are to run in Web browsers, since I often find browser error
    messages unhelpful.  I expect the same technique to be helpful
    when trying to make a recalcitrant stylesheet stream properly.)

  - The major advantage is that it helps protect me and the users of
    my stylesheets against dependency on a single vendor or supplier.
    My first serious computing work was on IBM mainframes; while I was
    and remain a great fan of some of the environments I worked in, I
    also know what vendor lock-in feels like and looks like from the
    inside.  One of the reasons I became interested in standards work
    was that it seemed to me to offer a way to reduce the risk of
    lock-in.

One spec which I think did an excellent job of supporting
interoperability without making vendor extension and adaptation
impossible was the spec for Pascal.  The Pascal Report says in section
14 ("Compliance with ISO 7185"):

    A processor complies with the standard if it satisfies all of the
    following conditions:
  
    (a) It accepts all features of the language as they are defined in
    the standard. ...

    ...

    (e) It is able to process as an error any use of an extension or
    of an implementation-defined feature.

    ...

(Please note in passing that I am not equally enamored of the Pascal
Report's rules on error handling, and I hold no brief for conformant
arrays.)

From a spec-drafting point of view, I admire the simple,
straightforward formulation of (e) here.  That rule serves an
important purpose for any user of Pascal with a conforming compiler:
it allows the user to write programs that do not depend on extensions
or implementation-defined features.  And it is a similar purpose that
I hoped to serve by supporting the definition in our spec of a
"guaranteed streamable" subset of our language and the requirement
that all processors be able to report non-guaranteed-streamable
constructs to the user: the user should be able to write streaming
stylesheets that do not depend on vendor extensions or
implementation-defined or -dependent behavior or features.

> Therefore, this technically falls within the "may be able to stream"
> of (B), but the only difference being we don't require the processor
> to have a user option to raise an error in this case.

I believe I understand what is proposed here.

I believe I understand why you find it attractive: it allows you to be
as aggressive as you like with rewriting and other optimizations, and
to apply the rules of guaranteed streamable to the rewritten code (or
to an abstract syntax tree that has been restructured, which for my
purposes amounts to the same thing), without having to keep track of
the form taken by a construct before it was rewritten.  

I believe that having to apply the rules of guaranteed streamability
to expressions in their un-rewritten form makes the task of
implementing a streaming processor harder (but see questions below),
and I can imagine that the gain may seem very small, so the
cost/benefit ratio seems too high.

However, it seems to me as a WG member that what we have here is a
straightforward tradeoff between the interests of implementors in
simpler implementation (I won't mention lock-in, but I cannot
recollect ever encountering an implementor who was adversely affected
commercially or psychologically by having users who could not move to
a different implementation) and the interests of users in portability.

If we keep the status quo, then we impose a burden on all implementors
of conforming processors, in order to preserve for users the ability
to avoid dependence on a specific implementation.

If we adopt the change proposed (as I understand it), we relieve
implementors of that burden, and in exchange we abandon the goal of
ensuring that users with access to a conforming streaming processor
will be able to use that conforming processor to check whether their
streaming stylesheets will stream in every (conforming) streaming
processor.

In that tradeoff, I am unapologetically on the side of the user.  I do
not believe that users are well served by invisible dependencies on
specific implementations.

The third interested party here is the reader of the specification. If
we adopt the change proposed, then the normative role of guaranteed
streamability is undercut: conforming processors are no longer
required to patrol the boundary between guaranteed-streamable
constructs and other constructs, and users will be on their own.

Michael Kay has said (in comment 4 on bug 29984) that the concept of
guaranteed streamability remains "far from pointless".  He may be
right, but at the moment I don't see it.

If we don't require conforming processors to report on non-portable
(i.e. non-guaranteed-streamable) constructs in streaming contexts,
then perhaps free-standing tools for diagnosing deviations from
guaranteed streamability will be possible (like the one John Lumley
described a couple of years ago at Balisage); perhaps they will be
made available.  Or perhap not.  

> Extending our rules this way still allows the interoperability given
> by sets (1) and (2). And users that want their stylesheets to be
> guaranteed streamable should simply stick to set (1). 

My difficulty with this advice is that in my view of the world, my
ability to stick to set (1) is a consequence of the rule in the spec
that ensures that if I have any doubt about whether a given construct
is streamable, any conforming processor which supports the streaming
feature is in a position to tell me whether it is or is not guaranteed
streamable.  

Without that guarantee, your advice seems all too like the advice
Wittgenstein gave when explaining why he thought having contradictions
in one's logic was not really important: just don't use the
contradictions to derive absurd results, and you'll be fine.  It is
the rules of logic which I rely upon to protect me from the derivation
of bogus results; it is the rules of error reporting by conforming
processors which I rely upon to protect me from inadvertently writing
non-conforming stylesheets, and from inadvertently depending on 
processor-specific streaming analysis.

> Should they
> consciously use set (3a/3b)? I don't think so. But if they
> unconsciously happen to be in that area, we don't require a processor
> to interrupt processing and raise an error.

It seems to me that this means that if I use processor A to check my
streaming stylesheet, and get no errors about constructs not
guaranteed streamable, then processor B is no longer guaranteed to
process my stylesheet in a streaming way.  

After all, processors A and B may use different sets of rewriting
rules.

Am I wrong?

If I am wrong, can  you make me see my error?

If I'm not wrong, do you believe that that's a desirable state of
affairs?  An undesirable but unavoidable state of affairs?  or ... ?

If a user asks "how do I write streamable stylesheets that will work
with any conforming processor which supports the streaming feature?",
and our only answer is "Well, no processor is required to diagnose any
not-guaranteed-streamable constructs, but if you just use the rules in
section 19 to hand-check your code, you'll be fine", my prediction is
that users will take one look at those rules and laugh in our faces at
the idea of checking them by hand.

Do you expect a different reaction?  Or do you believe we have a
better answer than "check your own code, by hand"?  If so, what is it?

> Allowing such rewrites is very similar to the way XPath allows
> rewrites that raise, or do not raise errors. That specification has
> gone to great lengths to allow optimizations and allow a certain
> leeway in whether or not a processor raises an exception. Does this
> mean that the concept of having exceptions is futile? I don't think
> so. Even if it means that certain tests have no guaranteed outcome
> (i.e. (1, 2/0)[1] may or may not raise an error).

If an instruction like 

  <xsl:apply-templates select="2/0[1]"/>

is not guaranteed to raise an error when not in code provably
unreachable, then I regret that I failed to object to whatever changes
made that possible.

> I think it is a good thing if we allow a similar kind of leeway with
> streamability. It is a far fuzzier subject than error raising, yet we
> are trying to be more strict about it. I don't think that is in the
> interest of end-users, optimizing implementations and future
> improvements to algorithms or type systems.

I believe that the crucial leeway to offer here is the ability to
stream anything if you can figure out how to stream it.  We have
worked hard to make it clear that implementations have that freedom.

I don't see why the equivalent of rule (e) in the Pascal Report is a
barrier to improving algorithms or type systems.  Nothing prevents an
implementation from improving its algorithms however it can, nor
should it.  I don't want to prevent your finding a better way to
stream things.  I do want to prevent my becoming dependent on your
implementation and unable to move to a different one because you
didn't warn me that my stylesheet was not guaranteed to be processable
by a different processor. (If I knew what I was getting into, that's a
different story.  Vendor dependencies I'm aware of can be documented
and dealt with when necessary.  It's vendor dependencies the user is
not and cannot be aware of that I would like to minimize.)

> By allowing the leeway only in a very limited set of provable
> expressions and constructs, I think we increase in usability and
> testability. 

What part of the proposal in bug 29984 identifies a limited set of
rewrites?

If we did identify a limited set of allowable or required rewrites,
which all processors were required to perform, and more than which
they were not allowed to perform, would not that stand in the way of
future improvements to algorithms and type systems?

If we don't define such a limited set of rewrites, how will a user
distinguish among

  - a stylesheet which is guaranteed to be streamable by any
    streaming processor

  - a stylesheet which *this* processor can stream, and which issues
    no messages about non-guaranteed-streamable constructs, because
    this processor has silently rewritten some non-GS expressions in
    GS form, but which is not guaranteed streamable by all streaming
    processors, because its streamability is a consequence of a very
    clever, slightly tricky rewrite rule which not all processors use

  - a stylesheet which is not guaranteed streamable as it sits, and
    not statically equivalent to a guaranteed-streamable stylesheet,
    but which *this* processor knows how to stream because it has a
    new algorithm based on quantum computing, which looks like magic
    to those who don't understand quantum computing

If a user cannot distinguish among these classes, what work is the
concept of guaranteed streamability doing in our spec?

> We no longer need a test that proves that (A + A) is
> non-streamable because we can show that (2 * A) is the same and is
> streamable, we can allow it be both streamable and non-streamable. The
> same way we allow it to sometimes raise an error and sometimes not.

You seem to be confusing here the concepts of streamability in fact
and guaranteed streamability.  Again.  Please don't do that.  If we
start confusing those two concepts -- or rather, if we start confusing
them yet again -- then this discussion is going to take even longer
and cause even higher temperatures.

It's quite clear that (A + A) can in principle be rewritten as (2 * A)
-- actually, it's not quite clear to me, but I'm going to ignore the
possibility of cases in which A+A and 2*A don't give the same results
because of arithmetic overflow, underflow, or type coercions -- and
thus is *streamable*.  That it is not *guaranteed streamable* is a
consequence of our not wishing to require specific optimizations from
processors.  Nothing we can do could possibly prove or require that (A
+ A) be non-streamable.

The point at issue is merely whether, given that it is not guaranteed
streamable, a processor should be required to report to the user that
the expression in line so-and-so is not guaranteeed streamable.  As a
user, I find such reports helpful; they help me avoid being locked in
to a single implementation.  

Since you tell me that they are a burden on an implementation, I'll
believe you.

But two observations make me think that the burden imposed on
implementors by the status quo is not all that heavy.

First -- if for some complex XPath expression A you rewrite (A + A)
internally as (2 * A), and then at some point discover that there is
an inescapable problem in expression A, what message do you give to
the user?  "Error in expression '(2 * /x/y/*/z/descendant::*/3/4[1])'"
is going to raise some eyebrows, if what was in the submitted
stylesheet was ($nwidgets + $nsprockets).  Is it really possible to
write useful error messages if you have lost all connection between
the user's original formulation of the expression and the rewritten
tree you are working with internally?

As a user of language-aware tools, I expect that despite your claim
that by the time it comes to testing for guaranteed streamability, you
no longer know what the user's expression actually was, your code
probably does know quite well what the user's expression was.  I may
be wrong, of course.

Second -- in the worst-case scenario, what would be involved in
conforming to the status quo rule?  I imagine several approaches are
possible.  (1) When the --report-gs-violations flag is set, certain
rewrite rules are rendered inactive, since you have established that
they can hide GS violations.  This might be inconvenient, if it makes
it impossible for the rewrite rules to reduce every expression to an
expression in the kernel syntax otherwise used by the implementation.
(2) When the --report-gs-violations flag is set, you do
*streamability* analysis once for real on the rewritten expressions
and you do *guaranteed streamability* analysis in a separate pass over
the user's expresssions, for the sake of the flag, cursing under your
breath all the while at the waste of effort.  In the extreme case, you
call an external program to do the guaranteed-streamability analysis,
and otherwise ignore the flag.

Michael Kay has explicitly said, in comment 4 of bug 29984, that he
thinks these are not realistic implementation strategies.  But I think
his remark boils down to saying that he doesn't want to do it that
way.


> I can understand that there's some resentment against this
> proposal. In fact, for a long time I was against it myself. However, I
> think there's room for improvement here without opening up the
> strictness of guaranteed-streamability too much, if at all (one can
> wonder, if a + b == b + a, and we allow a + b, but disallow b + a, we
> may have been writing the wrong kind of strictness in the spec, this
> bug is trying to address that).

You may be right.  I'm not sure "resentment" is the word I would have
used.  There is certainly "resistance"; the change proposed in bug
29984 seems to me to be trying to change a direction we established
in 2007, and I would probably resist such a change as too late, even 
if I didn’t think it was a horrible idea.

I agree that we may have done a lot of the wrong kind of work on our
definition of guaranteed streamability. I believe the best approach
was that suggested some years ago by Dimitre Novatchev: a very very
small set of constructs, of which it is easy to convince oneself that
it is (a) streamable and (b) sufficiently expressive, and for which it
is easy to see whether a given stylesheet mode does or does not stick
to the subset.  There was resistance to this approach on the grounds
that it would be much better to require a less Spartan approach on the
part of the stylesheet writer, and it early became clear to me that
Dimitre's proposal would not command consensus within the group.  

Michael Kay has been kind enough to provide a concrete example of this
resistance, in comment 4 of bug 29984:

    From a usability point of view, I think most users would prefer
    stylesheets in category B to be streamed rather than to be
    rejected: otherwise they will want to know "why is it streamable
    if I write it this way, and not if I write it that way, when the
    two are obviously equivalent".

And so we have spent much of the last several years trying to reduce
the distance between the concepts of guaranteed streamability and
factual streamability as far as possible.  I have tried not to be
tiresome about it, but if I have not been clear that I think this has
been a mistake, then let me be clear about it now.  We would have done
better to adopt Dimitre's proposal for a tiny, tiny subset of XSLT (as
long as it was Turing complete) that anyone could see at a glance
(perhaps also with a little thought) was streamable, and to leave all
the clever stuff to implementations instead of trying to make the
definition of guaranteed streamability clever.

One danger is that we may find that we have shifted, in the words of
C.A.R.Hoare, from seeking a design in which there are obviously no 
problems to seeking a design in which there are no obvious problems.


If we want to have both a concept of guaranteed streamability (which
conforming processors are required to check) and a concept of
streamability in fact (which ambitious processors will wish to extend
as far as possible, and must therefore also check), then we are
essentially committed to doing two streamability analyses, and have
been ever since the WG invented the concept of guaranteed
streamability.

If the definition of guaranteed streamability and a processor's
definition (or implementation) of streamability in fact have a natural
relationship to each other, so that the boolean flag
'guaranteed-streamable?' is automatically calculated as a side effect
of deciding on streamability in fact, then it's easy to do both.  You
and Michael have been saying, as long as I have heard either of you
address the question, that this really isn't the case, and cannot be,
because so many parts of streamability in fact rely on internals of an
implementation that no one wants to write into the spec.

If the definition of guaranteed streamability is so simple that it's
easily calculable (whether it's part of calculating streamability in
fact or not), then again it's easy to do both.  We have spent the last
several years ensuring that the spec's definition of guaranteed
streamability is not in fact that simple.

I understand MK to be saying, in his response to Anders Berglund
today, that both of these situations hold: guaranteed streamability
does not have a simple relation to streamability in fact, and it's too
complicated to calculate trivially on its own.

I agree that that's an unfortunate state of affairs, but I don't
understand how the implementors in the WG can be surprised by it: it
wasn't Dimitre or other members of the WG who have contributed most to
making the definition guaranteed streamability more and more
elaborate, and less and less easy to calculate independently of an
implementation's internal rewriting mechanisms.  

I can see several possible resolutions:

1 We adopt the proposal in bug 29984.  The definition of 'guaranteed
streamability' no longer defines a class of stylesheets or constructs
which conforming processors must recognize (and signal deviations
from).  WG members like me, who feel responsible for user interests,
either hold our noses or leave the WG.

2 We decide that we cannot require processors to report non-GS
constructs in streaming contexts, and we accept that under those
circumstances there is no useful notion of 'interoperability' for
streaming processors.  We remove the definition of guaranteed
streamability from the spec, since it does not work, and we tell users
"We have added a number of constructs intended to help processors
process stylesheets in a streaming way.  The behavior of any processor
may vary; actual ability to stream a stylesheet is completely
processor-dependent."

Removing the GS rules will require a significant editorial effort, so
the likelihood is that the result of such a decision is a disaster, as
the WG runs out of resources before finishing our work.

3 We retain the status quo.

3a Implementors grumble but conform; they express their disapproval by
giving the user flag for reporting non-GS constructs names like
--pedantic-streamability-mode.

3b Implementors refuse to conform, since they think the effort of
checking GS too high and see no benefit at all.  The upshot is that
there are no strictly conformant XSLT 3.0 processors for the
foreseeable future.

Of course other possibilities exist:

4 We deadlock, and eventually the WG dissolves without bringing 3.0 to
Rec.

Of these, my favorite is 3a; if that’s unachievable, I’m currently undecided 
over whether I would prefer 2, 4, or 3b.

********************************************
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
cmsmcq@blackmesatech.com
http://www.blackmesatech.com
********************************************
Received on Thursday, 8 December 2016 01:52:06 UTC