Re: Static streamability, regarding bug 29984 from Carine Bournez on 2016-12-08 (public-xsl-wg@w3.org from December 2016)

From: Carine Bournez <carine@w3.org>
Date: Thu, 8 Dec 2016 09:26:48 +0000
To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Cc: public-xsl-wg@w3.org
Message-ID: <20161208092648.GO22672@people.w3.org>
[trying to sum up and stick to what I thought was the most important,
it may still be too long and quoting too much, sorry]



On Wed, Dec 07, 2016 at 06:51:28PM -0700, C. M. Sperberg-McQueen wrote:
[...]
> > The current rules are:
> 
> > A) If any construct falls within set (1), it is *always*
> >    guaranteed-streamable, regardless of processor.
> 
> I would go slightly further:  it's not just that every processor
> will agree that it falls into the class "guaranteed streamable",
> but also that every conformant streaming processor is
> guaranteed to process it in a streaming way.

GS is a sort of contract between the user and the processors.
This is actually the most important point to keep in mind for the decision 
on the bug, because if we break that, then GS is completely useless.


> > B) If any construct falls within set (2), some processor *may* be able
> >    to process it in a streamable way
> 
> > The issue in Bug 29984 defines a third set, by two definitions:
> 
> > 3a) the set of constructs that by static rewriting is
> >     guaranteed-streamable
> > 3b) the set of constructs that by static analysis never accesses a
> >     streamed node
> 
> > In the bug report Michael Kay showed an example of a construct that is
> > not an expression but is still trivially streamable, yet not by our
> > rules unless we allowed (3a). 
> 
> I think you mean it is trivially streamable, but not guaranteed
> streamable under our rules.  Unless we have changed our design
> radically while I was not looking, our rules never say that something
> is not streamable.

Our spec never says that something that is not GS is not streamable, indeed,
so I don't think the status quo is unfair to users.
BUT
Adding a rule to extend the initial contract without breaking it is also 
a possible route.

> > The rule of (3b) is currently in our
> > spec in some places (i.e. on axis steps). An (extreme) example is
> > (foo, bar)[0], which always returns the empty sequence.
> 
> > If we were to accept either (or both) these rules we can say:
> 
> > C) If any construct falls within set (3a) or (3b) it is *always*
> > guaranteed streamable, but it is processor-dependent whether this is
> > detected
> 
> I think that's true; we could say that.
> 
> The problem is that in that case I no longer see the point of defining
> the class of guaranteed streamable constructs.
> What kind of "guarantee" is it, if processors are not guaranteed to
> stream the construct in question?

I don't think it necessarily has to be in the "same" contract as the GS
constructs.

GS means interop and what cmsmcq seems to consider the minimum condition
for portability (that particular goal is a possible cause of disagreement
in the WG; some may not see it as a portability guarantee...)

Proposed rules 3a and 3b are still interesting for the user, but don't 
qualify as GS as soon as the guarantee is not there (may not work on other
processors).

[...]
> In that tradeoff, I am unapologetically on the side of the user.  I do
> not believe that users are well served by invisible dependencies on
> specific implementations.

However, in this case "dependencies" on the implementation might not be
that "invisible".

[...] 
> > Extending our rules this way still allows the interoperability given
> > by sets (1) and (2). And users that want their stylesheets to be
> > guaranteed streamable should simply stick to set (1). 
> 
> My difficulty with this advice is that in my view of the world, my
> ability to stick to set (1) is a consequence of the rule in the spec
> that ensures that if I have any doubt about whether a given construct
> is streamable, any conforming processor which supports the streaming
> feature is in a position to tell me whether it is or is not guaranteed
> streamable.  

That should still be the case. Adding 3a and 3b should not alter the
GS evaluation of other cases (i.e. not breaking the current contract
between users and processors).

> > Should they
> > consciously use set (3a/3b)? I don't think so. But if they
> > unconsciously happen to be in that area, we don't require a processor
> > to interrupt processing and raise an error.
> 
> It seems to me that this means that if I use processor A to check my
> streaming stylesheet, and get no errors about constructs not
> guaranteed streamable, then processor B is no longer guaranteed to
> process my stylesheet in a streaming way.  

Processing a construct and reporting it as GS are different things.
Otherwise, why do we have a notion of GS in the first place?

[...]
> I believe that the crucial leeway to offer here is the ability to
> stream anything if you can figure out how to stream it.  We have
> worked hard to make it clear that implementations have that freedom.

I think we still all agree on that.

[...]
> The point at issue is merely whether, given that it is not guaranteed
> streamable, a processor should be required to report to the user that
> the expression in line so-and-so is not guaranteed streamable.  As a
> user, I find such reports helpful; they help me avoid being locked in
> to a single implementation.  
> 
> Since you tell me that they are a burden on an implementation, I'll
> believe you.
> 
> But two observations make me think that the burden imposed on
> implementors by the status quo is not all that heavy.

That's a question for the implementors. Is the status quo manageable?

[...]
> Second -- in the worst-case scenario, what would be involved in
> conforming to the status quo rule?  I imagine several approaches are
> possible.  (1) When the --report-gs-violations flag is set, certain
> rewrite rules are rendered inactive, since you have established that
> they can hide GS violations.  This might be inconvenient, if it makes
> it impossible for the rewrite rules to reduce every expression to an
> expression in the kernel syntax otherwise used by the implementation.
> (2) When the --report-gs-violations flag is set, you do
> *streamability* analysis once for real on the rewritten expressions
> and you do *guaranteed streamability* analysis in a separate pass over
> the user's expressions, for the sake of the flag, cursing under your
> breath all the while at the waste of effort.  In the extreme case, you
> call an external program to do the guaranteed-streamability analysis,
> and otherwise ignore the flag.

That proposal of using an external tool baffles me. To me, it is the
same as trying the code on a separate processor to see if it works there
too. From the user's POV, it does not bring a lot. If GS reports are useless
because in most cases all processors will rewrite and process anyway,
then either 1) we are in a particular corner case where the user has
written very weird constructs,
or 2) GS is too restrictive and we have failed to make it useful after all.

So I tend to think that the current debate is about evaluating whether we
are in situation 1) or 2).

Abel is proposing to extend GS with
"construct that can statically be determined to never require access to
a streamed node but that would otherwise not be guaranteed streamable
according to the rules in this specification"

Mike Kay seems to advocate a much more radical change, one that is closer
to dropping GS entirely (actually, changing it to a concept internal to
the processor rather than a contract between the user and processors).


-- 
Carine Bournez /// W3C Europe
Received on Thursday, 8 December 2016 09:26:57 UTC