XProc Minutes for 20 Apr 2006 from Norman Walsh on 2006-04-20 (public-xml-processing-model-wg@w3.org from April 2006)

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Thu, 20 Apr 2006 12:49:45 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <871wvsfaza.fsf@nwalsh.com>
See http://www.w3.org/XML/XProc/2006/04/20-minutes.html

W3C[1]

                                   - DRAFT -

                            XML Processing Model WG

20 Apr 2006

   Agenda[2]

   See also: IRC log[3]

Attendees

   Present
           Henry, Michael, Andrew, Mohamed, Murray, Alessandro, Norman, Alex,
           Richard

   Regrets

   Chair
           Norm

   Scribe
           Norm

Contents

     * Topics
         1. Accept this agenda?
         2. Accept minutes from the previous teleconference?
         3. Next meeting: 27 Apr telcon
         4. Issue 3117: Should parallel execution of step be allowed by the
            language?
         5. Issue 3118: Should an implementation of the language be allowed
            to perform caching?
         6. Any other business?
     * Summary of Action Items

     ----------------------------------------------------------------------

  Accept this agenda?

   -> http://www.w3.org/XML/XProc/2006/04/20-agenda.html

   Alessandro suggests taking item 2.5 before 2.3

   Accepted

  Accept minutes from the previous teleconference?

   -> http://www.w3.org/XML/XProc/2006/04/13-minutes.html

   Accepted

  Next meeting: 27 Apr telcon

   Any regrets?

   None given

   <scribe> ACTION: Henry to provide registration page for August f2f
   [recorded in http://www.w3.org/2006/04/20-xproc-minutes.html#action01[6]]

   <ht> http://www.w3.org/2002/09/wbs/38398/XProcFTF2/[7] is now listed as an
   open questionnaire for our group [this completes HT's action --scribe]

   <scribe> ACTION: Murray to provide local arrangements info for August
   (ETA: two weeks) [recorded in
   http://www.w3.org/2006/04/20-xproc-minutes.html#action02[8]]

   MSM: One prominent way to get to the meeting will be to drive. Can we add
   some questions about car pooling to the registration form?

   Murray: I'm thinking about that, I'll see what makes the most sense.

   Henry: Let's use a wiki for that instead.

  Issue 3117: Should parallel execution of step be allowed by the language?

   -> http://www.w3.org/Bugs/Public/show_bug.cgi?id=3117

   Alessandro: This was raised in a call a few weeks ago.
   ... I don't know if we need to spend a whole lot of time on it. We
   probably don't want to add constructs to the language to control this if
   we can avoid it.

   Richard: I hope most of this falls out naturally. If we don't specify the
   order of execution where it isn't inevitable, that implicitly allows
   parallel execution. We don't initially have to say much about it.
   ... If you have two things that could be executed in parallel, maybe they
   will be. If you want to synchronize them, you have to provide some
   mechanism, such as reading a document that one is writing.

   Alex: I think we shouldn't disallow parallel execution.

   Richard: We shouldn't put anything in the language to accidentally prevent
   it.

   Norm: It sounds like we view parallel exec. just as an optimization.

   Richard: I take the normal unix pipeline as a model. If you have two
   processes running, nothing expresses the order except that if one is
   reading and one is writing, you can be sure the reader will block waiting
   for the writer.
   ... Another aspect is that any kind of streaming implies a certain kind of
   parallelism.
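
   A minimal sketch of the Unix pipeline model Richard describes, assuming
   Python threads stand in for pipeline steps and a bounded queue stands in
   for the pipe. The function names are hypothetical and were not discussed
   on the call. Nothing states which step runs first; the only ordering comes
   from the reader blocking until the writer has produced data.

```python
import queue
import threading


def writer(out_q, items):
    """Upstream step: emits items one at a time, then an end-of-stream marker."""
    for item in items:
        out_q.put(item)  # blocks if the reader has not consumed the last item
    out_q.put(None)      # sentinel: no more data


def reader(in_q, results):
    """Downstream step: consumes items as they arrive."""
    while True:
        item = in_q.get()  # blocks until the writer has produced something
        if item is None:
            break
        results.append(item.upper())


def run_pipeline(items):
    pipe = queue.Queue(maxsize=1)  # small buffer, like a pipe
    results = []
    t_reader = threading.Thread(target=reader, args=(pipe, results))
    t_writer = threading.Thread(target=writer, args=(pipe, items))
    # The reader is started first on purpose: start order does not matter,
    # because the data dependency alone synchronizes the two steps.
    t_reader.start()
    t_writer.start()
    t_writer.join()
    t_reader.join()
    return results
```

   Because each item is handed over as soon as it is written, the two steps
   overlap in time, which is the sense in which streaming implies a kind of
   parallelism.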

   Norm repeats summary.

   Richard: Not just features of the language, but also the way we describe
   the language. A processing model might have unintended consequences that
   prevent parallelism; we want to avoid that.
   ... An example: we might describe the processing language as if it
   executed the components in top-to-bottom, left-to-right order, which would
   be bad because it would imply that side-effects (if there are any) occur
   in a particular order.

  Issue 3118: Should an implementation of the language be allowed to perform
  caching?

   -> http://www.w3.org/Bugs/Public/show_bug.cgi?id=3118

   Alessandro: This is a specific question about a particular example.
   ... The stylesheet executed by the second step is produced by the first
   step.
   ... Should the pipeline engine be allowed to cache the stylesheet produced
   by the first step across invocations?
   ... Can the engine be smart enough to determine that the output will be
   the same and reuse a cached value?

   Norm: I think that what an engine does is not our problem.

   Richard: The answer, in some sense, is obviously yes. If the engine can
   determine that the same results will be produced, then it can use the
   cached copy.

   Richard: What does it mean for it to be exactly the same? Vanilla XSLT 1.0
   stylesheets can't produce any side effects.
   ... but care must still be taken to ensure that side effects don't happen.
   ... We may need a way to allow authors to express that some components are
   side-effect free.

   Alex: It would be interesting to consider annotating the steps
   ... You may be able to say "never cache" but maybe a smart impl could
   cache or not as it saw fit otherwise.

   Richard: There are some even simpler cases of caching. In the MT pipeline,
   we compile schemas and cache them. That means the same schema used in two
   places can reuse the cached copy.

   Alex: The more interesting case is where it's produced by the pipeline.

   <MoZ> alexmilowski, cache hints like expires in Cocoon ?

   Alex: The concept of a dynamically generated schema isn't far-fetched, but
   URIs that change every time you read them could be problematic.

   HT: The http expires case isn't good enough. The MT engine checks using
   the http refresh if stale every time anyone touches a cached resource
   because there's no way to count on pipeline time and internet time being
   similar.
   ... The actual time between two uses of a cached object may be wildly
   different from what you think it is. The only safe thing to do is ask
   the server each time.
   ... That works on a filesystem too.
   ... I'm not sure how that works in the context of documents generated by
   the pipeline.
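
   A sketch of the "ask each time" policy HT describes, in its filesystem
   form. It assumes that a file's modification time and size are an adequate
   freshness check; the class name and the loader callback are hypothetical,
   and this is not the MT engine's code. The cache never trusts an expiry
   time: every access re-checks the source before reusing the cached value.

```python
import os
import tempfile


class RevalidatingCache:
    """Cache that revalidates against the filesystem on every access."""

    def __init__(self, load):
        self._load = load    # callable taking a path, e.g. a schema compiler
        self._entries = {}   # path -> (freshness stamp, cached value)
        self.loads = 0       # number of times the loader actually ran

    def get(self, path):
        st = os.stat(path)   # the "ask the server each time" step
        stamp = (st.st_mtime_ns, st.st_size)
        entry = self._entries.get(path)
        if entry is not None and entry[0] == stamp:
            return entry[1]  # unchanged since last load: reuse cached copy
        value = self._load(path)
        self.loads += 1
        self._entries[path] = (stamp, value)
        return value
```

   This leaves HT's open question untouched: a document generated inside the
   pipeline has no file or server to ask, so some other freshness test would
   be needed there.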

   <Zakim> ht, you wanted to endorse the idea of annotation

   HT: I think that for practical reasons, I'd be very unhappy to see any
   requirement of no side-effects imposed on components.
   ... I think "escape to program execution" and "synchronous SOAP exchange"
   are examples of components that cannot have intrinsic guarantees of no
   side-effects.
   ... There are also cases of components that do database updates. Those
   components have a side-effect.

   Alex: Those aren't (necessarily) examples of pipeline steps communicating
   through side-effects.
   ... If you're going to have that synchronization problem, you'd set up a
   dependency for that.

   HT: I'm in favor of an approach which has a default and allows the
   component to assert the opposite.
   ... 1. Not side-effect free; even though my inputs are the same, you can't
   be sure I'll produce the same output and

   <MSM> [this seems to be an "i am not a function" declaration?]

   HT: 2. An expression of out-of-band dependencies.

   Norm observes that we've wandered into the issue of side-effects

   Norm: I think everyone will agree that if the pipeline knows the output
   will be the same, it can cache the result

   Alessandro: I'm not sure if this is too strong a statement. Consider the
   case of reading a stylesheet from a URI.

   MSM: Side-effects and caching are not separable questions.

   Alex: Caching is a feature of the implementation not the language

   Richard: Just a big switch will probably be too coarse-grained.
   ... I imagine descriptions for each component type and the XSLT component
   might, for example, say that it has no side-effects by default. But then
   on a particular case, you could override it.
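
   One way to picture the per-component default with a per-use override that
   Richard and HT describe. This is a sketch only: the component type names,
   the table of defaults and the field names are all hypothetical, and no
   syntax for such annotations was proposed on the call.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical defaults per component type: True means "side-effect free".
TYPE_DEFAULTS = {
    "xslt": True,        # vanilla XSLT 1.0 cannot produce side effects
    "soap-call": False,  # a remote exchange gives no such guarantee
}


@dataclass
class Step:
    component_type: str
    # None means "use the default for this component type".
    side_effect_free: Optional[bool] = None


def may_cache(step):
    """Return True if an engine could safely reuse this step's cached output."""
    if step.side_effect_free is not None:
        return step.side_effect_free  # a particular use overrides the default
    # Unknown component types are treated conservatively as not cacheable.
    return TYPE_DEFAULTS.get(step.component_type, False)
```

   For example, may_cache(Step("xslt")) follows the type default, while
   Step("xslt", side_effect_free=False) opts a single use out of caching.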

   Alex: The thing that concerns me about being able to say a component has
   side-effects is that it isn't clear what that means.
   ... Does it really affect the pipeline running?
   ... Unless there's some dependency in the flow-graph, what can the
   processor do?
   ... Unless we do something like what XSLT does with the document()
   function, I'm not sure there's a great answer here.

   Norm: I wouldn't stream past a component that had side-effects

   Norm describes the case of a SOAP service

   Alex: You have a pipeline with five steps, each has an auxinput that calls
   this SOAP service.
   ... If you cache, you'll get one answer. If you don't cache, you'll get
   five answers.
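
   Alex's example can be sketched as follows, assuming a stand-in service
   that returns a different answer on every call. The service, the request
   string and the function names are hypothetical; no real SOAP library is
   involved.

```python
import itertools


def make_service():
    """Stand-in for a remote service whose answer changes on every call."""
    counter = itertools.count(1)

    def call(request):
        return f"{request}:{next(counter)}"

    return call


def run_steps(service, n_steps, use_cache):
    """Run n_steps steps that each send the same request to the service."""
    cache = {}
    answers = []
    for _ in range(n_steps):
        request = "quote"
        if use_cache and request in cache:
            answers.append(cache[request])  # reuse the first answer
            continue
        response = service(request)         # a real call, with its effects
        if use_cache:
            cache[request] = response
        answers.append(response)
    return answers
```

   With use_cache=True the five steps all see the first answer; with
   use_cache=False the service is called five times and each step sees a
   different one. Whether that difference matters is exactly the side-effect
   question under discussion.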

   Norm will take the question to email

  Any other business?

   None.

   Adjourned.

Summary of Action Items

   [NEW] ACTION: Henry to provide registration page for August f2f [recorded
   in http://www.w3.org/2006/04/20-xproc-minutes.html#action01[11]]
   [NEW] ACTION: Murray to provide local arrangements info for August
   [recorded in http://www.w3.org/2006/04/20-xproc-minutes.html#action02[12]]
   [End of minutes]

     ----------------------------------------------------------------------

   [1] http://www.w3.org/
   [2] http://www.w3.org/XML/XProc/2006/04/20-agenda.html
   [3] http://www.w3.org/2006/04/20-xproc-irc
   [6] http://www.w3.org/2006/04/20-xproc-minutes.html#action01
   [7] http://www.w3.org/2002/09/wbs/38398/XProcFTF2/
   [8] http://www.w3.org/2006/04/20-xproc-minutes.html#action02
   [11] http://www.w3.org/2006/04/20-xproc-minutes.html#action01
   [12] http://www.w3.org/2006/04/20-xproc-minutes.html#action02
   [13] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
   [14] http://dev.w3.org/cvsweb/2002/scribe/

    Minutes formatted by David Booth's scribe.perl[13] version 1.127 (CVS
    log[14])
    $Date: 2006/04/20 16:45:08 $
Received on Thursday, 20 April 2006 16:50:14 UTC