Minutes for XProc WG telcon of 22 Dec 2005

Draft minutes are now available:

   http://www.w3.org/2005/12/22-xproc-minutes.html

Text copy below:

W3C[1]

                                   - DRAFT -

                            XML Processing Model WG

22 Dec 2005

   Agenda[2]

   See also: IRC log[3]

Attendees

   Present
           Norm, Rui, AlexM, Paul, Henry, Erik, Andrew, Michael

   Regrets
           Jeni, Richard, Alessandro

   Chair
           Norm

   Scribe
           Norman Walsh

Contents

     * Topics
         1. 1. Administrivia
         2. 1.1. Accept this agenda
         3. 1.2. Accept minutes from the previous teleconference
         4. 1.3. Next meeting: 5 Jan 2006. (No meeting 29 Dec 2005.)
         5. 1.4. Tech Plenary registration is now open
         6. 2. Technical
         7. 2.1. Use Cases
         8. 2.1.1. From Alex
         9. 2.1.2. From Andrew
        10. 2.1.3. From Jeni
        11. 2.1.4. From Norm
        12. 2.1.5. From Rui
        13. 2.2. Requirements
        14. Any other business
     * Summary of Action Items

     ----------------------------------------------------------------------

   <scribe> Scribe: Norman Walsh

   <scribe> ScribeNick: Norm

   Date: 22 Dec 2005

   <AndrewF> ??P7 is AndrewF

  1. Administrivia

  1.1. Accept this agenda

   Accepted

  1.2. Accept minutes from the previous teleconference

   Accepted

  1.3. Next meeting: 5 Jan 2006. (No meeting 29 Dec 2005.)

   Regrets for 5 Jan: None

  1.4. Tech Plenary registration is now open

   URI for registration: http://www.w3.org/2002/09/wbs/35125/TP2006/[4]

   Rui: Not sure if I can be present. I'll let you know when I know.

  2. Technical

  2.1. Use Cases

  2.1.1. From Alex

   Alex: Use cases:
   http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2005Dec/0011.html[5]
   ... I sent another version this morning. [Scribe notes it isn't in the
   archives yet]
   ... I'm using these pipelines for math, but they aren't specific to math.
   ... I'm using them for web serves; tagsoup is injected to turn random HTML
   back into proper XHTML

   <PGrosso> I see Alex's email at
   http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2005Dec/0021.html[6]

   Alex: Link I sent this morning includes pointers to online examples

   Thank you, PGrosso

   Alex: One of the components of my pipeline is the ability to pick out a
   subtree
   ... An important technical feature is the ability to have one transform
   create a piece of markup that points to another transform.
   ... pipelines are sometimes embedded in the source documents

   Norm: Did I understand correctly: you have a pipeline where one step
   generates the pipeline that's used for subsequent steps?

   Alex: No, they generate data.
   ... your pipeline is what it is, in the apply-xslt example, it's XSLT that
   decides what needs to be done next
   ... what you really want to do is decide based on data what needs to be
   done (including possibly making up what needs to be done on the fly)
   ... In Cocoon you do this by redirecting to a new pipeline
   ... Each step can generate data that might cause subsequent stages to do
   something
   ... The interface to the tides example is a screen-scraping example; the
   component (using tag soup? --scribe) extracts data from a web page to find
   the tide data
   ... That's two pipelines that work together

   Michael: we shouldn't rathole here, but it's not clear to me whether the
   facilities that Alex is talking about are or are not feasible in Cocoon.

   Hopefully this can be clarified in email

   Alex: You can do a lot of these in Cocoon, but they have a simple
   one-level sequence path and I've got a more hierarchical model. Processing
   subtrees is like having embedded pipelines. That's hard to do in Cocoon
   because of their syntax.

   Alex explains that the pipeline steps aren't dynamic but, for example, the
   selection of a particular stylesheet in a transformation step might be
   data driven.
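
   [Editorial sketch, not part of the meeting record: one way to read the
   point above is that the list of steps stays fixed while a step picks its
   stylesheet from the data it receives. Python's standard library has no
   XSLT processor, so the "stylesheets" here are plain functions; their
   names, the kind attribute, and the STYLESHEETS table are all assumptions
   made for illustration.]

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for stylesheets: each takes and returns an Element.
def render_article(doc):
    out = ET.Element("html")
    ET.SubElement(out, "h1").text = doc.findtext("title")
    return out

def render_note(doc):
    out = ET.Element("html")
    ET.SubElement(out, "p").text = doc.findtext("title")
    return out

# Assumed mapping from a value found in the data to the transform to apply.
STYLESHEETS = {"article": render_article, "note": render_note}

def transform_step(doc):
    """A fixed pipeline step whose stylesheet is selected from the document."""
    kind = doc.get("kind", "article")
    return STYLESHEETS[kind](doc)

def run_pipeline(doc, steps):
    """Apply a static sequence of steps; only the data varies between runs."""
    for step in steps:
        doc = step(doc)
    return doc

source = ET.fromstring('<doc kind="note"><title>Tides</title></doc>')
result = run_pipeline(source, [transform_step])
print(ET.tostring(result, encoding="unicode"))
```

   [The pipeline definition never changes here; swapping kind="note" for
   kind="article" changes only which transform the step applies.]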

  2.1.2. From Andrew

   Andrew: I put two simple cases, but I'm trying to make a couple of points.
   ... First, conditionality is required. Second, we'd like to have each
   component be independent, but sometimes we need to pass parameters from
   one stage on to the next.

   Norm: When I spoke about not needing to pass parameters, it was just an
   observation. I'm not surprised that sometimes it's needed.

   Alex: Can a particular step set a parameter for a later step?

   Andrew: There are user-set parameters when you invoke the pipeline, but
   there's no other kind of input.

   Alex: I have an example where stages can bind parameters for later stages.
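
   [Editorial sketch, not part of the meeting record: the exchange above
   contrasts user-set parameters with stages that bind parameters for later
   stages. A minimal way to model the second case is to have every stage
   take and return a (document, parameters) pair. The stage names and the
   lang attribute are hypothetical.]

```python
import xml.etree.ElementTree as ET

def detect_language(doc, params):
    """Stage 1: inspect the document and bind a parameter for later stages."""
    bound = dict(params)  # copy so the caller's parameters are not mutated
    bound["lang"] = doc.get("lang", "en")
    return doc, bound

def add_greeting(doc, params):
    """Stage 2: behave according to a parameter bound upstream."""
    greetings = {"en": "Hello", "fr": "Bonjour"}
    ET.SubElement(doc, "greeting").text = greetings.get(params["lang"], "Hello")
    return doc, params

def run(doc, stages, user_params=None):
    """Thread the document and the parameter set through each stage in order."""
    params = dict(user_params or {})
    for stage in stages:
        doc, params = stage(doc, params)
    return doc, params

page, final_params = run(
    ET.fromstring('<page lang="fr"/>'), [detect_language, add_greeting]
)
print(final_params["lang"], page.findtext("greeting"))
```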

  2.1.3. From Jeni

   Jeni isn't here, alas

   Alex: Jeni makes the point that some steps are made up of sub-pipelines

   Michael: One thing that becomes very clear is that quite frequently you
   seem to have a choice of where to put certain kinds of functionality
   ... Conditional processing, for example, can be handled by choosing whether
   to invoke stylesheet a or stylesheet b or by writing a stylesheet that
   checks a condition and then operates in mode a or mode b.

   Michael: We seem to get to choose whether to put the complexity in the
   pipeline or in the individual stages.
   ... in her use case, she imagined parsing the ... scribe was distracted

   <PGrosso> flanneling around?

   Michael: it's not always absolutely clear what the implication of moving
   the complexity around is

   Alex: putting all the complexity in a stylesheet makes the stylesheet hard
   to maintain

   Michael: That's one reason to let some of the complexity percolate up. But
   it's not clear how to balance those tradeoffs

   Alex: you can write extensions to XSLT and one of those could evaluate a
   pipeline

   Norm: I think it's a mistake to focus so exclusively on XSLT as there are
   other kinds of components

   <ebruchez> True

   <Zakim> ht, you wanted to point to step (1c)

   <MSM> alexmilowski: one way to think about these questions is to look at
   the kinds of extension functions people have written for XSLT 1 --
   sometimes those functions are there only because of deficiencies in the
   environment, and represent functionality that 'really' belongs in a
   pipelining language

   ht: Jeni's step 1c is clear about the fact that it merges the output from
   1a and 1b. Maybe Jeni's case is a little simpler than Alex's.
   ... It's clearly a requirement at some stage, though not clear that it has
   to be in V1

  2.1.4. From Norm

   <ht> Norm: Most of my examples are straightforward

   <ht> ... Two interesting wrt V1 or not

   <ht> ... First a sub-pipelining example similar to Jeni's 1c

   <ht> ... Second where a step produces an indeterminate number of e.g.
   chapter.html files, each of which has to go through further processing

   <ht> ... When you know exactly how many files will be output, it's clear
   how to do this in a fairly simple pipeline language, but when this isn't
   known in advance, not clear what to do

   <ht> AM: XQuery and XSLT2 are clear examples of this, which I've thought
   of in terms of thinking of the output as a sequence of documents

   <ht> ... This is parallel to the XPath-based viewport abstraction which I
   and others have been using

   <ht> EB: Problem with sequence is that the items aren't named

   <ht> ... Also, loss of symmetry, wrt names and/or cardinality, wrt inputs
   and outputs of steps

   <ht> NW: In my example they'd be named

   <ht> EB: In XSLT2 case, yes, but not in others. . .

  2.1.5. From Rui

   <ebruchez> ht, comment about sequences was from Erik

   Rui: mine are similar to previous examples.
   ... In the first use case, if you have an XSLT pagination, you'll create a
   huge set of documents. But the main document will not be used further
   ... should we allow the pipeline author to express this?
   ... in the next scenario the question is one of reuse and composition of
   pipelines
   ... should we use XInclude, or would we like to have another sort of
   language to express the composition
   ... If you have one pipeline with a component that outputs more than one
   document, and those documents are needed by the next component, how can
   you be sure that the right documents were generated?
   ... Do we need a way to specify that a certain number or kind of documents
   will be produced at runtime
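
   [Editorial sketch, not part of the meeting record: the chapter.html and
   pagination cases above both involve a step whose number of outputs is not
   known in advance. One possible model is a step that yields a sequence of
   (name, document) pairs, which keeps each item named while leaving the
   count open. The function names and the naming scheme are assumptions.]

```python
import xml.etree.ElementTree as ET

def split_chapters(book):
    """Yield one (name, document) pair per chapter; the count is data driven."""
    for number, chapter in enumerate(book.findall("chapter"), start=1):
        page = ET.Element("html")
        ET.SubElement(page, "h1").text = chapter.findtext("title")
        yield f"chapter{number}.html", page

def add_footer(named_docs):
    """Further processing applied to every document in the sequence."""
    for name, page in named_docs:
        ET.SubElement(page, "p").text = f"Generated as {name}"
        yield name, page

book = ET.fromstring(
    "<book><chapter><title>One</title></chapter>"
    "<chapter><title>Two</title></chapter></book>"
)
outputs = dict(add_footer(split_chapters(book)))
for name in sorted(outputs):
    print(name, ET.tostring(outputs[name], encoding="unicode"))
```

   [A consumer that needs a particular document can check for its name in
   outputs before continuing, which is one answer to the question of how to
   be sure the right documents were generated.]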

   Erik: it looks like I did not get Rui's use cases. I got a blank email.

   Several other people had problems. The MIME seems to have been garbled.

   This is probably a consequence of a MIME message being forwarded by
   another client that does MIME

   Use cases from Erik:
   http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2005Dec/0020.html[7]

   Erik: I did not really connect those use cases to XPL which is the
   language that we have designed and a variation of which we submitted to
   W3C
   ... These are just use cases from actual clients
   ... I thought it would be useful to categorize the environments where
   these are processed.
   ... I found three broad categories: command-line/batch environments, web
   environments, and service environments
   ... I'm just trying to see if there are requirements that haven't been
   mentioned, I don't think so, except perhaps the question of validation
   ... is validation a custom-component pipeline step or is it something that
   is part of the pipeline language
   ... Very often our use cases start by saying "we need an XML document";
   often they begin with a URI, but in some cases you can just consider
   passing a document to the pipeline itself, implying that the pipeline
   itself can receive and produce XML documents
   ... Looking at it this way, you can imagine that a pipeline interpreter
   might be a component in its own right.
   ... the second set of use cases involve conditionals. Our current thinking
   is that we do have many use cases that require conditionals.
   ... One of them is a conditional database access logic. Query a document,
   look at the content, if there's something then do an update otherwise do
   an insert.
   ... Another scenario is content-dependent transformation; the
   transformation selected is determined by the output of a previous stage.
   ... Another example is where you want to generate a particular document
   for desktop or mobile browsers. The configuration from the outside
   determines which stylesheet (or pipeline? or pipeline stage? --scribe) is
   used.
   ... Another common use case is the selection of Atom or RSS1 or RSS2,
   etc., when generating feeds
   ... The next use case is a little different. Here an XML pipeline is used
   to implement an XML-RPC service. In the request you have method calls with
   method names. The sub-pipeline that's executed is determined by the method
   selected in the request.
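
   [Editorial sketch, not part of the meeting record: the XML-RPC case above
   amounts to reading methodName from the request and dispatching to a
   sub-pipeline. The method names and the two sub-pipelines below are
   hypothetical; the request layout follows the usual XML-RPC methodCall
   structure.]

```python
import xml.etree.ElementTree as ET

def add_pipeline(request):
    """Hypothetical sub-pipeline: sum every <int> parameter."""
    return sum(int(value.text) for value in request.iter("int"))

def echo_pipeline(request):
    """Hypothetical sub-pipeline: return the first <string> parameter."""
    return request.findtext(".//string")

# Assumed registry mapping XML-RPC method names to sub-pipelines.
SUB_PIPELINES = {"math.add": add_pipeline, "text.echo": echo_pipeline}

def dispatch(request_xml):
    """Select the sub-pipeline from the methodName carried in the request."""
    request = ET.fromstring(request_xml)
    method = request.findtext("methodName")
    if method not in SUB_PIPELINES:
        raise ValueError(f"no sub-pipeline for method {method!r}")
    return SUB_PIPELINES[method](request)

REQUEST = """<methodCall>
  <methodName>math.add</methodName>
  <params>
    <param><value><int>2</int></value></param>
    <param><value><int>3</int></value></param>
  </params>
</methodCall>"""

print(dispatch(REQUEST))
```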

   Norm: I wasn't sure at what level you needed to make conditional selection

   Erik: Based on whether an XPath expression returns true, you will execute
   a particular branch of the pipeline. Otherwise, you test another
   condition, etc. It's completely external to the components.
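
   [Editorial sketch, not part of the meeting record: a conditional that is
   external to the components can be modelled as an ordered list of (XPath
   test, branch) pairs plus a fallback. Python's ElementTree supports only a
   limited XPath subset, so "returns true" is approximated here as "selects
   at least one element". The branch names echo the update-or-insert case
   mentioned earlier and are hypothetical.]

```python
import xml.etree.ElementTree as ET

def choose(doc, branches, otherwise):
    """Run the first branch whose XPath test matches, else the fallback.

    The branches themselves never see the test expression.
    """
    for xpath, branch in branches:
        if doc.find(xpath) is not None:
            return branch(doc)
    return otherwise(doc)

def update_record(doc):
    """Hypothetical branch taken when the query found an existing record."""
    return "update " + doc.find("./record").get("id")

def insert_record(doc):
    """Hypothetical fallback taken when the query found nothing."""
    return "insert"

BRANCHES = [("./record", update_record)]

found = ET.fromstring('<result><record id="7"/></result>')
empty = ET.fromstring("<result/>")

found_action = choose(found, BRANCHES, insert_record)
empty_action = choose(empty, BRANCHES, insert_record)
print(found_action, "/", empty_action)
```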

  2.2. Requirements

   Erik: there are a few more use cases, that involve iteration.
   ... consider a collection of files on disk. You'd read a list of
   documents, either from a file or with a component that can scan the
   filesystem, you want to iterate on that list of documents and for each
   iteration you want to perform a sequence of steps. Alternatively, you may
   want to combine all the results together.

   Norm: that sounds like a collection

   Erik: That's almost an implementation question. If your language supports
   multiple outputs in a dynamic way then maybe you can do that. But here the
   idea is that you want to perform a certain number of tasks, perhaps once
   for each element that matches an XPath expression.

   Alex: that sounds a lot like the concept of identifying subtrees in an
   infoset using an xpath
   ... I've been experimenting with another kind, where you have a document
   that contains 10 entries and points to the next document with the next 10
   entries, etc. That seems completely different.

   Erik: we've identified two types of iteration; one is a for-each, the
   other is a while.
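
   [Editorial sketch, not part of the meeting record: the two kinds of
   iteration named above can be contrasted in a few lines. for_each runs the
   same steps once per element selected by an XPath; while_next keeps
   fetching documents for as long as the current one points to another. The
   next attribute, the PAGES table, and the helper names are assumptions
   standing in for real documents fetched by URI.]

```python
import xml.etree.ElementTree as ET

def for_each(doc, xpath, steps):
    """Run the same steps once per element selected by the XPath."""
    results = []
    for node in doc.findall(xpath):
        for step in steps:
            node = step(node)
        results.append(node)
    return results

def while_next(start, fetch, steps):
    """Process documents while the current one points to a next one."""
    results = []
    key = start
    while key is not None:
        page = fetch(key)
        key = page.get("next")  # None when there is no further page
        for step in steps:
            page = step(page)
        results.append(page)
    return results

# Hypothetical in-memory stand-in for documents that would be fetched by URI.
PAGES = {
    "p1": '<page next="p2"><entry>a</entry><entry>b</entry></page>',
    "p2": "<page><entry>c</entry></page>",
}

def fetch(key):
    return ET.fromstring(PAGES[key])

def entry_texts(page):
    return [entry.text for entry in page.findall("entry")]

def file_name(node):
    return node.get("name")

listing = ET.fromstring('<files><file name="a.xml"/><file name="b.xml"/></files>')
names = for_each(listing, "./file", [file_name])
pages = while_next("p1", fetch, [entry_texts])
print(names, pages)
```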

  Any other business

   Norm: I propose we continue with iteration next week. And begin looking at
   requirements.

   Proposal: for 5 Jan, everyone submit a list of possible requirements so
   that we can begin to select the ones upon which we have consensus

   Accepted.

   Norm wishes the group happy holidays

   Adjourned

Summary of Action Items

   [End of minutes]

     ----------------------------------------------------------------------

   [1] http://www.w3.org/
   [2] http://www.w3.org/XML/XProc/2005/12/22-agenda.html
   [3] http://www.w3.org/2005/12/22-xproc-irc
   [4] http://www.w3.org/2002/09/wbs/35125/TP2006/
   [5] http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2005Dec/0011.html
   [6] http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2005Dec/0021.html
   [7] http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2005Dec/0020.html

                                        Be seeing you,
                                          norm

-- 
Norman.Walsh@Sun.COM / XML Standards Architect / Sun Microsystems, Inc.

Received on Friday, 23 December 2005 14:53:01 UTC