- From: Erik Bruchez <ebruchez@orbeon.com>
- Date: Wed, 01 Mar 2006 14:55:52 +0100
- CC: public-xml-processing-model-wg@w3.org
One minor comment: s/Eric/Erik

Also, I don't know what my "In XPL everything is in scope" means, as
that is not the case in XPL. I don't remember what I meant to say.

-Erik

Alex Milowski wrote:
>
> XPL Presentation (See presentation)
>
> * Michael: (Clarification)
>   The 'infosetref' attribute represents the binding and the names
>   are internal to the component
>   The 'name' attribute is the formal parameter name.
>
> * Eric: the p:input and p:output declare the name of the inputs and
>   outputs that are used to invoke the process and handle the results
>
> * Norm: (Clarification)
>   It is the pipeline processor that looks at the inputs and outputs?
>
> * Eric: the inputs and outputs are evaluated in a lazy fashion and it
>   back-chains through the steps which eventually leads to the input of
>   the pipeline.
>
> * Alex: (Clarification)
>   How does back chaining work with conditionals?
>
> * Eric: The output of conditionals needs to have the same infoset name.
>
> * Eric: XHTML example (use case 5.15: Content-Dependent
>   Transformations)
>   - one of the use cases.
>   - one of the steps rewrites the QNames for presentation in IE
>   - one of the steps deals with HTML serialization
>   - the output for serialization uses an internal root element node
>     for representation of text and binary (character encoded)
>
> * Eric: Iteration example:
>   - lets you iterate over a document via an XPath expression
>   - the current() function gives you the current item being
>     iterated
>   - gives you the ability to process large XML documents
> * Murray: Does each of the steps have its own XML vocabulary (e.g. HTTP
>   serializer)?
> * Eric: Yes.
> * Richard: Do they require their own namespaces?
> * Eric: No, it isn't required, as it is contextual to the
>   component. Having another namespace adds declarations to the
>   document.
>
> GUI Tool Sub-thread:
>
> * Richard: Do you have a GUI tool?
> * Eric: No.
> * Richard: we should define the tool in terms of a graph
> * Norm & Michael expressed concern with this as they wouldn't
>   want to require a GUI tool. Starting with a graph could
>   ignore the XML representation
>
> Norm's SXPipe:
>
> * http://norman.walsh.name/2004/06/20/sxpipe
>
> * Stages are executed in order. Each stage is handed a DOM and
>   returns a DOM.
>
> * In the example, the skip attribute allows steps to be skipped. If
>   statically evaluated to true, the step isn't executed.
>
> * Impl: two methods: init & run. Init is passed the element that
>   represents the stages. 1700 lines of Java.
>
>
> (Alex's presentation here)
>
>
> Richard's presentation:
>
> * I want to replace what we do today without a pipeline with an XML
>   pipeline.
>
> * lxgrep - produces a tree fragment (multiple root elements possible)
>   via an XPath
>
> * lxprintf - formats XPath matches as plain text
>
>     -e element   For each element
>
> * lxreplace - replaces elements/attributes
>     -n   Renames an element
>
> * lxsort - sorts elements by values identified by an XPath
>
> * lxviewport - runs a Unix command on everything that matches an
>   element (like subtree in smallx, viewport in MT pipelines)
>
> * lxtransduce - ??
>
> * want to make these pipelines more declarative so people can use them
>   without writing code.
>
> * XSLT is also available
>
> Rui Lupis: (see presentation)
>
> * APP: Architecture for XML Processing
>
> * Complex processing support for digital libraries - both developers and
>   producers
>
> * Always a need for some manual purposes.
>
> * Tiers: a set of pipelines working on disjoint inputs
>
> * Pipeline: acyclic digraph of processors
>
> * Processor: defined by a URI that differentiates an interface vs
>   implementation vs usage.
>
> * Processing language:
>
>   Project: an RDF document
>
>   Pipeline: mapped to a linear sequence of components
>
>   Registry: An RDF document that registers components & their inputs
>   and outputs
>
> * Pros:
>   * Separation of concerns lets you interchange components without
>     touching the pipelines.
>   * It's an implementation-neutral language
>   * and others
>
> * Cons:
>   * No iteration/test
>   * RDF based
>   * Doesn't support generation of XSLT stylesheets
>   * Doesn't support chunking
>
> * Thoughts:
>   * Good to have multiple levels of composition (not just XInclude)
>   * Indirection is good for batch processing
>
> Alex: The model is that you define a particular step in the registry
>   that is a binding, for example, of an XSLT transform to its
>   input+parameters
>   to its output. A pipeline then points to that step and the step
>   can be re-used in other pipelines.
>
> * If the registry changes, the pipeline doesn't have to change.
>
>
> Infosets:
>
> Murray:
>   * stdin & stdout
>   * then there are parameters
>   * then there is the notion of input & output
>   * then there is the notion of an infoset on the side
>   * then there is the notion of artifacts
>   * e.g. on a server you might want to store things in a cache
>
> Norm:
>   * storing on a filesystem can be abstracted to the idea that outputs
>     have a URI and a processor can decide to write them out to disk
>     if it wants. Whether that happens isn't a relevant problem.
>
> Richard:
>   * It is quite likely an implementation will need to buffer things
>     if you have a pipeline that isn't just a straight line.
>
> Eric:
>   * In XPL everything is in scope
>
> Richard:
>   * there is no guarantee that you read things at the same rate, so
>     you have to buffer
>
> Murray:
>   There's still an output being buffered & cached. If you produce
>   foo.infoset as an output and later consume foo.infoset, then you
>   need to store that.
>
> Eric: you could have an implementation that buffers things to memory or
>   alternatively to a disk cache if it is too big
>
> Murray:
>   Before today, I was thinking this was like a Unix pipe.
>   They could be bringing in separate things, but there is still just a
>   pipeline.
>   Most things talked about today don't seem like pipelines.
>
> Richard:
>   My stuff is a Unix pipeline.... but that's "just an implementation
>   hack" that uses shell programming.
>
> Eric: The reason you want to serialize is?
>
> Richard: Because I have a bunch of programs that run on files. I want
>   a language that I can still compile to scripts that serialize to
>   files.
>
>   There are other things that things like schema validation might do
>   that may not be able to be serialized
>
> MSM: It is possible to define a non-standard PSVI serialization
>
> Eric: You can always do this by wrapping components that always
>   serialize
>
> Norm:
>   * there are simple components where one document comes in and one
>     goes out
>   * there are other ways to think about things like XSLT:
>     - there is one input and an ancillary input (the stylesheet) and
>       one output
>     - but this isn't always fixed
>
> Alex: Having a primary input is necessary for streaming
>   implementations.
>
> Murray: In what case is the stylesheet the input?
>
> Norm: I have a report that is coming out and the report is always the
>   same (the input document), but the XSLT is what is generated by
>   the pipeline.
>
> MSM: Why is there emphasis on backward chaining?
>
> Eric: (diagram on chart w/ parallel steps that start from the same
>   start and are aggregated at the end)
>
>   Back chaining is because a step can optionally decide not to get an
>   input. It isn't that easy for a user to understand.
>
>   Specifying order is natural and is a problem. Users do have
>   problems with [controlling] order. You have this problem with XSLT.
>
> Richard: what drives things in XSLT is apply-templates--and that is
>   not backward chaining.
>
>   parallel paths are the 1% case
>
> Alex: There is a whole body of knowledge that deals with network flows
>   and we should be in compliance with those known concepts and
>   algorithms.
>
> All: [to Alex] You're going to have to prove that you need stdin for
>   optimization.
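
For readers of the archive, the back chaining discussed in the minutes above (outputs are evaluated lazily, and a step can decide not to get an input) can be illustrated with a small Python sketch. This is an illustration only, not XPL syntax or any XPL API: `Step`, `Pipeline`, `read`, `choose` and the step names `upper`, `reversed` and `result` are all hypothetical, and strings stand in for infosets.

```python
from typing import Callable, Dict, List

# Hypothetical names throughout; this is not XPL syntax or an XPL API.
Read = Callable[[str], str]


class Step:
    """A step names one output and computes it on demand."""

    def __init__(self, output: str, run: Callable[[Read], str]) -> None:
        self.output = output
        self.run = run  # receives a read(name) callback for pulling inputs


class Pipeline:
    def __init__(self, steps: List[Step], sources: Dict[str, str]) -> None:
        self.producers = {step.output: step for step in steps}
        self.cache = dict(sources)  # pipeline inputs are already available
        self.executed: List[str] = []

    def read(self, name: str) -> str:
        """Return the named value, running its producer only if needed."""
        if name not in self.cache:
            step = self.producers[name]
            self.cache[name] = step.run(self.read)  # chain back through inputs
            self.executed.append(name)
        return self.cache[name]


def choose(read: Read) -> str:
    upper = read("upper")
    # The step decides at run time whether it needs its second input.
    return upper + read("reversed") if upper.startswith("X") else upper


pipeline = Pipeline(
    steps=[
        Step("upper", lambda read: read("source").upper()),
        Step("reversed", lambda read: read("source")[::-1]),
        Step("result", choose),
    ],
    sources={"source": "hello"},
)

# Evaluation starts from the requested output and works backwards.
result = pipeline.read("result")
print(result, pipeline.executed)
```

Because evaluation starts from the requested output, the `reversed` step is never run here: nothing asks for it. A forward, order-driven scheduler would have had to run it, or be told explicitly to skip it.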
Received on Wednesday, 1 March 2006 13:58:28 UTC