- From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
- Date: Mon, 13 Mar 2006 20:22:08 +0000
- To: XML Processing Model WG <public-xml-processing-model-wg@w3.org>
- Cc: Norman Walsh <Norman.Walsh@Sun.COM>
[Second attempt - these were first sent 8 March, but have not shown up in the archive or in anyone's inbox. I don't know what the problem is but will try to diagnose it.]

My apologies for the late arrival of these notes from last Tuesday afternoon. At a couple of points, my hands became jittery from the discomfort of typing on my laptop keyboard, and I took notes on paper. When I searched for those paper notes this morning, I failed to locate them; the lacunae are marked explicitly. My apologies for this, too.

-CMSMcQ

[Tuesday 28 February 2006, Mardi Gras, afternoon session.]

[2:00 p.m.]

Henry Thompson spoke about the (unreleased) Markup Technology Pipeline Language. It has a simple GUI for building a pipeline of processes; built-in processes include processes to absolutize URIs, eliminate elements, filter elements, call arbitrary programs which map XML to XML, wrap subtrees in particular elements, check URIs, send and receive SOAP messages (synchronously), perform XInclude processing, execute an XSLT 1.0 transformation sheet.

One XML representation of the pipeline uses a language very similar to that of Sun's pipeline language. (It does add some built-in semantics, namely stdin and stdout, which are not present in the Sun pipeline language.) But those pipelines are not runnable; there is a compiled form which is much more highly articulated and somewhat less readable. The compiler itself is an eight-stage pipeline (not a separate set of Java classes).

Two interfaces are provided; event streams or full documents. The objects provided are the same where possible: a start-tag event looks like an element with no children.

"Viewports" are an important tool: select elements, pass them through part of the pipeline, and reassemble the original document, replacing the original elements by their images.

Examples: validate twice with surgery.
http://www.markup.co.uk/showcase/

HT identified as key points: (1) separation between UI, the language users see, and the language the pipeline processor actually executes; (2) a resource manager which is largely an optimization issue but does mean you can write straight-through pipelines with appeals to the resource manager which would otherwise require multiple inputs; (3) the two-level story allows a clean way to say component design is independent of push vs. pull or tree vs. event -- the compilation process and the runtime take care of mismatches.

Q: If you didn't have both push and pull, would you not need segments?

A: The end-viewport component has two inputs, so I need both.

[Some discussion lost here; apologies from scribe.]

Topic: iteration

NDW: does it suffice to say for v1 that you can iterate a pipeline over the sequence of documents, but cannot iterate to a fixed point, and cannot iterate for some fixed number of repetitions?

RT: how about an implicit iterator? A component that takes a single document, when presented with multiple docs, runs on each document and produces a sequence of docs. (At this point, someone says something under their breath about mapcar.)

MM: what is a "viewport"?

HT: it's a pipeline stage that allows you to identify a set of nodes, apply a process to each of them, and produce as a result document the original 'matrix document', with the results of the process substituted for the original selected elements.

We discussed alternate names for this kind of construct: peephole, subtree, bypass.

EB (responding to RT): I'm wary of putting too much emphasis on implicit aspects. If you want to iterate, use an explicit iteration construct.

Iteration can be handled either by a specific component, or by built-in language-level constructs. We discussed. We digressed into a discussion of the analysability of ad hoc constructs vs built-ins; macros vs special forms (fexprs).
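
[Illustration of the viewport construct as HT defines it above. This is a minimal sketch, not taken from the Markup Technology implementation. It assumes Python's standard xml.etree.ElementTree, assumes the selected elements are not nested inside one another, and uses hypothetical names (viewport, shout) chosen only for this example.]

```python
import copy
import xml.etree.ElementTree as ET


def viewport(doc, select, process):
    """Return a copy of doc in which every element matched by `select`
    has been replaced by process(element); the rest is left untouched."""
    result = copy.deepcopy(doc)  # keep the original 'matrix document' intact
    # ElementTree elements do not record their parent, so build a lookup.
    parent_of = {child: parent for parent in result.iter() for child in parent}
    for old in result.findall(select):
        new = process(old)
        new.tail = old.tail  # preserve text that followed the original element
        parent = parent_of[old]
        parent[list(parent).index(old)] = new  # substitute the image in place
    return result


def shout(elem):
    """Hypothetical sub-pipeline: upper-case the text of one element."""
    out = ET.Element(elem.tag, dict(elem.attrib))
    out.text = (elem.text or "").upper()
    return out


if __name__ == "__main__":
    matrix = ET.fromstring("<doc><p>keep</p><note>fix me</note><p>keep too</p></doc>")
    print(ET.tostring(viewport(matrix, ".//note", shout), encoding="unicode"))
```

[In this sketch the selection is an ElementTree path expression and the "process" is an ordinary function; a real pipeline language would presumably put a sub-pipeline in that position.]
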
NDW reiterated his proposal: in v1, iteration would be limited to iteration over a sequence of documents. (This is intended without prejudice to the presence or absence of viewport/subtree/peephole/bypass.)

HT: yes.

MSM: I thought we had requirements for iteration to a fixed point? Didn't we discuss them just the other day?

A: yes, we discussed iteration to a fixed point the other day. But no, it has not been accepted as a requirement, at least not as one we are committed to achieving.

There followed a discussion of what 'requirements' means. We have a set of requirements, some of which we will and some of which we won't meet. Or we have a set of candidate requirements, some of which we will accept as actual requirements and others of which we won't.

There was some concern over NDW's proposed restriction: AM doesn't want to lose viewports through inattention.

Non-use-case use case: the Atom feed which just gives you a bit at a time, with a link to Next bit, will be hard (impossible?) to handle without iteration to a fixed point or iteration under control of some Boolean condition.

[Coffee break here]

[When the scribe returned, discussion of parameters was under way. One point to record: if we use any one of the schemes proposed for dynamic parameters, it doesn't preclude the use of simple ways of specifying static parameters.]

NDW noted: We've discussed conditionals, iteration, resource managers vs pipes; what else is there near the top of people's lists?

AM: viewports?

EB: sub-pipelines? There are issues, maybe just issues of detail. But we haven't agreed on how they connect up. And we haven't agreed on a processing model. Backward, forward, other?

HT hypothesized that if we can come up with a way of describing the semantics of the pipeline language that doesn't require us to take a stand on the question of data flow model vs dependency model, it would be a win.

AM (another topic to discuss): XDM data model vs infosets.

Topic: infosets vs data models vs ...
The only problems AM has with XDM are (a) its weird treatment of invalid material and (b) its incomplete access to the PSVI.

HT agreed. Also, the XML Schema WG did its work by adding infoset properties; others may do the same.

MSM asked: how does that distinguish the infoset from XDM?

There followed a discussion (inconclusive) about whether XDM is closed.

[Discussion lost by scribe here, apologies ...]

Would it cause problems if we said that what gets passed around are XDMs? Technical problems? Political problems?

Some voices say Richard can't do what he wants in that case (i.e. the language cannot be implemented by using pipeline stages which each read and write XML serial syntax and run in Unix pipelines).

Richard was not convinced: if all you have are pre-defined components, and you have some restrictions on what is required, then you won't necessarily be able to tell the difference.

HT: what would the conformance clause say, if we wanted that? I can't see what people want.

[There followed an excursion into the formulation of conformance clauses.]

At 5:30, Norm provided a concluding summary and evaluation of the meeting and we adjourned.
Received on Tuesday, 14 March 2006 07:46:21 UTC