- From: Norman Walsh <ndw@nwalsh.com>
- Date: Thu, 26 Apr 2012 11:12:03 -0400
- To: public-xml-processing-model-wg@w3.org
- Message-ID: <m2aa1y5xz0.fsf@nwalsh.com>
See http://www.w3.org/XML/XProc/2012/04/26-minutes
[1]W3C
- DRAFT -
XML Processing Model WG
Meeting 214, 26 Apr 2012
[2]Agenda
See also: [3]IRC log
Attendees
Present
Norm, Henry, Vojtech, Murray, Alex
Regrets
Jim
Chair
Norm
Scribe
Norm
Contents
* [4]Topics
1. [5]Accept this agenda?
2. [6]Accept minutes from the previous meeting?
3. [7]Next meeting: telcon, 10 May 2012, skip 3 May
4. [8]Review of open action items
5. [9]p:zip and p:unzip
6. [10]Debugging strategies
7. [11]Clustering
8. [12]Streaming and parallel processing
* [13]Summary of Action Items
--------------------------------------------------------------------------
Accept this agenda?
-> [14]http://www.w3.org/XML/XProc/2012/04/26-agenda
Accepted.
Accept minutes from the previous meeting?
-> [15]http://www.w3.org/XML/XProc/2012/04/19-minutes
Accepted.
Next meeting: telcon, 10 May 2012, skip 3 May
Accepted.
Review of open action items
A-213-10: Completed
A-213-11 to A-213-14: Completed
A-213-15: Completed.
A-213-15 - A-213-18: Completed.
Vojtech: For XSLT 1.0, the only option is to write to a file; we're
explicit about not having documents appear on the secondary output port.
A-213-09: Completed.
p:zip and p:unzip
Norm: I move we postpone this until Jim can be present.
Accepted.
Debugging strategies
Murray: I'm trying to figure out two things: what sorts of mechanisms
would be useful in the language to assist with debugging, and what sort of
things are already there?
... There's p:log, two implementations have a "message" step, but I'm
wondering about the possibility of other kinds of steps.
... I've had this discussion in the past with C programmers and now I'm
talking with XProc programmers.
... I put some steps in the requirements document: one to turn on
debugging, one to turn on tracing. I highlighted that there are some
functions that can give you information about your environment.
... I wonder about strategies ... logs, etc.
... It seems like those are the sorts of things you might expose.
Norm: There are two things you can do: get a dump of the graph and get
more verbose logging.
Norm waxes poetic about -D and Java logging.
Vojtech: We have something similar. We have profiling output. And also we
have a detailed trace of the pipeline: what documents were passed, what
were the options and variables, etc.
... I wonder if we should try to standardize this.
... In other specifications, such as XQuery, there's a trace function but
the rest is implementation defined or dependent.
... In my view, we have p:log which is rather inflexible and you can have
a message step. But the problem with this is that it requires you to
modify the pipeline and potentially break the sequence of steps. Sometimes
you have to do ugly plumbing to keep the original sequence.
... Maybe what we could consider is some sort of construct like group or a
wrapper that would log some information without having to add pipe
bindings to keep the pipeline in the original sequence. A construct that
doesn't influence the connections between the steps would be nice.
... Instead of a message step. Or we could have both. We could have a
trace element that wraps a bunch of steps and does logging, but it
wouldn't be a step.
Norm: Yes, we could invent a new kind of thing, but I wonder if this is so
implementation dependent that it's of limited value.
Vojtech: Like p:log, I think we could leave the details implementation
dependent. The trace wrapper might be something like the resource manager
that we discussed in the past.
Norm: Yes, if we invent a new kind of wrapper for the resource manager,
maybe we could leverage that for the trace wrapper.
Henry: Oh, I'd rather not. You really shouldn't have to edit your pipeline
to do this.
... Maybe a wrapper is the best we're going to come up with. For something
like the resource manager, a wrapper is more appealing because it's a
feature of the design of the pipeline. Whereas, tracing and profiling are
not part of the pipeline.
... So I'd rather not have something in the pipeline.
... We can't just leave this to implementors, the way the python or lisp
debuggers do, because you can't implement XProc in XProc.
... A different way to talk about this in the same spirit would be to say
that we already have ways to name things. Maybe we want to think about
this in a sort of meta way: we want to think about ways of annotating
pipelines, externally even, in order to describe tracing or profiling
behavior.
... We could have a trace descriptor and a pipeline.
Alex: This could be done if you had a description of the binding for the
pipeline.
... That would require the ability to point at a chunk of pipeline not
individual steps.
Vojtech: We could do something similar to XQuery 3 with annotations.
Norm: It seems like some sort of "trace only these named steps" feature
might be useful.
Henry: I know that there was at least some work in actually doing just
what I dismissed: as far as the engine is concerned all you can say is
instrument yourself. Where you put the energy is in the tool that presents
the output to you. So instead of trying to say only give me trace
information for the last four steps, you just turn on tracing.
... Then the tool shows you the output for only the last four steps.
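A minimal sketch of the approach Henry describes, in which the engine traces everything and a separate tool narrows the view afterwards. The `TraceEvent` record and the `last_steps` helper are hypothetical names invented for illustration; no XProc processor is known to emit this format.

```python
from dataclasses import dataclass


@dataclass
class TraceEvent:
    """One record emitted by a fully instrumented run (hypothetical format)."""
    step: str      # name of the step that produced the event
    message: str   # free-form trace text


def last_steps(events, count):
    """Return only the events belonging to the last `count` distinct steps."""
    ordered = []                      # step names in order of first appearance
    for event in events:
        if event.step not in ordered:
            ordered.append(event.step)
    # Guard against count == 0, where ordered[-0:] would select everything.
    wanted = set(ordered[-count:]) if count > 0 else set()
    return [event for event in events if event.step in wanted]


trace = [
    TraceEvent("load", "read 1 document"),
    TraceEvent("validate", "schema ok"),
    TraceEvent("transform", "applied stylesheet"),
    TraceEvent("serialize", "wrote 1 document"),
]

# The engine traced everything; the viewer shows only the last two steps.
for event in last_steps(trace, 2):
    print(f"{event.step}: {event.message}")
```

The pipeline itself carries no tracing markup in this arrangement; all of the selection logic lives in the viewer.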
Norm: Yeah. Fair point. My tracing is all ad hoc.
Murray: So we could imagine an XProc pipeline that read the trace output
and presented it in a nice way.
... I've heard the argument before for putting all the tracing outside the
program. I've heard the same argument about documentation too.
... One of the things I've noticed as I'm gathering these requirements is
a section called "Integration".
... A lot of these requirements in the areas of debugging and testing and
error handling are related to integration. All of these things can be
aided by leaving sign posts in your program. If you know that you're
having a problem in a certain area of the program, then leaving the
indicators in there and being able to flag the pipeline could be very
helpful.
<ht> Hmm -- I absolutely agree that documentation is an integral part of a
program or pipeline
Murray: You can run your pipeline 24 hours a day and diff the traces, look
for differences, etc. This just seems useful from a Q/A audit perspective.
Alex: My question is, can I write a pipeline that's normal and reasonably
minimal and still debug the thing?
... Could I profile, debug, etc. without having to touch the pipeline?
Norm: I think with an appropriate debugging environment you could.
Vojtech: Yes, but some steps are in libraries that can have the same
names, etc.
Murray: I don't care what anybody does with respect to designing a
debugger that can look into an XProc program and debug it. More power to
them. But that's not what I want to discuss. We're talking about
requirements for the language.
Henry: I hear you; the way I hear this conversation going so far is that
nobody has come up with any.
Murray: No. Several people have made suggestions, but we keep coming back
to "I want to do this from outside my pipeline"
Henry: Putting things in the language requires that implementors support
them. I think the argument that I would make isn't that my program is
sacred, but rather are we sure enough of the value of in-language support
that we want to require everyone to do the work that's necessary.
... It's the cost-benefit analysis that comes first.
Murray: Here's a simple question: if a processor has the ability to turn
on trace, then providing some markup that advises that processor that this
is a good time to turn on trace, would be useful. And if the processor
can't turn on trace, then it's harmless.
... I don't want to specify what comes out in the trace, though we might
want to give some advice, but that's up to the processor.
Alex: I guess the conundrum as I see it is that we don't have any
debuggers yet. And we have very minimal tracing and debugging support.
... I suspect there are things we should do but I don't think I know what
they are.
Murray: Well, Norm said he output trace information...
Alex: Yes, but that's very primitive compared to other languages. Do we
have the right naming conventions, for example?
Murray: We decided, early on, that there would be a "stderr" port. Could
we not designate a port for trace output?
... I just want to look for some things that would make the language
easier to debug.
Vojtech: We already have p:log, but it's very primitive. Maybe we should
just make p:log more flexible and useful; allowing it in options,
variables, input ports, etc. Then with a processor switch, you could
enable the log statements you wanted to trace.
... It could wind up in one location. Maybe we don't have to add anything
new, just improve existing features? Maybe we could imagine a switch to
magically insert p:log statements everywhere. The advantage of the log is
that it doesn't change the sequencing of steps.
Norm: We could do that. The only thing that occurs most obviously to me
would be a standard message step.
Vojtech: It's definitely useful, but it's tedious to add 10 of them.
Norm: Yes, it's tricky, but is still perhaps useful enough to standardize.
Vojtech: Maybe with a switch to disable the output.
Henry: Yes, I think that might be worth looking at standardizing that.
Maybe we could add classes so that you can enable them or disable them by
name. It would be nice to be able to turn them off without having to edit
them out.
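A rough sketch of the "classes" idea, assuming each message carries a class name and the processor is told which classes to emit. `MessageLog`, `message`, and the class names are hypothetical; nothing here reflects an agreed design or an existing message step.

```python
class MessageLog:
    """Collects messages, each tagged with a class name (hypothetical design)."""

    def __init__(self, enabled=()):
        self.enabled = set(enabled)   # classes currently switched on
        self.lines = []               # messages that were actually emitted

    def message(self, cls, text):
        """Record `text` only if its class is enabled; otherwise do nothing."""
        if cls in self.enabled:
            self.lines.append(f"[{cls}] {text}")


def run_pipeline(log):
    """Stand-in for a pipeline whose message calls are never edited out."""
    log.message("progress", "starting")
    log.message("debug", "intermediate document has 3 children")
    log.message("progress", "finished")


quiet = MessageLog()                        # every class off
verbose = MessageLog(enabled={"progress"})  # only one class on
run_pipeline(quiet)
run_pipeline(verbose)
print(verbose.lines)
```

The same pipeline runs in both cases; only the set of enabled classes, supplied from outside, differs.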
Alex: I'm looking at p:log. First a question: If I don't have an href or
if I use the same href, what happens?
Implementors mumble a bit
Alex: It would be nice if there was some metadata on the output so that I
could reconstruct what happened later. A notion of what port this was
produced from, when it arrived, etc.
... Similarly, it might be nice to log inputs.
Vojtech: Absolutely.
Alex: It would be nice to be able to put assertions inside the p:log step.
... Is this XPath expression true?
Vojtech: The ability to construct a message with an XPath expression would
be useful.
Alex: Those are the sorts of things that would be useful.
... You could have one big log file with all the data in it; then you
could examine that output.
Murray: So one of the things we could consider is whether every step would
have a verbosity level and basically if you had high verbosity turned on,
then that step would report some things when it started.
... We could rationally talk about what those conditions might be.
... Speaking of which, I've listed a lot of functions in the use cases and
requirements document. It might be nice to have an exhaustive list.
Norm: Where's the list?
Murray: F.5.12
Norm: That's a mixture so I'm confused.
Murray: Yes, it's a mixture, but they return information about the current
context or environment.
... All of this is useful information that you can use in debugging. Years
ago, working in troff, I got some debugging built in. We had levels of
verbosity and I could set the warning/error etc. messages. I could print
messages at the beginnings of loops, I could turn trace on in the middle
of a loop, etc.
... I found this useful at the time.
Norm: Yes. I can see that.
... Of the things we've discussed today, I think the proposal to extend
p:log so that it can contain messages or assertions and the ability to log
inputs seems like the best combination of utility and low-hanging fruit.
<scribe> ACTION: Norm to sketch out an extension to p:log with messages
and assertions. [recorded in
[16]http://www.w3.org/2012/04/26-xproc-minutes.html#action01]
Clustering
Murray: Whose baby is clustering?
Norm: What do you mean by clustering?
Murray: Good question. I found an input along the lines of "does XProc
need clustering?"
Norm: In the doc?
Murray: Yes, F.3.3
Henry: Is this group-by?
Some discussion of where the requirement came from and what it means
<Vojtech>
[17]http://www.w3.org/wiki/index.php?title=Integration&diff=55046&oldid=55034
Streaming and parallel processing
Murray: Alex and I have noted some language along these lines in the first
requirements and use cases document that didn't make it into the spec.
... But it's never clear what streaming and parallel processing mean in
concrete terms.
... How have we impeded or assisted parallel processing?
Henry: Parallel processing is a little easier. What I think we meant is to
never constrain parallelization.
Henry: Make no assumptions about evaluation order that aren't required by
explicit connectivity.
Henry: The way I used to say it was: it ought to be possible to implement
an XProc processor by starting each step in a thread and waiting to see
what happens. Someone has input, everyone else is blocked, and each step
works as input arrives.
... For example, there's nothing today that says that the steps at the
bottom of a pipeline have to run after the ones at the top.
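A minimal sketch of the thread-per-step model Henry describes, assuming a purely linear pipeline in which each step is a plain function and connections are modelled as queues. `run_step`, `run_pipeline`, and the `DONE` sentinel are hypothetical names; this is an illustration of the idea, not how any XProc processor is known to work.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a document sequence


def run_step(func, inbox, outbox):
    """Block until input arrives, apply the step, and pass the result on."""
    while True:
        item = inbox.get()          # blocks while no input is available
        if item is DONE:
            outbox.put(DONE)        # propagate end-of-sequence downstream
            return
        outbox.put(func(item))


def run_pipeline(steps, documents):
    """Start every step in its own thread, then feed the first one."""
    queues = [queue.Queue() for _ in range(len(steps) + 1)]
    threads = [
        threading.Thread(target=run_step, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(steps)
    ]
    # Start the last step first: start order is irrelevant, only the
    # connections decide which step can make progress.
    for thread in reversed(threads):
        thread.start()
    for doc in documents:
        queues[0].put(doc)
    queues[0].put(DONE)
    results = []
    while True:
        item = queues[-1].get()
        if item is DONE:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results


result = run_pipeline([str.strip, str.upper], [" a ", " b "])
print(result)
```

Every step starts blocked except the one that has input, and evaluation order falls out of the explicit connections rather than the order in which the steps were written or started.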
Murray: for-each says the step must produce output in the right order.
Does that have an impact on parallelism?
Norm: On streaming more than parallel processing.
Alex: It might be nice to add annotations to a pipeline to say what the
streaming/parallelism expectations are.
Murray: I was puzzled by a request to allow for-each in an unordered way.
Henry: Yes, this connects up to unordered collections. Right now we have
sequences, but if we had collections, then you could have a switch on
p:for-each that said it was allowed to be unordered.
... Then the question is, what does a step that takes an unordered
collection as input look like?
<scribe> ACTION: Norm to put streaming/parallel processing on the agenda
for two weeks [recorded in
[18]http://www.w3.org/2012/04/26-xproc-minutes.html#action02]
Norm: Adjourned
Summary of Action Items
[NEW] ACTION: Norm to put streaming/parallel processing on the agenda for
two weeks [recorded in
[19]http://www.w3.org/2012/04/26-xproc-minutes.html#action02]
[NEW] ACTION: Norm to sketch out an extension to p:log with messages and
assertions. [recorded in
[20]http://www.w3.org/2012/04/26-xproc-minutes.html#action01]
[End of minutes]
--------------------------------------------------------------------------
Minutes formatted by David Booth's [21]scribe.perl version 1.136 ([22]CVS
log)
$Date: 2012/04/26 15:08:47 $
References
1. http://www.w3.org/
2. http://www.w3.org/XML/XProc/2012/04/26-agenda
3. http://www.w3.org/2012/04/26-xproc-irc
4. http://www.w3.org/XML/XProc/2012/04/26-minutes#agenda
5. http://www.w3.org/XML/XProc/2012/04/26-minutes#item01
6. http://www.w3.org/XML/XProc/2012/04/26-minutes#item02
7. http://www.w3.org/XML/XProc/2012/04/26-minutes#item03
8. http://www.w3.org/XML/XProc/2012/04/26-minutes#item04
9. http://www.w3.org/XML/XProc/2012/04/26-minutes#item05
10. http://www.w3.org/XML/XProc/2012/04/26-minutes#item06
11. http://www.w3.org/XML/XProc/2012/04/26-minutes#item07
12. http://www.w3.org/XML/XProc/2012/04/26-minutes#item08
13. http://www.w3.org/XML/XProc/2012/04/26-minutes#ActionSummary
14. http://www.w3.org/XML/XProc/2012/04/26-agenda
15. http://www.w3.org/XML/XProc/2012/04/19-minutes
16. http://www.w3.org/2012/04/26-xproc-minutes.html#action01
17. http://www.w3.org/wiki/index.php?title=Integration&diff=55046&oldid=55034
18. http://www.w3.org/2012/04/26-xproc-minutes.html#action02
19. http://www.w3.org/2012/04/26-xproc-minutes.html#action02
20. http://www.w3.org/2012/04/26-xproc-minutes.html#action01
21. http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
22. http://dev.w3.org/cvsweb/2002/scribe/
Received on Thursday, 26 April 2012 15:12:39 UTC