XProc Minutes 12 Apr 2007 from Norman Walsh on 2007-04-12 (public-xml-processing-model-wg@w3.org from April 2007)

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Thu, 12 Apr 2007 12:14:33 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <87bqhtefiu.fsf@nwalsh.com>
See http://www.w3.org/XML/XProc/2007/04/12-minutes.html

Meeting: XML Processing Model WG
Date: 12 Apr 2007
Agenda: http://www.w3.org/XML/XProc/2007/04/12-agenda.html
Meeting number: 63, T-minus 29 weeks
Chair: Norm
Scribe: Norm
ScribeNick: Norm

Present: Norm, Paul, Richard, Henry, Rui, Alessandro, Alex, Mohamed

# Administrivia

Topic: Accept this agenda?
-> http://www.w3.org/XML/XProc/2007/04/12-agenda.html

Accepted.

Topic: Accept minutes from the previous meeting?
-> http://www.w3.org/XML/XProc/2007/04/05-minutes.html

Accepted.

Topic: Next meeting: telcon 19 Apr 2007

Regrets from Henry.

Topic: Monitoring the comments list
-> http://lists.w3.org/Archives/Public/public-xml-processing-model-comments/

Topic: Error handling

Norm: I think there are two things we need to address: the first is
the semantics of try/catch, the other is which components raise errors
and under what circumstances. Does anyone think there's more to it
than that?

No one suggests there is.

Norm: First, we say that try/catch catches dynamic errors. Second, we
say that the components that raise errors raise dynamic errors. Is
there more that we need to say?

Alex: The big cases I have to cover are ones where you do a load or
httpRequest and it fails, but you want to guarantee a result in the
pipeline. This would also catch XSLT dynamic errors.

Richard: What do we say about error messages?

Norm: T.B.D.

Richard: Some specifications, like XML Schema, say something about the
format of messages. It would be nice if we could capture that.

Alex: It would be nice to have a little vocabulary that's available to
the catch.

Norm: I was thinking of something very simple, like:

  <errors>
    <message type="QName">something</message>
    <error type="QName">something</error>
  </errors>

Mohamed: There's a standard for error messages, EARL. I like something
simple, but if we need something more complex, maybe we should look at
that.

Henry: Isn't that mostly about test suite reporting?

Mohamed: Yes, but it's not clear if they're going to attempt to take a
broader scope. DSDL's Schematron component also has a format for errors.

Henry: I think the thing we have to say is that it is in the nature
of the lack of constraint that we put on implementors that without
try/catch, you have no guarantees whatsoever about how far or how much
information will come out of a component that subsequently fails.

Henry: I think that nobody downstream of a try-catch will see anything
until after they are guaranteed that no errors occur.

Henry: What happens if there's an error in the catch?

Norm: The pipeline crashes and burns.

Henry: Then you can't say anything about the atomicity of try/catch.
Unless we buffer the catch as well, it could produce partial output.

Richard: If we didn't give that guarantee, you could still achieve
that result with a try/catch in the catch.

Norm: You'd have to nest them awfully deep...

Henry: I'm not trying to make a guarantee about success, just about
bounded consequences.

Henry: Straw man proposal: try/catch produces coherent output or
none. If the catch fails, it produces no output.
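
As an illustration of the straw man above, here is a minimal Python
sketch, not XProc syntax. The names try_catch, try_branch and
catch_branch are hypothetical, and documents are modelled as plain
strings. Both branches are fully buffered, so downstream sees either a
complete result or nothing at all.

```python
from typing import Callable, Iterable, List


def try_catch(try_branch: Callable[[], Iterable[str]],
              catch_branch: Callable[[Exception], Iterable[str]]) -> List[str]:
    """Run try_branch; if it fails, run catch_branch. Output is all-or-nothing."""
    try:
        # Buffer the whole try output so nothing leaks downstream
        # before we know the branch completed without error.
        return list(try_branch())
    except Exception as err:
        # Any partial try output is discarded here. The catch output is
        # buffered as well, so if the catch fails the exception
        # propagates and no output is produced.
        return list(catch_branch(err))
```

In this sketch a failure inside the catch simply propagates to the
caller, which corresponds to "the pipeline crashes and burns" but with
no partial output having escaped.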

Norm: It seems to me that buffering in the try is going to be deeper
than the catch.

Richard: Sure, that's a common case, but there are going to be cases
where the try is going to try getting something from one place and the
catch is going to try to get it from another.

Henry: We don't yet have, but maybe we ought to provide, an identity
component that's guaranteed to output only whole documents. It only
produces either complete output or fails.
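
A rough Python sketch of what such a component might look like,
assuming a document is passed as a single string. The name
strict_identity is hypothetical. It parses the whole input with the
standard library's ElementTree before returning anything.

```python
import xml.etree.ElementTree as ET


def strict_identity(document: str) -> str:
    """Return the document unchanged, but only after a full parse succeeds."""
    # ET.fromstring raises ET.ParseError if the input is not well-formed,
    # so no output is produced for a broken document.
    ET.fromstring(document)
    return document
```

The cost of this design is that the whole document is held in memory
before anything is passed on, which is the buffering trade-off
discussed above.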

Alex: Couldn't we do this with an attribute on the catch?

Henry: Maybe we want to think about this for a week, but I kind of
like this idea. I can use it where I don't care if the pipeline fails
but I want to check the well-formedness of the input.

Some discussion of the semantics of this component.

Some discussion of sequences.

You need to be able to know when a sequence ends.

Richard: Suppose we don't have any try/catches, we just have a
pipeline with one of these special identity components followed by an
XSLT component. In the case where there's something wrong with the
document, what happens?

Henry: I thought the pipeline engine was required to check
well-formedness on the connection. The engine itself is required to
halt and catch fire at the first invalid token from the step.

Richard: I'm not sure I like that.

Henry: You have to put an rxp step in between...

Henry: If one uses SAX on every input stream, then the requirement is
satisfied because the SAX inputs will HCF if they see an illformed
document.
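
To make the SAX point concrete, a small Python sketch using the
standard xml.sax module. The EventLog class and consume function are
hypothetical names. The parser raises on ill-formed input, but a
handler may already have received events by then, which is the
"no guarantees about how much comes out" problem mentioned earlier.

```python
import xml.sax


class EventLog(xml.sax.ContentHandler):
    """Records the names of the elements whose start tags were seen."""

    def __init__(self):
        super().__init__()
        self.started = []

    def startElement(self, name, attrs):
        self.started.append(name)


def consume(document: bytes):
    """Parse document; return (element names seen, error or None)."""
    log = EventLog()
    try:
        xml.sax.parseString(document, log)
        return log.started, None
    except xml.sax.SAXParseException as err:
        # The parse halted, but log.started may hold events that were
        # delivered before the error was detected.
        return log.started, err
```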

Richard: The consequence of this is that an illformed document isn't
going to get any particular kind of error. You won't be able to tell
if it's the substrate or the component that gets the error.

Henry: If I was compiling pipelines into SAX filters, what I said goes
out the window. There's nothing in the SAX filter story that stops
component 7 from producing an illformed sequence of events.

Richard: Suppose an illformed document is output but no one reads it?

Alex: We could say that it's only required to check on input.

Henry: As long as all components do a WF check on the way in, I think
we've satisfied this requirement.

Henry: What follows from sequencing and the discussion today is that
anyone who does anything interesting before getting their first input
or after producing their last output is asking for trouble.

Norm: What about components that don't consume all their input?

Henry: Are components obliged to consume all their input? I would have
said no.

Mohamed: How in this case do you verify that it's well-formed?

Henry: It amounts to saying that outputs of components like that are
like outputs of components not connected to anything, which isn't very
satisfying.

Alex: It's the underlying implementation that's responsible for
checking.

Henry: In the case where I'm streaming through a large document
looking for a key/value pair and when I find it I'm going to update a
database. I don't see why this component can't declare victory and
complete without ever reading the whole input document.
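
A sketch of this early-exit case in Python, using ElementTree's
iterparse. The element name "entry" and the attribute "key" are
invented for the example, as is the function name find_value. The
function returns as soon as the key is found and never reads the rest
of the stream, so ill-formedness later in the input may go unnoticed.

```python
import io
import xml.etree.ElementTree as ET


def find_value(stream, wanted_key):
    """Return the text of the first <entry key=wanted_key>, or None."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "entry" and elem.get("key") == wanted_key:
            # Declare victory: stop without consuming the remaining input.
            return elem.text
        # Drop content we no longer need to keep memory use low.
        elem.clear()
    return None


doc = b'<db><entry key="a">1</entry><entry key="b">2</entry></db>'
value = find_value(io.BytesIO(doc), "b")
```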

Mohamed: It's clear for me that this is the general case for for-each
and choose.

Alex: In a particular streaming implementation, you could stop. But in
some other implementation, you couldn't.

Some discussion of SAX/Xerces internals about aborting a parse part
way through.

Topic: Any other business?

None.

Mohamed: What about p:catalog?

Norm: I'll put caching/cataloging back on the agenda for next week.

Adjourned.
Received on Thursday, 12 April 2007 16:14:53 UTC