- From: Toman, Vojtech <vojtech.toman@emc.com>
- Date: Mon, 30 Sep 2013 05:51:36 -0400
- To: "public-xml-processing-model-comments@w3.org" <public-xml-processing-model-comments@w3.org>
I have seen people creating steps that have two (or more) input/output ports where one is used for the "main" data that is being processed and the others are used to access additional information about the individual documents. There is a 1-to-1 correspondence between the two, and this approach relies on the exact same order of the documents. If the order cannot be guaranteed, I think the proposed document metadata XProc V2 feature might help in some cases, but in general, I think that people who would want to implement any kind of pair-wise operation would be in trouble.

Ordering of connections is also important for parameters. Without a predictable order, you cannot rely on consistent parameter overriding. Again, this should go away if we replace/drop the current parameters in V2.

Relaxing the ordering would also have an impact on some of the standard steps, for instance p:pack, p:split-sequence (the "initial-only" option), p:wrap-sequence, or p:xquery (again, I have seen people who pass a fixed-order sequence of documents to the step: the initial context item is the primary data to query, and the others are auxiliary resources used by the query).

So I think I agree with Romain and Norm that having an option to indicate that the order does not matter is probably the most sensible way to go.

Regards,
Vojtech

--
Vojtech Toman
Consultant Software Engineer
EMC | Information Intelligence Group
vojtech.toman@emc.com
http://developer.emc.com/xmltech

> -----Original Message-----
> From: Norman Walsh [mailto:ndw@nwalsh.com]
> Sent: Sunday, September 29, 2013 11:49 AM
> To: Romain Deltour
> Cc: public-xml-processing-model-comments@w3.org
> Subject: Re: Threading and ordering
>
> Romain Deltour <rdeltour@gmail.com> writes:
> > That said, this is not a *strict* dependence, we could certainly find
> > a workaround if the XProc spec was to change. Another option would be
> > to keep the default behavior and add an option to explicitly declare
> > when the order doesn't matter, e.g. using an extra attribute on the
> > p:input and p:output ?
>
> Yes, that's about where I've come to in thinking about it. If the order
> matters, some (in the worst case, all but one) documents will have to
> be buffered so that they can be delivered in the right order.
>
> Giving pipeline authors a way to indicate that order doesn't matter
> will potentially make some pipelines consume less memory and run
> faster.
>
>                                         Be seeing you,
>                                           norm
>
> --
> Norman Walsh
> Lead Engineer
> MarkLogic Corporation
> Phone: +1 512 761 6676
> www.marklogic.com
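
The buffering cost Norm describes in the quoted message can be illustrated with a small sketch. This is not XProc and not taken from any processor; it is a hypothetical Python model in which each document carries its position in the sequence (an assumption made purely for illustration), documents may arrive out of order from parallel workers, and a reorder buffer holds early arrivals until their turn comes. The function name `reorder` and the `(index, doc)` tuples are invented for this example.

```python
import heapq
from typing import Iterable, List, Tuple


def reorder(arrivals: Iterable[Tuple[int, str]]) -> Tuple[List[str], int]:
    """Deliver documents in sequence order and report the peak buffer size.

    `arrivals` yields hypothetical (index, doc) pairs in arrival order,
    where `index` is the document's position in the ordered sequence
    (0-based, unique, no gaps).
    """
    pending: List[Tuple[int, str]] = []  # min-heap keyed on sequence index
    ordered: List[str] = []
    next_index = 0  # position of the next document that may be delivered
    peak = 0        # most documents held back at any one time

    for index, doc in arrivals:
        heapq.heappush(pending, (index, doc))
        # Release every document whose turn has come.
        while pending and pending[0][0] == next_index:
            ordered.append(heapq.heappop(pending)[1])
            next_index += 1
        peak = max(peak, len(pending))

    return ordered, peak


# Worst case: the first document arrives last, so all the others wait.
print(reorder([(1, "b"), (2, "c"), (3, "d"), (0, "a")]))
# In-order arrival needs no buffering at all.
print(reorder([(0, "a"), (1, "b"), (2, "c")]))
```

In this model the worst case holds back all but one document, which matches the bound stated in the quoted message. A step declared as order-insensitive could skip the buffer and pass each document along as soon as it arrives.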
Received on Monday, 30 September 2013 09:52:19 UTC