- From: Nikolay Fiykov <nikolay.fiykov@nsn.com>
- Date: Tue, 16 Oct 2007 18:22:20 +0300
- To: public-xml-processing-model-comments@w3.org
- Message-ID: <4714D72C.9020302@nsn.com>
Hi,

1) Definitions: "2. Pipeline concepts": the definition of subpipeline comes far too late, after the term has already been referenced in several other places. This was rather troubling for a first-time reader of the spec.

2) Editorial: Examples 1, 4 (and possibly others) feature steps named like actual p: tags ("pipeline"). Step inputs and outputs are all named "source" and "result". I found this rather confusing, at least until I had read almost the entire spec. Naming them "main", "xslt-input", etc. would save quite some confusion.

3) Definitions: The definitions of "containers" and "ancestors" are not very clear, given that "ancestor" is not defined at all.

4) Typo: "4.1 p:pipeline" --> "... when the it has ..."

5) Document model: in several places (p:for-each, p:viewport, etc.) the term "document node" is used. This suggests a DOM Document object, right? If so, how would pipelines be executed against large documents? If not, what exactly is to be understood?

6) Parallel subpipelines: As illustrated by the example for "p:for-each", I find it rather hard to trace the individual execution branches (linear executions), especially if I add a few more steps inside. Although I can use "p:group" or pipeline libraries, I think non-linear pipelines have to be governed by a special construct. Also, I cannot find anything in the spec about how the parallel branches will be executed: linearly or in parallel. This is critical for processing (large) streams of data, where many small steps are involved and the stream cannot be read multiple times (but only once). I have several (very important) use cases where a single input document would have to be processed by parallel pipelines and their results merged back together. For this, my current idea is to use XProc to govern the overall data flow and multiple XSLT steps (able to process in streaming mode) to perform the atomic operations. All this can be properly examined only if parallelism is explicitly present in the grammar. Finally, having a special "p:parallel" or similar construct would allow for a clearer and narrower interpretation of the spec.

7) "p:try/catch": Is there any particular reason why "p:finally" is not part of the construct? This is a well-known paradigm, and the missing "finally" is a bit confusing at first sight.

8) "p:serialize": I'd be happy to also see "exclude-prefixes" (after XSLT).

9) "p:pipe": Aren't there too many pipe names mentioned: pipeline, subpipeline, pipeline libraries, p:pipeline, p:pipeline-library, p:pipe? For example, "p:connect" or "p:bind" would do much better.

10) "p:document": Nothing is said about document stability. Are subsequent executions allowed to return different documents, or do they have to be guaranteed to be the same (like XSLT)?

11) I'd appreciate it if you published XML Schemas for the results of the following steps: p:count, p:directory-list, p:http-request, p:store

12) "p:directory-list": option name="path": "the value of the path must be an anyURI". Why is it not named "uriPath" then?

13) "p:directory-list": option filter: I have use cases where I'd need to do directory scanning for which a single RegExp alone would not be enough. Is there any possibility of having "includes" and "excludes" (from Ant) added, or used instead?

14) Namespace rename: "2. Each response header ... is translated into c:header element". Short or long notation?

15) XSLT 2.0: "If a sequence of documents is provided on the source port ...". It is not clear to me how sequences are to be handled exactly.

16) XSLT 2.0 and "p:parameter": passing documents is impossible (only strings). There are cases when the result of one transformation is needed in a second one. I'm not sure we need all the "overhead" of wrapping/unwrapping or similar to achieve that.

17) Steps evaluation: "A pipeline must behave as if it evaluated each step each time it occurs.": How can XSLT templates caching be achieved while remaining compliant with the spec? Without templates caching there will be a significant performance penalty when using pipelines. Document stability has a role in this subject too.

18) On the ability to process large documents:
- there are multiple places where XPath expressions are expected. These steps cannot be executed against large documents (without defining a subset of XPath).
- there are multiple places where "node sets" or wrapping "document nodes" are required to be produced. These too cannot be executed against large documents.
- there are only a few steps (required and optional) which can (potentially) operate on large documents (and thus perhaps use object models based on SAX). There is no explanation of how the intermixing of steps with different underlying models is to be achieved.

19) "p:label-element": scheme="count-elements": Why is a valid XPath expression not allowed here? In my case I'd use generate-id() for adding missing id attributes.

BR,
Nikolay Fiykov
Received on Tuesday, 16 October 2007 15:31:10 UTC