- From: Norman Walsh <ndw@nwalsh.com>
- Date: Thu, 26 Sep 2013 12:48:37 +0100
- To: public-xml-processing-model-wg@w3.org
- Message-ID: <m2vc1niz1m.fsf@nwalsh.com>
* XProc V.Next requirements
** MUST Simplify parameters
Experience with parameters in XProc 1.0 reveals that they are too complicated. They often cause user confusion and introduce syntactic complexity not justified by their function. XProc v2.0 must dramatically simplify parameters, perhaps simply removing parameter ports altogether without replacing them with a new mechanism of equivalent power (and complexity).
*** The possibility of dropping parameter ports altogether and not replacing them with a new mechanism is in the frame
** MUST Integrate non-XML documents into the pipeline flow
Experience has shown that real-world pipelines often involve non-XML documents. Several workarounds have been invented for special cases. The limitation that V1.0 can only pass XML between steps makes some pipelines difficult, if not impossible, to write. Providing the ability to allow non-XML documents to flow between steps opens up the possibility of writing simple pipelines to work with images, JSON, Turtle, EPUB, etc.
*** Consider what required steps do with non-XML documents
** MUST Align with XQuery/XSLT 3.0 specifications
Alignment with XQuery/XSLT 3.0 will keep features of XProc consistent with modern XML technologies: error handling, serialization options, XDM features, etc. In addition, support for XPath 1.0 no longer seems relevant; it adds complexity to the specification and is unlikely to be implemented today. XPath 1.0 support will be removed from XProc.
*** XDM and Serialization
*** Remove all support for XPath 1.0
*** Is our p:error step consistent with other languages?
** MUST Add explicit flow handling
There are many pipelines for which the flow analysis does not provide a convenient or predictable ordering of steps. Because some steps have side effects not manifest in the pipeline, it may be necessary to ensure a particular order. This facility is not supported by XProc 1.0, but is available in implementation-defined extensions.
XProc 2.0 will standardize this facility.
*** A "depends-on" attribute?
** MUST Allow arbitrary XDM values in variables, options, and parameters
XProc 1.0 restricts the values of variables, options, and parameters to strings. This has proven to be an inconvenient limitation. XProc 2.0 will allow variables, options, and parameters to have any XDM value insofar as possible. XProc 2.0 will also allow the required types of variables, options, and parameters to be specified.
** MUST Allow AVTs
The syntactic sugar that allows step options to be expressed concisely as attribute values on a step is foiled whenever the value of the option must be computed by the pipeline. Allowing those options to contain XSLT-style attribute value templates (AVTs) would simplify many pipelines. Additionally, allowing AVTs in other places, such as the href attribute on p:document, will be considered. XSLT 3.0 introduces a feature which allows expressions in curly braces to be evaluated in element content. This feature is similar to the facility provided by the p:template step. Extending XProc to support curly braces in a manner consistent with XSLT 3.0 will be considered.
*** Where?
**** In the syntactic shortcut form of option values
**** In a 'value' attribute on p:with-option, etc.?
**** In the 'href' attribute of p:document?
**** Support the XSLT 3.0 curly braces in element content?
** MUST Document backwards-incompatibilities in V.next pipelines
Backwards incompatibility is painful for users and will be avoided wherever possible. However, XProc 2.0 will introduce language features that are not backwards compatible with 1.0. The specification must document these incompatibilities.
*** We may decide to make non-backwards compatible changes
*** To what extent will 1.0 pipelines be 2.0 pipelines?
*** What will cause a 2.0 processor to run a 1.0 pipeline with different semantics?
*** How will V.next play with the 1.0 "forwards compatibility" rules?
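
To make the AVT requirement above concrete, here is a minimal Python sketch of attribute value template expansion. It is an illustration only, not a proposal for how a processor should implement it: the function name expand_avt is hypothetical, and the sketch assumes that the only expression form is a bare variable reference such as {$base}, whereas a real AVT may contain any XPath expression. Doubled braces stand for literal braces, as they do in XSLT.

```python
import re

# Matches an escaped "{{", an escaped "}}", or a "{$name}" variable reference.
# Assumption: variable references are the only expressions this sketch supports.
_AVT_TOKEN = re.compile(r"\{\{|\}\}|\{\$([A-Za-z_][\w.-]*)\}")


def expand_avt(template, variables):
    """Expand a simplified attribute value template against in-scope variables."""

    def substitute(match):
        token = match.group(0)
        if token == "{{":
            return "{"  # doubled braces are literal braces, as in XSLT AVTs
        if token == "}}":
            return "}"
        name = match.group(1)
        if name not in variables:
            raise ValueError("undeclared variable in AVT: $" + name)
        return str(variables[name])

    # Unmatched single braces are left untouched here; a real processor
    # would be expected to report them as errors.
    return _AVT_TOKEN.sub(substitute, template)


if __name__ == "__main__":
    # An option value computed from two variables, written inline.
    print(expand_avt("{$base}/{$file}", {"base": "out", "file": "a.xml"}))
```

With expansion of this kind, a computed option value can stay in the attribute shortcut form instead of requiring a separate p:with-option element with a select expression.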
** SHOULD Make editorial improvements
Implementation experience has demonstrated that there are areas of the specification that didn't get the balance right between precision for implementors and clarity for users, for example "non-step wrappers". The XProc 2.0 specification should attempt to resolve these problems without introducing inordinate complexity. The 1.0 specification also defines the p:pipeline element as a syntactic shortcut for a particular form of p:declare-step. While convenient in some circumstances, it has proven to be a source of some confusion especially among new users. XProc 2.0 may remove the p:pipeline element.
*** Remove the concept of "non-step wrapper"
*** Remove the p:pipeline element
** SHOULD Provide a way to associate arbitrary metadata with documents
Adding metadata to documents is a natural thing for pipelines to do, either for subsequent use by the pipeline or for eventual output. For example, the serialization options provided in an XSLT stylesheet could be carried forward to the eventual serialization of the result document by the pipeline. In XProc 1.0, there's no way to maintain that association. XProc 2.0 should support the ability to associate processor and user-defined metadata with documents.
*** Carrying serialization options forward
*** MIME types associated with documents
** SHOULD Support steps with a dynamic number of inputs and outputs
While most steps have a predetermined and static number of inputs and outputs, this is not universally the case. In XProc 1.0, a putative p:eval step which could run a dynamically constructed pipeline, for example, suffers from the limitation that the signature of the p:eval step usually differs from the signature of the evaluated pipeline. XProc 2.0 should provide a facility for supporting steps with a variable number of inputs and outputs.
*** Split, Join, NVDL, Eval, etc.
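
As a rough illustration of the metadata requirement above (carrying serialization options forward), the following Python sketch pairs each document with a property map that travels with it from step to step. Everything here is hypothetical: the Document class, the three stand-in steps, and the property names are assumptions made for the example and are not taken from any XProc specification or implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    """A document flowing between steps, paired with its metadata."""
    content: str
    properties: dict = field(default_factory=dict)

    def with_properties(self, **extra):
        # Return a new document; later values override earlier ones.
        return Document(self.content, {**self.properties, **extra})


def transform(doc):
    """Stand-in for an XSLT step that records the stylesheet's output settings."""
    result = doc.content.upper()  # placeholder for a real transformation
    return Document(result, doc.properties).with_properties(
        omit_xml_declaration=False,
        encoding="UTF-8",
        content_type="application/xml",
    )


def rename(doc):
    """Stand-in for a later step that knows nothing about serialization."""
    return Document(doc.content.replace("DOC", "ROOT"), doc.properties)


def serialize(doc):
    """Final step: honours the options that travelled with the document."""
    if doc.properties.get("omit_xml_declaration", True):
        return doc.content
    encoding = doc.properties.get("encoding", "UTF-8")
    return '<?xml version="1.0" encoding="%s"?>\n%s' % (encoding, doc.content)


if __name__ == "__main__":
    print(serialize(rename(transform(Document("<doc/>")))))
```

The point of the sketch is that the intermediate step passes the properties through untouched, so the options chosen at transformation time are still available when the result is finally serialized.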
** SHOULD Provide improved status information during pipeline execution
XProc 1.0 provides scant support for reporting the status of a pipeline and providing aid to users attempting to debug pipelines. Implementation-defined extensions have demonstrated that some additional facilities, such as a p:message step, would be an aid to users. XProc 2.0 will add some mechanism for reporting status messages and will consider adding additional steps and/or language features to aid in analysing the behavior of a running pipeline.
*** Support users attempting to debug pipeline errors
*** p:message?
*** p:message attribute on any step, using AVTs.
** SHOULD Provide a mechanism for importing user-defined functions
Experience with user-defined functions in XQuery and XSLT reveals that they can be a powerful addition to the language. Providing some feature that allowed users to extend the vocabulary of functions available in, for example, the test expressions on p:when elements would greatly simplify some pipelines. Such a mechanism might take the form of the ability to load extension functions defined in, for example, XQuery, or it might include adding the ability to define functions in XProc.
*** Defined in XQuery, XSLT, ... Python, Ruby, Scala, JavaScript, Perl?
*** p:function step that defines functions?
** SHOULD Enhance try/catch to catch specific error codes
Support for catching errors in XProc 1.0 is limited to a simple p:try/p:catch pair, which catches and handles all errors uniformly. To align XProc with modern languages, the try/catch mechanism will be extended to support catching specific errors, possibly with the addition of a "finally" construct.
*** Multiple catch statements for specific errors
*** p:finally?
** SHOULD Support a variety of syntactic simplifications
XProc 1.0 offers relatively few default behaviors, requiring instead that pipelines specify every construct fully.
User experience has demonstrated that this leads to very verbose pipelines and has been a constant source of complaint. XProc 2.0 will introduce a variety of syntactic simplifications as an aid to readability and usability, including but not limited to:
*** <p:pipe step="name"/> should bind to the primary output port of the step named 'name'. It is an error if there is no such primary output port.
*** <p:pipe port="secondary"/> should bind to the 'secondary' port of the step on which the default readable port occurs. It is an error if there is no such step.
*** <p:input port="portname" href="..."/> should be a shortcut for a document binding to the URI specified in href.
*** <p:input port="portname"/> should be a shortcut for an empty binding.
*** Allow p:inline to be optional
*** Allow curly brace expansion in p:inline (with an attribute to control whether or not that behavior is enabled)
*** Provide a select attribute to p:for-each/p:viewport
*** Change all steps with a single non-primary output to have a single primary output
**** What are the semantics of select on p:for-each?
*** Consider harmonizing p:viewport-source and p:iteration-source
*** Add an AVT 'value' attribute to options, parameters, variables
** SHOULD Write a primer
A new user introduction to XProc would aid adoption.
** SHOULD Consider using XDM everywhere
In addition to supporting XDM values in variables, options, and parameters, XProc 2.0 might allow XDM values in more places, such as allowing p:for-each to iterate over a sequence of strings or integers.
**** For example, selecting a sequence of strings with p:for-each
** SHOULD Consider dividing the spec into two parts
XProc 1.0 is a specification that consists of both the language definition and the inventory of required and optional steps.
Release management might be simplified by separating the language core from the vocabulary of steps and providing some sort of versioning strategy that allowed the vocabulary of steps to be revised more frequently. XProc 2.0 may be defined in more than one Rec-track specification document.
**** Implies new versioning strategy?
** SHOULD Consider additional steps and enhancements
The vocabulary of steps available in XProc is extensible. Users and implementors have developed additional steps, for example to support pipelines that produce EPUB documents or manipulate files on disk. It is worth considering which, if any, new steps should be elevated to the XProc namespace. The candidates include, but are not limited to:
*** p:zip
*** p:unzip
*** p:template (and XSLT 3.0 curly braces in element content)
*** p:in-scope-names
*** p:eval
*** Semantic web steps (p:sparql, p:rdfa, ...)?
*** Operating system steps (p:env, ...)?
*** File system steps (p:mkdir, p:copy, ...)?

Be seeing you,
  norm

--
Norman Walsh
Lead Engineer
MarkLogic Corporation
Phone: +1 512 761 6676
www.marklogic.com
Received on Thursday, 26 September 2013 11:49:17 UTC