The core components are required to be implemented by all conforming processors.
The [p:]identity component makes a verbatim copy of its input available on its output port.
Table 1.1. Ports
Name | Type | Sequences? | Description |
---|---|---|---|
input | input | Yes | The sequence to be repeated as the output. |
result | output | Yes | A copy of the input. |
The use of 'input' as the port name here is probably not the best choice. However, since the identity component must be able to take a sequence of documents, the port name 'document' isn't appropriate either.
The [p:]xslt component applies the transformation supplied on the 'transform' port to the document provided on the 'document' port. It may produce a sequence of documents on its 'result' port.
The [p:]xinclude component applies XInclude processing semantics to the document. References are resolved against the base URI; the referenced documents are not provided as input to the component.
The [p:]serialize component applies XML serialization to the children of the document element and replaces those children with their serialization. The outcome is a single element with text content that represents the "escaped" syntax of the children if they were serialized.
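The serialization step described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `serialize_children` is hypothetical, the document is represented by its document element as an `xml.etree.ElementTree` element, and namespace prefixes are left to ElementTree's defaults.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def serialize_children(doc):
    """Replace the children of the document element with their serialization.

    Hypothetical helper, not part of any XProc implementation.
    """
    # Leading text is itself part of the serialized content, so escape it
    # the way a serializer would.
    parts = [escape(doc.text or "")]
    for child in list(doc):
        # ET.tostring includes the element's tail text, so text between
        # siblings is preserved.
        parts.append(ET.tostring(child, encoding="unicode"))
        doc.remove(child)
    # Stored as plain text; it is escaped again when the document is written.
    doc.text = "".join(parts)
    return doc


doc = serialize_children(ET.fromstring('<doc><a x="1">hi</a>tail<b/></doc>'))
print(ET.tostring(doc, encoding="unicode"))
```

The printed document has a single element whose text content is the escaped markup of the former children.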
The [p:]parse component takes the text value of the document element and parses it as if it were a Unicode character stream containing XML. The outcome is a single element whose children result from parsing the XML content. This is the reverse of the [p:]serialize component.
When the text value is parsed, a document element wrapper should be assumed so that element siblings can be parsed back into XML. Further, if the 'namespace' parameter is specified, the default namespace is declared on that wrapper element. If a wrapper element name is specified, it is not returned in the result.
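A minimal sketch of the wrapper technique described above, again using `xml.etree.ElementTree`. The function name `parse_text_content` and the wrapper element name `wrapper` are assumptions made for illustration; the text does not prescribe either.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def parse_text_content(doc, namespace=None):
    """Parse the text value of the document element back into child elements.

    Hypothetical helper; 'wrapper' is an arbitrary, assumed element name.
    """
    text = doc.text or ""
    # Declare the default namespace on the wrapper when one is supplied.
    ns_decl = f" xmlns={quoteattr(namespace)}" if namespace else ""
    # The wrapper lets several sibling elements parse as one document.
    wrapper = ET.fromstring(f"<wrapper{ns_decl}>{text}</wrapper>")
    # Only the wrapper's content is kept; the wrapper itself is discarded.
    doc.text = wrapper.text
    doc.extend(list(wrapper))
    return doc


source = ET.fromstring("<doc>&lt;a&gt;hi&lt;/a&gt;&lt;b/&gt;</doc>")
parsed = parse_text_content(source)
print([child.tag for child in parsed])
```

Passing `namespace="http://example.com/ns"` (an illustrative URI) places the parsed elements in that namespace, because the declaration on the wrapper is in scope while the text is parsed.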
If the 'content-type' parameter is specified, an implementation can use a different parser to produce XML content. Such behavior is implementation-defined. For example, for the MIME type 'text/html', an implementation might provide an HTML to XHTML parser (e.g. Tidy).
The [p:]load component has no inputs but takes a parameter that specifies a URI of an XML resource that should be loaded and provided as the result.
The [p:]store component stores a serialized version of its input to a URI. The URI is either specified explicitly by the 'href' parameter or implicitly by the base URI of the document. This component has no output.
Should this component allow sequences on its input? The load component can't load sequences, but it seems rather harmless to allow sequences.
The [p:]xquery component applies a query to a collection and provides the result of the query as its output.
The [p:]xinclude-from-collection component applies XInclude processing semantics to the document. References are resolved against the base URI, and the referenced documents are expected to be in the sequence of documents provided on the 'collection' input port. It is an error if no matching document can be found in the collection for any included resource.
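The lookup described above can be sketched as follows. Representing the collection as a dictionary keyed by each document's absolute URI is an assumption made for this example, as are the function name and the choice of `LookupError` for the error case.

```python
from urllib.parse import urljoin


def resolve_from_collection(base_uri, href, collection):
    """Find an included resource in the collection instead of fetching it.

    Hypothetical helper; `collection` maps absolute URIs to documents.
    """
    # Resolve the reference against the base URI of the including document.
    target = urljoin(base_uri, href)
    if target not in collection:
        # A missing document is an error, not a trigger for retrieval.
        raise LookupError(f"no document in collection for {target}")
    return collection[target]


collection = {"http://example.com/docs/parts/a.xml": "<a/>"}
print(resolve_from_collection("http://example.com/docs/main.xml",
                              "parts/a.xml", collection))
```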
The [p:]validate component performs simple schema validation given a document and a set of schema documents. The optional 'mode' parameter can be used to pass initial state parameters such as 'lax' or 'strict' to XML Schema.
If the 'assert' parameter value is 'true' or not present, the component will throw an error if the document fails to pass validation.
Table 1.13. Ports
Name | Type | Sequences? | Description |
---|---|---|---|
document | input | No | The document to be validated. |
schema | input | Yes | The schema to use for validation. The grammar must be understood by the implementation. If a sequence of documents is provided, they are considered a set of schema documents that may reference each other. |
result | output | No | The result. |
The [p:]validate-from-context component performs schema validation given a document and a description of a schema context. The schema context description is a document that details the namespaces and locations of the schema documents known to the validation. The optional 'mode' parameter can be used to pass initial state parameters such as 'lax' or 'strict' to XML Schema.
If the 'assert' parameter value is 'true' or not present, the component will throw an error if the document fails to pass validation.
The [p:]url-action component acts as a filter, much like the xinclude component, except that it acts upon [c:]url-action elements. Each of these elements represents an interaction with a URL via a method.
The result of processing is the input document where the [c:]url-action elements are replaced with the results of acting upon the resource with the method specified. The results are expected to be in XML. If they are not, implementations must follow these rules:
Any XML mime type must be parsed as XML.
Any non-XML text mime type must result in character data.
Any binary mime-type must result in a [c:]data element with a content-type attribute containing the mime-type and base64 encoded content.
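The three rules above amount to a dispatch on the response's MIME type, sketched below. Several details are assumptions for illustration only: the namespace URI bound to `c:` is a placeholder, the test for what counts as an XML type (`text/xml`, `application/xml`, or a `+xml` suffix) is one plausible reading, and the function names are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# Placeholder namespace for the c: prefix; the real URI is not given here.
C_NS = "http://example.com/ns/components"


def is_xml_type(mime):
    # Assumed definition of "XML mime type".
    return mime in ("text/xml", "application/xml") or mime.endswith("+xml")


def response_to_content(content_type, body, charset="utf-8"):
    """Turn a response body (bytes) into XML, character data, or c:data."""
    mime = content_type.split(";")[0].strip().lower()
    if is_xml_type(mime):
        return ET.fromstring(body)          # rule 1: parse as XML
    if mime.startswith("text/"):
        return body.decode(charset)         # rule 2: character data
    data = ET.Element(f"{{{C_NS}}}data", {"content-type": mime})
    data.text = base64.b64encode(body).decode("ascii")  # rule 3: base64
    return data


print(response_to_content("text/plain", b"hello"))
```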
Requests are expressed as elements with the following structure:
<c:url-action method="{verb}" content-type="{mime-type}" status-only="{true|false}" override-type="{mime-type}"> ... </c:url-action>
If the method allows data to be sent, the serialized children will be sent as the entity body of the message. Also, in some instances (e.g. HTTP GET), the 'content-type' attribute may not be used. Its presence is not an error even if it is not used.
In certain situations, the MIME type reported for the received data may not be appropriate. For such cases, the 'override-type' attribute provides a means for the pipeline author to specify the MIME type to be used regardless of what is returned.
For example, to POST form data to a web service, the following element could be formulated:
<c:url-action method="post" content-type="application/x-www-form-urlencoded"> name=xproc&amp;version=1.0 </c:url-action>
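As a hedged illustration of how such a request element might be produced, the sketch below builds the form body with the standard library and escapes it for use as XML character data. The function name is hypothetical, and the result is a fragment in which the `c:` prefix is left unbound.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape


def form_post_action(fields):
    """Build a c:url-action fragment that POSTs URL-encoded form fields."""
    body = urlencode(fields)  # e.g. name=xproc&version=1.0
    # The ampersand separating the fields must be escaped inside XML text.
    return ('<c:url-action method="post" '
            'content-type="application/x-www-form-urlencoded">'
            + escape(body) + '</c:url-action>')


print(form_post_action({"name": "xproc", "version": "1.0"}))
```

When the element is parsed, the escaped ampersand is restored, so the entity body sent to the service is the plain URL-encoded string.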
The [p:]join-documents component combines a set of input ports into one result with a document element whose name is specified by the parameter 'name'. This component may have any number of input ports.
The [p:]join-sequences component combines a set of input ports that may contain sequences into one sequence. This component may have any number of input ports.
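The two join components differ only in whether the combined documents are wrapped. A minimal sketch, assuming each port is a list of document elements and that ports are combined in the order given (the text does not state an order):

```python
import xml.etree.ElementTree as ET
from itertools import chain


def join_sequences(*ports):
    """Concatenate the sequences from all input ports into one sequence."""
    return list(chain.from_iterable(ports))


def join_documents(name, *ports):
    """Place every input document under a new document element `name`."""
    root = ET.Element(name)
    root.extend(join_sequences(*ports))
    return root


first = [ET.fromstring("<a/>"), ET.fromstring("<b/>")]
second = [ET.fromstring("<c/>")]
print(ET.tostring(join_documents("all", first, second), encoding="unicode"))
```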
The [p:]match-sequence component subsets a sequence of documents based on an XPath expression. Any document in the sequence for which the boolean value conversion of the expression evaluated against the root context is true will remain in the sequence as part of the component's output.
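The filtering rule can be sketched with ElementTree's limited XPath support. This is a simplification: only path expressions in ElementTree's subset are handled, the boolean value is taken to be "the path selects at least one node", and a temporary `holder` element (a hypothetical name) stands in for the document node so that the path starts at the root context.

```python
import xml.etree.ElementTree as ET


def match_sequence(documents, expression):
    """Keep the documents for which the expression selects something."""
    kept = []
    for doc in documents:
        # Stand-in for the document node: the document element is its child.
        holder = ET.Element("holder")
        holder.append(doc)
        # A non-empty node-set converts to boolean true.
        if holder.findall(expression):
            kept.append(doc)
    return kept


orders = [ET.fromstring('<order status="open"/>'),
          ET.fromstring('<order status="closed"/>')]
print(len(match_sequence(orders, "order[@status='open']")))
```

A full implementation would evaluate arbitrary XPath expressions, including ones that yield numbers or strings, and apply the standard boolean conversion to the result.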
The [p:]rename component renames elements or attributes in a document based on parameter values.
The [p:]wrap component wraps the document element with a new document element.
The [p:]insert component inserts the document provided on the 'insertion' port as a child of the document element of the document provided on the 'document' port. The position of the insertion is governed by the parameters.
The [p:]set-attributes component sets attribute values on the document element using the attribute values provided on the document element of the 'attributes' port's document.
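The wrap, insert, and set-attributes components are all small edits to the document element, sketched below with `xml.etree.ElementTree`. The function names are hypothetical, and treating the insert position as a boolean (first child versus last child) is an assumption about how the position parameter behaves.

```python
import xml.etree.ElementTree as ET


def wrap(doc, name):
    """Return a new document element `name` containing the old one."""
    wrapper = ET.Element(name)
    wrapper.append(doc)
    return wrapper


def insert(doc, insertion, at_start):
    """Add `insertion` as the first or last child of the document element."""
    if at_start:
        doc.insert(0, insertion)  # before the first child element
    else:
        doc.append(insertion)
    return doc


def set_attributes(doc, attributes_doc):
    """Copy every attribute of the attributes document's element onto doc."""
    for key, value in attributes_doc.attrib.items():
        doc.set(key, value)
    return doc


doc = ET.fromstring("<doc><a/></doc>")
insert(doc, ET.Element("first"), at_start=True)
set_attributes(doc, ET.fromstring('<attrs id="x"/>'))
print(ET.tostring(wrap(doc, "outer"), encoding="unicode"))
```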
<p:pipeline-library name="standard" xmlns:p="http://www.w3.org/2006/XProc">
  <p:declare-step type="p:identity">
    <p:input port="input" sequence="yes"/>
    <p:output port="result" sequence="yes"/>
  </p:declare-step>
  <p:declare-step type="p:xslt">
    <p:input port="document" sequence="no"/>
    <p:input port="transform" sequence="no"/>
    <p:output port="result" sequence="yes"/>
    <p:parameter name="*"/>
  </p:declare-step>
  <p:declare-step type="p:xinclude">
    <p:input port="document" sequence="no"/>
    <p:output port="result" sequence="no"/>
  </p:declare-step>
  <p:declare-step type="p:serialize">
    <p:input port="document" sequence="no"/>
    <p:output port="result" sequence="no"/>
  </p:declare-step>
  <p:declare-step type="p:parse">
    <p:input port="document" sequence="no"/>
    <p:output port="result" sequence="no"/>
    <p:parameter name="namespace" required="no"/>
    <p:parameter name="content-type" required="no"/>
  </p:declare-step>
  <p:declare-step type="p:load">
    <p:output port="result"/>
    <p:parameter name="href" required="yes"/>
  </p:declare-step>
  <p:declare-step type="p:store">
    <p:input port="document"/>
    <p:parameter name="href" required="no"/>
  </p:declare-step>
  <p:declare-step type="p:xquery">
    <p:input port="collection"/>
    <p:input port="query"/>
    <p:output port="result"/>
    <p:parameter name="*"/>
  </p:declare-step>
  <p:declare-step type="p:xinclude-from-collection">
    <p:input port="document" sequence="no"/>
    <p:input port="collection" sequence="yes"/>
    <p:output port="result" sequence="no"/>
  </p:declare-step>
  <p:declare-step type="p:join-documents">
    <p:input port="*" sequence="yes"/>
    <p:output port="result"/>
    <p:parameter name="name" required="yes"/>
  </p:declare-step>
  <p:declare-step type="p:join-sequences">
    <p:input port="*" sequence="yes"/>
    <p:output port="result" sequence="yes"/>
  </p:declare-step>
  <p:declare-step type="p:match-sequence">
    <p:input port="sequence" sequence="yes"/>
    <p:output port="result" sequence="yes"/>
    <p:parameter name="expression" required="yes"/>
  </p:declare-step>
  <p:declare-step type="p:validate">
    <p:input port="document"/>
    <p:input port="schema" sequence="yes"/>
    <p:output port="result"/>
    <p:parameter name="assert"/>
    <p:parameter name="mode"/>
  </p:declare-step>
  <p:declare-step type="p:validate-from-context">
    <p:input port="document"/>
    <p:input port="context"/>
    <p:output port="result"/>
    <p:parameter name="assert"/>
    <p:parameter name="mode"/>
  </p:declare-step>
  <p:declare-step type="p:url">
    <p:input port="document"/>
    <p:output port="result"/>
  </p:declare-step>
  <p:declare-step type="p:rename">
    <p:input port="document"/>
    <p:output port="result"/>
    <p:parameter name="select" required="no"/>
    <p:parameter name="name" required="yes"/>
  </p:declare-step>
  <p:declare-step type="p:wrap">
    <p:input port="document"/>
    <p:output port="result"/>
    <p:parameter name="name" required="yes"/>
  </p:declare-step>
  <p:declare-step type="p:insert">
    <p:input port="document"/>
    <p:input port="insertion"/>
    <p:output port="result"/>
    <p:parameter name="at-start" required="yes"/>
  </p:declare-step>
  <p:declare-step type="p:set-attributes">
    <p:input port="document"/>
    <p:input port="attributes"/>
    <p:output port="result"/>
  </p:declare-step>
</p:pipeline-library>