Chapter 1. Standard Components

1. Core Components

The core components are required to be implemented by all conforming processors.

1.1. [p:]identity

The [p:]identity component makes a verbatim copy of its input available on its output port.

Table 1.1. Ports

Name    Type    Sequences?  Description
input   input   Yes         The sequence to be repeated as the output.
result  output  Yes         A copy of the input.

Note

Using 'input' as the port name here is probably not the best choice. However, since the identity component must be able to take a sequence of documents, the port name 'document' is not appropriate either.

1.2. [p:]xslt

The [p:]xslt component applies the transformation supplied on the 'transform' port to the document provided on the 'document' port. It may produce a sequence of documents on its 'result' port.

Table 1.2. Ports

Name       Type    Sequences?  Description
document   input   No          The document to be transformed.
transform  input   No          The XSLT transformation.
result     output  Yes         The result of the transformation.
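
As an illustration only, the following sketch shows what might arrive on and leave the three ports. The element names 'doc', 'title', and 'out' are invented for the example, and the stylesheet is ordinary XSLT 1.0; none of this is prescribed by the component definition.

```xml
<!-- 'document' port: the document to be transformed -->
<doc><title>Hello</title></doc>

<!-- 'transform' port: the XSLT transformation -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/doc">
    <out><xsl:value-of select="title"/></out>
  </xsl:template>
</xsl:stylesheet>

<!-- 'result' port: here a sequence containing a single document -->
<out>Hello</out>
```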

1.3. [p:]xinclude

The [p:]xinclude component applies XInclude processing semantics to the document. References to included documents are resolved against the base URI; the referenced documents are not provided as input to the component.

Table 1.3. Ports

Name      Type    Sequences?  Description
document  input   No          The document to have XInclude processing applied to it.
result    output  No          The result after XInclude processing.
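
A minimal sketch of the behavior, assuming a document whose base URI is http://example.org/book.xml and a sibling resource chapter1.xml (both URIs and all element names other than xi:include are hypothetical). The xml:base attribute in the result comes from the base URI fixup defined by XInclude itself, not from this component.

```xml
<!-- 'document' port; assumed base URI http://example.org/book.xml -->
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter1.xml"/>
</book>

<!-- assumed content of http://example.org/chapter1.xml,
     fetched by the component rather than supplied on a port -->
<chapter><title>One</title></chapter>

<!-- 'result' port -->
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <chapter xml:base="http://example.org/chapter1.xml"><title>One</title></chapter>
</book>
```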

1.4. [p:]serialize

The [p:]serialize component applies XML serialization to the children of the document element and replaces those children with their serialization. The outcome is a single element with text content that represents the "escaped" syntax of the children if they were serialized.

Table 1.4. Ports

Name      Type    Sequences?  Description
document  input   No          The input document.
result    output  No          The resulting document element containing the serialized children.
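
For example, under the description above, an input such as the following (element names are invented for illustration) would be expected to produce a document element whose only content is text.

```xml
<!-- 'document' port -->
<doc><para>Hello <em>world</em></para></doc>

<!-- 'result' port: the children of 'doc' replaced by their escaped serialization -->
<doc>&lt;para&gt;Hello &lt;em&gt;world&lt;/em&gt;&lt;/para&gt;</doc>
```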

1.5. [p:]parse

The [p:]parse component takes the text value of the document element and parses the content as if it were a Unicode character stream containing XML. The outcome is a single element with children from the parsing of the XML content. This is the reverse of the [p:]serialize component.

When the text value is parsed, a document element wrapper should be assumed so that element siblings can be parsed back into XML. Further, if the 'namespace' parameter is specified, the default namespace is declared on that wrapper element. If a wrapper element name is specified, it is not returned in the result.

If the 'content-type' parameter is specified, an implementation can use a different parser to produce XML content. Such behavior is implementation defined. For example, for the mime type 'text/html', an implementation might provide an HTML-to-XHTML parser (e.g. Tidy).

Table 1.5. Ports

Name      Type    Sequences?  Description
document  input   No          The input document.
result    output  No          The resulting document element containing the parsed children.

Table 1.6. Parameters

Name          Type       Required?  Description
namespace     xs:anyURI  No         A default namespace to be used in conjunction with the wrapper element.
content-type  xs:string  No         A mime type whose value and effect is implementation defined.
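
The following sketch illustrates one reading of the rules above. The element names and the namespace URI http://example.org/ns are hypothetical, and the second result shows an assumed interpretation of the 'namespace' parameter: because the default namespace is declared on the implied wrapper, the parsed elements end up in that namespace.

```xml
<!-- 'document' port: the text value holds two sibling elements -->
<doc>&lt;a/&gt;&lt;b/&gt;</doc>

<!-- 'result' port with no parameters: the siblings are parsed
     inside an implied wrapper, which is not returned -->
<doc><a/><b/></doc>

<!-- 'result' port with namespace="http://example.org/ns" (assumed reading) -->
<doc><a xmlns="http://example.org/ns"/><b xmlns="http://example.org/ns"/></doc>
```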

1.6. [p:]load

The [p:]load component has no inputs but takes a parameter that specifies a URI of an XML resource that should be loaded and provided as the result.

Table 1.7. Ports

Name    Type    Sequences?  Description
result  output  No          The document loaded.

Table 1.8. Parameters

Name  Type       Required?  Description
href  xs:anyURI  Yes        The location of the document to be loaded.

1.7. [p:]store

The [p:]store component stores a serialized version of its input to a URI. The URI is either specified explicitly by the 'href' parameter or implicitly by the base URI of the document. This component has no output.

Note

Should this component allow sequences on its input? The load component can't load sequences, but it seems rather harmless to allow sequences.

Table 1.9. Ports

Name      Type   Sequences?  Description
document  input  No          The document to store.

Table 1.10. Parameters

Name  Type       Required?  Description
href  xs:anyURI  No         The location to which to store the document.

2. Optional Components

2.1. [p:]xquery

The [p:]xquery component applies a query to a collection and provides the result of the query as its output.

Table 1.11. Ports

Name        Type    Sequences?  Description
collection  input   Yes         The collection of documents to query.
query       input   No          The source of the query.
result      output  No          The result of the query.

2.2. [p:]xinclude-from-collection

The [p:]xinclude-from-collection component applies XInclude processing semantics to the document. References to included documents are resolved against the base URI, and the referenced documents are expected to be in the sequence of documents provided on the 'collection' input port. It is an error if any included resource cannot be found in the collection.

Table 1.12. Ports

Name        Type    Sequences?  Description
document    input   No          The document to have XInclude processing applied to it.
collection  input   Yes         The set of documents to be referenced in the XInclude processing.
result      output  No          The result after XInclude processing.

2.3. [p:]validate

The [p:]validate component performs simple schema validation given a document and a set of schema documents. The optional 'mode' parameter can be used to pass initial state parameters like 'lax' or 'strict' to XML Schema.

If the 'assert' parameter value is 'true' or not present, the component will throw an error if the document fails to pass validation.

Table 1.13. Ports

Name      Type    Sequences?  Description
document  input   No          The document to be validated.
schema    input   Yes         The schema to use for validation. The grammar must be understood by the implementation. If a sequence of documents is provided, they are considered a set of schema documents that may reference each other.
result    output  No          The result.

Table 1.14. Parameters

Name    Type        Required?  Description
assert  xs:boolean  No         If the value is 'true', the component will fail if validation fails.
mode    xs:string   No         A schema language specific validation mode. For XML Schema, the appropriate values are 'strict' and 'lax'.

2.4. [p:]validate-from-context

The [p:]validate-from-context component performs schema validation given a document and a description of a schema context. The schema context description is a document that details the namespaces and locations of the schema documents known to the validation. The optional 'mode' parameter can be used to pass initial state parameters like 'lax' or 'strict' to XML Schema.

If the 'assert' parameter value is 'true' or not present, the component will throw an error if the document fails to pass validation.

Table 1.15. Ports

Name      Type    Sequences?  Description
document  input   No          The document to be validated.
context   input   No          A document that contains a map of namespaces to schema documents.
result    output  No          The result.

Table 1.16. Parameters

Name    Type        Required?  Description
assert  xs:boolean  No         If the value is 'true', the component will fail if validation fails.
mode    xs:string   No         A schema language specific validation mode. For XML Schema, the appropriate values are 'strict' and 'lax'.

2.5. [p:]url-action

The [p:]url-action component acts as a filter much like the [p:]xinclude component, except that it acts upon [c:]url-action elements. Each of these elements represents an interaction with a URL via a method.

The result of processing is the input document where the [c:]url-action elements are replaced with the results of acting upon the resource with the method specified. The results are expected to be in XML. If they are not, implementations must follow these rules:

  1. Any XML mime type must be parsed as XML.

  2. Any non-XML text mime type must result in character data.

  3. Any binary mime-type must result in a [c:]data element with a content-type attribute containing the mime-type and base64 encoded content.
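
As a sketch of rule 3, a response with the mime type 'image/png' might replace the [c:]url-action element with something like the following. The base64 text shown encodes only the eight-byte PNG signature and stands in for a real payload; the binding of the 'c' prefix is left out because it is not given in this section.

```xml
<!-- replacement for a c:url-action element whose response was binary -->
<c:data content-type="image/png">iVBORw0KGgo=</c:data>
```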

Requests are made by formulating an element structured as follows:

<c:url-action method="{verb}" 
              content-type="{mime-type}" 
              status-only="{true|false}" 
              override-type="{mime-type}">
...
</c:url-action>

If the method allows data to be sent, the serialized children will be sent as the entity body of the message. Also, in some instances (e.g. HTTP GET), the 'content-type' attribute may not be used. Its presence is not an error even if it is not used.

In certain situations, the mime type of the received data may not be appropriate. For such cases, the 'override-type' attribute provides a means for the pipeline author to specify the mime type to be used regardless of what is returned.

For example, to POST form data to a web service, the following element could be formulated:

<c:url-action method="post" content-type="application/x-www-form-urlencoded">
name=xproc&amp;version=1.0
</c:url-action>

Table 1.17. Ports

Name      Type    Sequences?  Description
document  input   No          The document containing the [c:]url-action elements to be processed.
result    output  No          The result after url-action processing.

2.6. [p:]join-documents

The [p:]join-documents component combines a set of input ports into one result with a document element whose name is specified by the parameter 'name'. This component may have any number of input ports.

Table 1.18. Ports

Name    Type    Sequences?  Description
result  output  No          The result after aggregating the inputs.

Table 1.19. Parameters

Name  Type      Required?  Description
name  xs:QName  Yes        The wrapper name.
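
A minimal sketch, assuming two input ports with the hypothetical names 'first' and 'second' and the parameter name="collection". The order in which the ports contribute children is not stated in this section; declaration order is assumed here.

```xml
<!-- 'first' port -->
<a>1</a>

<!-- 'second' port -->
<b>2</b>

<!-- 'result' port with name="collection" -->
<collection><a>1</a><b>2</b></collection>
```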

2.7. [p:]join-sequences

The [p:]join-sequences component combines a set of input ports that may contain sequences into one sequence. This component may have any number of input ports.

Table 1.20. Ports

Name    Type    Sequences?  Description
result  output  Yes         The combined sequence of documents.

2.8. [p:]match-sequence

The [p:]match-sequence component subsets a sequence of documents based on an XPath expression. Any document in the sequence for which the boolean value conversion of the expression evaluated against the root context is true will remain in the sequence as part of the component's output.

Table 1.21. Ports

Name    Type    Sequences?  Description
input   input   Yes         The sequence to subset.
result  output  Yes         The subset of the sequence.

Table 1.22. Parameters

Name   Type       Required?  Description
match  xs:string  Yes        The match pattern expression.
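
To illustrate the filtering rule, consider a sequence of three documents and the hypothetical parameter value match="/item[@keep='yes']". The expression selects a non-empty node set, and therefore converts to true, only for the first and third documents.

```xml
<!-- 'input' port: a sequence of three documents -->
<item keep="yes">1</item>
<item keep="no">2</item>
<item keep="yes">3</item>

<!-- 'result' port: the documents for which the expression is true -->
<item keep="yes">1</item>
<item keep="yes">3</item>
```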

3. Micro-Operations Components

3.1. [p:]rename

The [p:]rename component renames elements or attributes in a document based on parameter values.

Table 1.23. Ports

Name      Type    Sequences?  Description
document  input   No          The document containing items to be renamed.
result    output  No          The document with the items renamed.

Table 1.24. Parameters

Name    Type       Required?  Description
name    xs:QName   Yes        The new name.
select  xs:string  Yes        The XPath expression that selects the items to be renamed.
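
For instance, with the hypothetical parameter values select="//para" and name="p", the component would be expected to behave as sketched below.

```xml
<!-- 'document' port -->
<doc><para>One</para><note><para>Two</para></note></doc>

<!-- 'result' port: every selected 'para' element renamed to 'p' -->
<doc><p>One</p><note><p>Two</p></note></doc>
```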

3.2. [p:]wrap

The [p:]wrap component wraps the document element with a new document element.

Table 1.25. Ports

Name      Type    Sequences?  Description
document  input   No          The document to be wrapped.
result    output  No          The document after being wrapped.

Table 1.26. Parameters

Name  Type      Required?  Description
name  xs:QName  Yes        The wrapper name.
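
A small illustration, assuming the parameter name="envelope" (the element names are invented for the example):

```xml
<!-- 'document' port -->
<message>Hello</message>

<!-- 'result' port with name="envelope" -->
<envelope><message>Hello</message></envelope>
```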

3.3. [p:]insert

The [p:]insert component inserts the document specified on the 'insertion' port as a child of the document element provided on the 'document' port. The position of the insertion is governed by the parameters.

Table 1.27. Ports

Name       Type    Sequences?  Description
document   input   No          The document into which the insertion is performed.
insertion  input   No          The document to be inserted.
result     output  No          The combined result.

Table 1.28. Parameters

Name      Type        Required?  Description
at-start  xs:boolean  No         A true value indicates the insertion should be the first child of the document element. The default is true.
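
The default case can be sketched as follows, with invented element names. The behavior for at-start="false" is not spelled out above, so only the true case is shown.

```xml
<!-- 'document' port -->
<list><item>b</item></list>

<!-- 'insertion' port -->
<item>a</item>

<!-- 'result' port with at-start="true" (the default) -->
<list><item>a</item><item>b</item></list>
```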

3.4. [p:]set-attributes

The [p:]set-attributes component sets attribute values on the document element using the attribute values provided on the document element of the document on the 'attributes' port.

Table 1.29. Ports

Name        Type    Sequences?  Description
document    input   No          The document where the attributes are to be set.
attributes  input   No          The document containing the attributes.
result      output  No          The combined result.
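
As a hedged illustration with invented names, attributes are copied from the document element on the 'attributes' port to the document element on the 'document' port. The example avoids attributes that already exist on the target, since the handling of such conflicts is not described here.

```xml
<!-- 'document' port -->
<doc><para>Text</para></doc>

<!-- 'attributes' port: only the attributes of the document element are used -->
<settings lang="en" version="1"/>

<!-- 'result' port -->
<doc lang="en" version="1"><para>Text</para></doc>
```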

4. Component Definitions

<p:pipeline-library name="standard" xmlns:p="http://www.w3.org/2006/XProc">

   <p:declare-step type="p:identity">
      <p:input port="input" sequence="yes"/>   
      <p:output port="result" sequence="yes"/>   
   </p:declare-step>

   <p:declare-step type="p:xslt">
      <p:input port="document" sequence="no"/>   
      <p:input port="transform" sequence="no"/>   
      <p:output port="result" sequence="yes"/>   
      <p:parameter name="*"/>
   </p:declare-step>

   <p:declare-step type="p:xinclude">
      <p:input port="document" sequence="no"/>   
      <p:output port="result" sequence="no"/>   
   </p:declare-step>
   
   <p:declare-step type="p:serialize">
      <p:input port="document" sequence="no"/>   
      <p:output port="result" sequence="no"/>   
   </p:declare-step>
   
   <p:declare-step type="p:parse">
      <p:input port="document" sequence="no"/>   
      <p:output port="result" sequence="no"/>
      <p:parameter name="namespace" required="no"/>  
      <p:parameter name="content-type" required="no"/>  
   </p:declare-step>
   
   <p:declare-step type="p:load">
      <p:output port="result"/>  
      <p:parameter name="href" required="yes"/>  
   </p:declare-step>
   
   <p:declare-step type="p:store">
      <p:input port="document"/>  
      <p:parameter name="href" required="no"/>  
   </p:declare-step>
   
   <p:declare-step type="p:xquery">
      <p:input port="collection" sequence="yes"/>
      <p:input port="query"/>   
      <p:output port="result"/>   
      <p:parameter name="*"/>
   </p:declare-step>

   <p:declare-step type="p:xinclude-from-collection">
      <p:input port="document" sequence="no"/>   
      <p:input port="collection" sequence="yes"/>   
      <p:output port="result" sequence="no"/>   
   </p:declare-step>
   
   <p:declare-step type="p:join-documents">
      <p:input port="*" sequence="yes"/>   
      <p:output port="result"/>   
      <p:parameter name="name" required="yes"/>
   </p:declare-step>
   
   <p:declare-step type="p:join-sequences">
      <p:input port="*" sequence="yes"/>   
      <p:output port="result" sequence="yes"/>   
   </p:declare-step>
   
   <p:declare-step type="p:match-sequence">
      <p:input port="input" sequence="yes"/>
      <p:output port="result" sequence="yes"/>
      <p:parameter name="match" required="yes"/>
   </p:declare-step>

   <p:declare-step type="p:validate">
      <p:input port="document"/>   
      <p:input port="schema" sequence="yes"/>
      <p:output port="result"/>   
      <p:parameter name="assert"/>
      <p:parameter name="mode"/>
   </p:declare-step>
   
   <p:declare-step type="p:validate-from-context">
      <p:input port="document"/>   
      <p:input port="context"/>
      <p:output port="result"/>   
      <p:parameter name="assert"/>
      <p:parameter name="mode"/>
   </p:declare-step>

   <p:declare-step type="p:url-action">
      <p:input port="document"/>
      <p:output port="result"/>
   </p:declare-step>
   
   <p:declare-step type="p:rename">
      <p:input port="document"/>
      <p:output port="result"/>  
      <p:parameter name="select" required="yes"/>
      <p:parameter name="name" required="yes"/>
   </p:declare-step>
   
   <p:declare-step type="p:wrap">
      <p:input port="document"/>
      <p:output port="result"/>  
      <p:parameter name="name" required="yes"/>
   </p:declare-step>
   
   <p:declare-step type="p:insert">
      <p:input port="document"/>   
      <p:input port="insertion"/>   
      <p:output port="result"/>   
      <p:parameter name="at-start" required="no"/>
   </p:declare-step>
   
   <p:declare-step type="p:set-attributes">
      <p:input port="document"/>   
      <p:input port="attributes"/>   
      <p:output port="result"/>   
   </p:declare-step>
   
</p:pipeline-library>