The XPath transform output is the result of applying an XPath expression to an
input string. The XPath expression appears in a parameter element named XPath.
The input string is equivalent to the result of dereferencing the URI attribute
of the Reference element containing the XPath transform, then, in sequence,
applying all transforms that appear before the XPath transform in the Reference
element's Transforms.
The primary purpose of this transform is to ensure that only specifically
defined changes to the input XML document are permitted after the signature is
affixed. The XPath expression can be created such that it includes all elements
except those meeting specific criteria. It is the responsibility of the XPath
expression author to ensure that all necessary information has been included in
the output such that modification of the excluded information does not affect
the interpretation of the output in the application context. One simple example
of this is the omission of an enveloped signature from a DigestValue calculation.
The XPath transform establishes the following evaluation context for the
XPath expression given in the XPath parameter element:
The additional function here() is defined as follows:
The here function returns a node-set containing the single node that directly bears the XPath expression. The node could be of any type capable of directly bearing text, especially text and attribute. This expression results in an error if the containing XPath expression does not appear in an XML document.
An XML processor is used to read the input XML document and produce a parse tree capable of being used as the initial context node for the XPath evaluation, as described in the previous section. If the input is not a well-formed XML document, then the XPath transform must throw an exception.
Validating and non-validating XML processors behave in the same way (e.g. with respect to attribute value normalization and entity reference definition) only until an external reference is encountered. If the XPath transform implementation uses a non-validating processor and encounters an external reference in the input document, then an exception must be thrown to indicate that the necessary algorithm is unavailable. The XPath transform cannot simply generate incorrect output, since many applications distinguish an unverifiable signature from an invalid signature.
As a result of reading the input with an XML processor, linefeeds are normalized, attribute values are normalized, CDATA sections are replaced by their content, and entity references are recursively replaced by substitution text. In addition, consecutive characters are grouped into a single text node.
The XPath implementation is expected to convert the information in the input XML document and the XPath expression string to the UCS character domain prior to making any comparisons such that the result of evaluating the expression is equivalent regardless of the initial encoding of the input XML document and XPath expression.
The namespace prefix of each node appearing in the original document must be preserved by the XML processor used by the XPath transform implementation. This is necessary in order to produce the serialized result.
Although a node-set is unordered, based on the expression evaluation requirements of the XPath function library, the document order position of each node must be available, except for the attribute and namespace axes. The XPath transform imposes no order on attribute and namespace nodes during XPath expression evaluation, and expressions based on attribute or namespace node position are not interoperable. The XPath transform does define an order for namespace and attribute nodes during serialization.
For the purpose of serialization, the XPath transform imposes a document order on namespace and attribute nodes. An element's namespace and attribute nodes have a document order position greater than the element but less than any child node of the element. Namespace nodes have a lesser document order position than attribute nodes. An element's namespace nodes are sorted lexicographically by local name (the default namespace node, if one exists, has no local name and is therefore lexicographically least). An element's attribute nodes are sorted lexicographically with namespace URI as the primary key and local name as the secondary key (an empty namespace URI is lexicographically least). Lexicographic comparison is based on the UCS codepoint values, which is equivalent to lexical ordering based on UTF-8.
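The ordering rules above can be sketched as two sort keys. This is only an illustration: the NamespaceNode and AttributeNode types and the serialization_order function are hypothetical names, not part of any specified API. It relies on Python comparing strings by code point, which matches the UCS codepoint comparison described above.

```python
from typing import List, NamedTuple, Union


class NamespaceNode(NamedTuple):
    """Hypothetical namespace node; prefix is "" for the default namespace."""
    prefix: str
    uri: str


class AttributeNode(NamedTuple):
    """Hypothetical attribute node; namespace_uri is "" when unqualified."""
    namespace_uri: str
    local_name: str
    value: str


def serialization_order(
    namespaces: List[NamespaceNode],
    attributes: List[AttributeNode],
) -> List[Union[NamespaceNode, AttributeNode]]:
    # Namespace nodes sort by local name (the prefix); the empty prefix of
    # the default namespace is lexicographically least.
    ns_sorted = sorted(namespaces, key=lambda n: n.prefix)
    # Attribute nodes sort by namespace URI first, then by local name; an
    # empty namespace URI is lexicographically least.
    attr_sorted = sorted(
        attributes, key=lambda a: (a.namespace_uri, a.local_name)
    )
    # All namespace nodes precede all attribute nodes.
    return ns_sorted + attr_sorted
```

Keeping the two sorts separate mirrors the text: namespace nodes always come before attribute nodes, whatever their names.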
A node-set is converted into a string by generating the representative text for each node in the node-set in ascending document order. No node is processed more than once. Note that processing an element node E includes the processing of all members of the node-set for which E is an ancestor. Therefore, directly after the representative text for E is generated, E and all nodes for which E is an ancestor are removed from the node-set (or some logically equivalent operation occurs such that the node-set's next node in document order has not been processed).
The method of text generation is dependent on the node type and given in the following list:
... xmlns="". Then, generate the representative text for each namespace node that is in the element's namespace axis and in the node-set, except omit the namespace node with prefix xml if its string value is http://www.w3.org/XML/1998/namespace.
... &amp;, all double quote characters with &quot;, and all whitespace characters (#x9, #xA, #xD, and #x20) with character references, except for #x20 characters with no preceding #x20. When whitespace characters are replaced, the character references are written in uppercase hexadecimal with no leading zeroes (for example, #xD is represented by the character reference &#xD;).
... &amp;, all open angle brackets (<) are replaced by &lt;, and all #xD characters are replaced by &#xD;.
... If the string value is empty, then the leading space is not added.
The QName of a node is either the local name if the namespace prefix string is empty or the namespace prefix, a colon, then the local name of the element. The namespace prefix used in the QName MUST be the same one which appeared in the input document.
The result of the XPath expression is a string, boolean, number, or node-set. If the result of the XPath expression is a string, then the string converted to UTF-8 is the output of the XPath transform. If the result is a boolean or number, then the XPath transform output is computed by converting the boolean or number to a string as if by a call to the XPath string() function, then converting to UTF-8. If the result of the XPath expression is a node-set, then the XPath transform result is computed by serializing the node-set with a UTF-8 encoding.
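The dispatch on result type can be sketched as follows. The names transform_output, xpath_number_to_string, and the serialize_node_set callback are hypothetical, and the number formatting is only an approximation of the XPath string() function as I understand its rules (no exponent notation, integers without a decimal point, NaN and Infinity spelled out).

```python
import math
from decimal import Decimal
from typing import Any, Callable, Optional


def xpath_number_to_string(x: float) -> str:
    """Approximate XPath string() for numbers (an assumption, see lead-in)."""
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "Infinity" if x > 0 else "-Infinity"
    if x == int(x):
        # Integral values have no decimal point; negative zero becomes "0".
        return str(int(x))
    # repr() gives the shortest round-tripping digits; Decimal's "f" format
    # then writes them without exponent notation.
    return format(Decimal(repr(x)), "f")


def transform_output(
    result: Any,
    serialize_node_set: Optional[Callable[[Any], str]] = None,
) -> bytes:
    """Convert an XPath result to the UTF-8 octets of the transform output."""
    if isinstance(result, str):
        return result.encode("utf-8")
    # bool must be tested before numbers because bool is a subclass of int.
    if isinstance(result, bool):
        return ("true" if result else "false").encode("utf-8")
    if isinstance(result, (int, float)):
        return xpath_number_to_string(float(result)).encode("utf-8")
    # Anything else is treated as a node-set and handed to the caller's
    # serializer (hypothetical callback implementing the rules above).
    if serialize_node_set is None:
        raise ValueError("a node-set result needs a serializer")
    return serialize_node_set(result).encode("utf-8")
```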
As an example, consider creating an enveloped signature (a Signature element
that is a descendant of an element being signed). The elements within the
signature change while the signature is being created (e.g. the digest value
must be put inside the DigestValue element, and the SignatureValue must
subsequently be calculated). One way to prevent these changes from invalidating
the digest value in DigestValue is to add an XPath Transform that omits all
Signature elements and their descendants. For example,
<Document>
  ...
  <Signature xmlns="&dsig;">
    <SignedInfo>
      ...
      <Reference URI="">
        <Transforms>
          <Transform Algorithm="http://www.w3.org/TR/1999/REC-xpath-19991116">
            <XPath>/descendant-or-self::node()[not(ancestor-or-self::Signature)]</XPath>
          </Transform>
        </Transforms>
        <DigestMethod Algorithm="http://www.w3.org/2000/02/xmldsig#sha1"/>
        <DigestValue></DigestValue>
      </Reference>
    </SignedInfo>
    <SignatureValue></SignatureValue>
  </Signature>
  ...
</Document>
The subexpression /descendant-or-self::node() means that all nodes in the
entire parse tree starting at the root node are candidates for the result
node-set. Each candidate node is included in the resultant node-set if and only
if the predicate (the boolean expression in the square brackets) evaluates to
"true" for that node. The predicate is true for all nodes except those that
either have a tag of Signature or have an ancestor with a tag of Signature.
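The effect of this expression can be imitated with a plain tree walk. The sketch below is only an approximation under stated assumptions: it uses Python's xml.etree.ElementTree, considers element nodes only (not text, comment, or other node types), and matches the unqualified tag name Signature in a small hypothetical document. The function name elements_outside_signatures is also hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List


def elements_outside_signatures(root: ET.Element) -> List[ET.Element]:
    """Return, in document order, the elements for which
    not(ancestor-or-self::Signature) would hold."""
    kept: List[ET.Element] = []

    def visit(element: ET.Element, inside_signature: bool) -> None:
        # An element is "inside" if it is a Signature or has one as ancestor.
        inside = inside_signature or element.tag == "Signature"
        if not inside:
            kept.append(element)
        for child in element:
            visit(child, inside)

    visit(root, False)
    return kept


# Hypothetical input: an enveloped signature between two content elements.
DOC = (
    "<Document><Title>t</Title>"
    "<Signature><SignedInfo><Reference URI=''/></SignedInfo>"
    "<SignatureValue/></Signature>"
    "<Body><Para/></Body></Document>"
)

tags = [e.tag for e in elements_outside_signatures(ET.fromstring(DOC))]
print(tags)
```

Because the Signature element and everything beneath it are skipped, filling in DigestValue and SignatureValue later does not change the set of selected elements.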
A more elegant solution uses the here function to omit only the Signature
containing the XPath Transform, thus allowing enveloped signatures to sign other
signatures. In the example above, use the following expression as the content
of the XPath element:
/descendant-or-self::node()
[
count(ancestor-or-self::Signature | here()/ancestor::Signature[1]) > count(ancestor-or-self::Signature)
]
Since the XPath equality operator converts node sets to string values before
comparison, we must instead use the XPath union operator (|). For each node of
the document, the predicate expression is true if and only if the node-set
containing the node and its Signature element ancestors does not include the
enveloped Signature element containing the XPath expression (the union does not
produce a larger set if the enveloped Signature element is in the node-set
given by ancestor-or-self::Signature).
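The union-and-count test can be imitated with Python sets. Again this is a sketch under assumptions: it uses xml.etree.ElementTree, handles element nodes only, matches the unqualified tag name Signature, and takes the XPath element itself as a stand-in for the node returned by here() (the text node it contains has the same nearest Signature ancestor). The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Set


def signature_ancestors_or_self(
    element: ET.Element, parent_of: Dict[ET.Element, ET.Element]
) -> Set[ET.Element]:
    """Counterpart of ancestor-or-self::Signature for one element."""
    found: Set[ET.Element] = set()
    node = element
    while node is not None:
        if node.tag == "Signature":
            found.add(node)
        node = parent_of.get(node)
    return found


def filter_with_here(root: ET.Element, xpath_element: ET.Element) -> List[ET.Element]:
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # Counterpart of here()/ancestor::Signature[1]: the nearest Signature
    # ancestor of the element bearing the expression.
    enclosing = parent_of.get(xpath_element)
    while enclosing is not None and enclosing.tag != "Signature":
        enclosing = parent_of.get(enclosing)
    here_set = {enclosing} if enclosing is not None else set()

    kept = []
    for element in root.iter():  # document order
        own = signature_ancestors_or_self(element, parent_of)
        # The union grows only when the enclosing Signature is not already
        # among the element's Signature ancestors-or-self.
        if len(own | here_set) > len(own):
            kept.append(element)
    return kept


DOC = (
    "<Document>"
    "<Signature id='other'><SignatureValue/></Signature>"
    "<Signature id='mine'><SignedInfo><Reference><Transforms><Transform>"
    "<XPath>expr</XPath>"
    "</Transform></Transforms></Reference></SignedInfo></Signature>"
    "<Body/></Document>"
)
root = ET.fromstring(DOC)
kept = filter_with_here(root, root.find(".//XPath"))
print([(e.tag, e.get("id")) for e in kept])
```

In this sample the signature with id "other" stays in the result, so the signature with id "mine" can sign it, while "mine" and its descendants are omitted.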
It is RECOMMENDED that the XPath be constructed such that the result of this operation is a well-formed XML document. This should be the case if the root element of the input resource is included by the XPath (even if a number of its descendant nodes are omitted by the XPath expression). It is also RECOMMENDED that nodes not be omitted from the input if they affect the interpretation of the output nodes in the application context. The XPath expression author is responsible for this, since the XPath expression author knows the application context.