- From: Norm Tovey-Walsh <ndw@nwalsh.com>
- Date: Wed, 18 Mar 2026 12:12:20 +0000
- To: XProc Dev <xproc-dev@w3.org>
Hello,
A recent bug report has highlighted an interesting challenge in managing XProc steps. Consider this transformation:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0">
<xsl:template match="/">
<doc/>
</xsl:template>
</xsl:stylesheet>
The resulting document is “<doc/>” irrespective of the input (that’s not really a critical part of the issue, it’s just a simplification for this example). What is the location of that element node? One plausible answer (the answer that XML Calabash reports) is line 4, column 9. That’s the location of the literal result element in the stylesheet. That’s the only plausible answer that XSLT *can* give since the location of the element in the stylesheet is all it knows.
But if you pass the transformation result to a validation step that says “doc” isn’t valid, the reported location (line 4, column 9) is likely to be confusing: it points into the stylesheet, not into any document the user thinks they are validating.
The first thought you might have is that the XSLT step should “renumber” the locations. It could plausibly do that, but not without making some decision about how many lines and columns an element start tag occupies. If you actually serialize the document, the number of lines and columns a start tag occupies depends heavily on the serialization options, so there isn’t a “right” answer!
As a pipeline author, I think the only way you can ensure that you get accurate location information is to parse the document from some serialization. If you want to transform then validate, and you want the error locations to be correct, you have to manage a p:store/p:load pair of steps and you have to understand that the reported error location will depend on the p:store serialization and not necessarily on any “original” version.
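A minimal sketch of that p:store/p:load arrangement in XProc 3.0 might look like the following. The file names (style.xsl, transformed.xml, schema.xsd), the step names, and the choice of XML Schema validation are assumptions made for illustration; they are not taken from the bug report.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                name="main" version="3.0">
  <p:input port="source"/>
  <!-- Expose the validation report rather than the validated document. -->
  <p:output port="result" pipe="report@validate"/>

  <!-- Transform. Any locations on the result nodes still point into
       the stylesheet (style.xsl is a hypothetical name). -->
  <p:xslt>
    <p:with-input port="stylesheet" href="style.xsl"/>
  </p:xslt>

  <!-- Serialize to a file the user can open later. The serialization
       options chosen here determine the line and column numbers. -->
  <p:store name="store" href="transformed.xml"
           serialization="map { 'indent': true() }"/>

  <!-- Reparse that file so locations refer to transformed.xml.
       The depends attribute makes sure the load runs after the store. -->
  <p:load name="load" href="transformed.xml" depends="store"/>

  <!-- Validate the reparsed document; assert-valid="false" lets the
       pipeline finish and produce a report instead of raising an error. -->
  <p:validate-with-xml-schema name="validate" assert-valid="false">
    <p:with-input port="schema" href="schema.xsd"/>
  </p:validate-with-xml-schema>
</p:declare-step>
```

Any locations in the report then refer to transformed.xml as written by p:store, so changing the serialization options (for example, turning indentation off) changes the reported positions as well.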
That’s unfortunate, but I don’t see any way around it.
For the p:xslt step, I think there are two possibilities for an implementation: it can discard the line and column information so that they don’t appear in any validation report, which avoids confusion, or it can leave the locations alone, which may be confusing, but is sometimes correct and useful.
(I suppose the p:xslt step *could* serialize and reparse the results, but that would potentially be very expensive and the user wouldn’t necessarily have access to the intermediate serialized form, so there’s no guarantee that the resulting line and column numbers wouldn’t be wrong with respect to whatever serialization they have available. So it seems like a very bad option.)
I really don’t know what’s best.
Be seeing you,
norm
--
Norm Tovey-Walsh <ndw@nwalsh.com>
https://norm.tovey-walsh.com/
> The reason lightning doesn't strike twice in the same place is that the same place isn't there the second time.--Willie Tyler
Received on Wednesday, 18 March 2026 12:12:30 UTC