Removing whitespace from end of text file from Sheila Thomson on 2025-10-04 (xproc-dev@w3.org from October 2025)

From: Sheila Thomson <discuss@bluegumtree.com>
Date: Sat, 04 Oct 2025 01:38:11 +0100
To: xproc-dev@w3.org
Message-ID: <E46143A5-B9D6-479E-8BE9-23E07EEB462F@bluegumtree.com>

Hi all,

The answer is probably somewhere in the (many) specs that touch on this, but I've not managed to find it, so apologies in advance.

I'm trying to remove whitespace characters from the end of a text file, for example:

"Prince\n\t\t\t\t\tPurple Rain (Multiplex) MH01365\n\t\t \n\n\t\t "

(Ignore the wrapping double quotes in the examples above and below, they're just there to make clear the start and end of the example; they don't exist in the actual text being evaluated.)

I'd like to get to:

"Prince\n\t\t\t\t\tPurple Rain (Multiplex) MH01365"

I've been trying to use p:text-replace:

        <p:text-replace>
            <p:with-option name="pattern" select="'\s+$'" />
            <p:with-option name="replacement" select="''" />
        </p:text-replace>

According to https://www.w3.org/TR/xmlschema-2/#regexs, the pattern \s should match the following characters: space, tab, newline (#xA), return (#xD). The same spec states that $ anchors the pattern at the tail of the text in scope. So I am expecting \s+$ to match any contiguous combination of one or more space, tab, newline or return characters at the end of my text file.
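
To make that expectation concrete, here is a minimal sketch in Python, applied to the sample text above. It is an illustration only: Python's re module is not the regex engine that XProc processors use, its \s covers more Unicode whitespace than the four characters listed above, and the helper name strip_trailing_whitespace is made up for this example. It shows the intended result, not how p:text-replace is implemented.

```python
import re


def strip_trailing_whitespace(text: str) -> str:
    """Hypothetical helper: remove the run of whitespace at the end of text."""
    # \s+ matches one or more whitespace characters. Without re.MULTILINE,
    # $ matches only at the very end of the string (or just before a single
    # final newline), so interior whitespace is left untouched.
    return re.sub(r"\s+$", "", text)


sample = "Prince\n\t\t\t\t\tPurple Rain (Multiplex) MH01365\n\t\t \n\n\t\t "
print(repr(strip_trailing_whitespace(sample)))
# prints 'Prince\n\t\t\t\t\tPurple Rain (Multiplex) MH01365'
```

The newline and tabs between "Prince" and "Purple Rain" survive because that run of whitespace is not followed by the end of the string.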

But p:text-replace only removes the final space, not the tabs or newlines (I'm testing with both Morgana and Calabash).

What am I doing wrong, please?

Sheila

Received on Saturday, 4 October 2025 00:38:21 UTC