Re: Removing whitespace from end of text file

OMG. Embarrassing myself again 🤣

I now realise what the problem is. It seems that a step earlier in the process is turning the control characters into their shorthand codes, e.g. the tab character is changed to a literal "\t" (a backslash followed by "t"), so of course my regular expression isn't matching them!

It hadn't occurred to me that this might happen and because I was already thinking of the control characters in terms of their regex shorthand, I didn't notice that the conversion had happened!
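
For anyone following along, here is a rough sketch of the effect. It uses Python's re module rather than the XPath regex engine behind p:text-replace, so it is only an approximation. The sample string is the tail of Norm's od dump, and the escaping step is a made-up stand-in for whatever is actually doing the conversion in my process.

```python
import re

# Same pattern as in the pipeline: a run of whitespace at the very end.
TRAILING_WS = re.compile(r"\s+$")


def strip_trailing(text: str) -> str:
    """Remove the run of whitespace at the end of the text."""
    return TRAILING_WS.sub("", text)


# Real control characters: the whole tail is whitespace.
real = "MH01365\n\t\t \n\n\t\t "

# Hypothetical earlier step: each control character becomes its
# two-character shorthand (a backslash followed by a letter).
escaped = real.replace("\n", "\\n").replace("\t", "\\t")

print(repr(strip_trailing(real)))     # everything after the text is removed
print(repr(strip_trailing(escaped)))  # only the final space is removed
```

In the escaped version the backslashes and letters are ordinary characters, so the only whitespace touching the end of the string is the last space, which matches what I was seeing.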

Norm, thank you for testing it and feeding back. It was the help I needed.

Now to work out where the unwanted conversion is happening.

Sheila



On 4 October 2025 09:25:33 BST, Norm Tovey-Walsh <ndw@nwalsh.com> wrote:
>Sheila Thomson <discuss@bluegumtree.com> writes:
>> The answer to this question is probably in the (many) various specs relating to this question but I've not managed to find it, so apologies in advance.
>
>I’m perplexed. 
>
>I tried your example and I reproduced it. That’s odd, I thought, and I went off to the fn:replace description, eventually found my way to the “s” flag, added the “s” flag and it did the right thing.
>
>Problem solved, I thought.
>
>Then I read:
>
>> But it only removes the final space, not the tabs or newlines (I'm testing with both Morgana and Calabash).
>
>And I thought, “hang on, ‘s’ shouldn’t have affected tabs.” And I went back and re-read the section on the “s” flag and thought “hang on, ‘s’ shouldn’t have affected ‘\s’, either.”
>
>So I took the “s” flag out and…now I can’t reproduce the problem.
>
>Your step, exactly as written, seems to work just fine as I’d expect it to now that I’ve read my way through the docs.
>
>I’ve saved your example document in textfile.txt:
>
>$ od -a textfile.txt
>0000000    P   r   i   n   c   e  nl  ht  ht  ht  ht  ht   P   u   r   p
>0000020    l   e  sp   R   a   i   n  sp   (   M   u   l   t   i   p   l
>0000040    e   x   )  sp   M   H   0   1   3   6   5  nl  ht  ht  sp  nl
>0000060   nl  ht  ht  sp
>0000064
>
>This pipeline:
>
><p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
>                name="main" version="3.1">
>  <p:output port="result"/>
>
>  <p:text-replace>
>    <p:with-input>
>      <p:document href="textfile.txt"/>
>    </p:with-input>
>    <p:with-option name="pattern" select="'\s+$'" />
>    <p:with-option name="replacement" select="''" />
>  </p:text-replace>
>
>  <p:wrap-sequence wrapper="wrap"/>
>
></p:declare-step>
>
>produces 
>
><wrap>Prince
>                                        Purple Rain (Multiplex) MH01365</wrap>
>
>That’s true for both XML Calabash and Morgana (1.6.8, I haven’t updated recently).
>
>I’m a bit, as I said, perplexed. I genuinely don’t think I imagined that I reproduced the problem initially, but I don’t have a better explanation.
>
>                                        Be seeing you,
>                                          norm
>
>--
>Norm Tovey-Walsh <ndw@nwalsh.com>
>https://norm.tovey-walsh.com/
>
>> A polar bear is just another way of expressing a rectangular bear.
>

Received on Saturday, 4 October 2025 12:02:23 UTC