- From: Imsieke, Gerrit, le-tex <gerrit.imsieke@le-tex.de>
- Date: Sun, 22 Nov 2020 19:04:05 +0100
- To: xproc-dev@w3.org
The p:un in p:urify() is totally intended btw:
https://github.com/xproc/3.0-specification/issues/451#issuecomment-405025086
On 22.11.2020 18:56, Imsieke, Gerrit, le-tex wrote:
>
>
> On 22.11.2020 18:12, Martin Honnen wrote:
>>>
>>> Because the percent signs as they are used in the filename are
>>> incompatible with URI encoding, I expect them to be percent-encoded
>>> themselves, with the modified filename echoed to stderr (in the
>>> <p:identity> step) and used to save the test file (in the <p:store>
>>> step). What happens instead is that the percent encoded value is
>>> written, as expected, to stderr:
>>>
>>> test-1a%257%25.xml
>>>
>>> but the file is saved to the local filesystem as if encode-for-uri()
>>> had not been applied, that is, as:
>>>
>>> test-1a%7%.xml
>>
>> I don't have an explanation for that, perhaps ask Achim by raising an
>> issue on Morgana on Sourceforge.
>>
>
> This is the correct behavior that you are observing.
>
> Quoting https://tools.ietf.org/html/rfc3986#section-2.4:
>
> Because the percent ("%") character serves as the indicator for
> percent-encoded octets, it must be percent-encoded as "%25" for that
> octet to be used as data within a URI.
>
> This means that 'test-1a%7%.xml' is not a valid URI. The (relative) URI
> that corresponds to this file name is 'test-1a%257%25.xml'. When it is
> used to store the file, the percent encoding will be undone, resulting
> in a file name 'test-1a%7%.xml'.
>
> Instead of encode-for-uri(), you can also use p:urify()
> (https://spec.xproc.org/master/head/xproc/#f.urify), which encodes only
> the parts of the file name (or URI) that need to be encoded.
>
> For example, p:urify('c:\Users\gerrit\test-1a%7%.xml') will result in
> 'file:///c:/Users/gerrit/test-1a%257%25.xml'
>
> p:urify('c:\Users\gerrit\test-1a%257%25.xml') →
> 'file:///c:/Users/gerrit/test-1a%25257%2525.xml' (the input isn’t a URI,
> so '%25' is regarded as a literal part of the file name, which must be
> percent-encoded as '%2525' in a URI).
>
> p:urify('file:///c:/Users/gerrit/test-1a%257%25.xml') →
> 'file:///c:/Users/gerrit/test-1a%257%25.xml' (no additional encoding of
> the '%25's because “Implementations must not percent-encode or decode
> the same string more than once” as stated in the same Sect. 2.4 of RFC
> 3986).
>
> Morgana reports 'file:///c:/Users/gerrit/test-1a%25257%2525.xml' as the
> result of the last invocation. I think this is incorrect. Otherwise,
> Morgana seems to implement p:urify() incredibly well.
>
> Gerrit
>
>
>
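
The round trip described in the quoted message can be reproduced outside XProc. The following is only a rough sketch in Python, using the standard library's urllib.parse as a stand-in for encode-for-uri() and for the decoding that happens when the file is stored. The helper names are hypothetical, and this is not how Morgana or p:urify() is implemented.

```python
from urllib.parse import quote, unquote


def file_name_to_uri_segment(file_name: str) -> str:
    """Percent-encode a plain file name for use as a URI path segment.

    Hypothetical helper: roughly what encode-for-uri() does for this input.
    """
    # safe="" means that "/" would be encoded too; "%" always becomes "%25".
    return quote(file_name, safe="")


def uri_segment_to_file_name(segment: str) -> str:
    """Undo the percent-encoding, as happens when the URI is resolved to a file."""
    return unquote(segment)


if __name__ == "__main__":
    name = "test-1a%7%.xml"

    encoded = file_name_to_uri_segment(name)
    print(encoded)                            # test-1a%257%25.xml (the URI form)
    print(uri_segment_to_file_name(encoded))  # test-1a%7%.xml (name on disk)

    # Encoding an already encoded string a second time encodes the "%"
    # characters again, which is the double encoding that RFC 3986,
    # Sect. 2.4, warns against.
    print(file_name_to_uri_segment(encoded))  # test-1a%25257%2525.xml
```

The third call treats its argument as a plain file name rather than a URI, which is why every '%' is encoded again. That corresponds to the p:urify('c:\Users\gerrit\test-1a%257%25.xml') case above, not to the 'file:///…' case.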
Received on Sunday, 22 November 2020 18:04:21 UTC