Re: String complements

For descending order, how would you achieve (without knowing the length of the longest string of the input) that 'ZZ' is returned before 'Z'? In other words, how would you simulate…

  sort(('ZZ', 'Z'), orders := 'descending')



On 14.03.2024 at 17:06, Dimitre Novatchev <dnovatchev@gmail.com> wrote:
> > For a simplified example, revert("abc") would produce "zyx". This is doable and really valuable.
>
> In what sense is “zyx” the complement of “abc”? Over what set of codepoints and in what collation?
>
> I am very skeptical that such a function is well defined across all collations and will always produce a single, correct result in all cases.
>
> Can you provide a detailed description of how this would work?

Yes, as Michael Kay already explained, this is doable in either of two ways: rely on the "biggest" symbol in the collation being unused (which, by the way, happens in some collations; for example, the biggest symbol in the English (American) collation is 0xFE), or add an additional symbol that is "bigger" than any other symbol in the collation.

Let us, just for convenience, refer to this special symbol as '$' (this is just a convention on how to refer to this special symbol, not the actual dollar character).

Then, if S1, S2, ..., Sn are all n symbols in the collation, ordered by their value in the collation, perform this mapping:

""  :  '$'
S1  :  Sn || '$'
S2  :  S(n-1) || '$'
 .  .  .
Sk  :  S(n-k+1) || '$'
 .  .  .
Sn  :  S1 || '$'
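
As a rough illustration, here is a minimal sketch of one reading of this mapping, written in Python rather than XPath so that it can be run directly. Everything in it is an assumption made for the example: the toy collation (uppercase ASCII letters in codepoint order), the names ALPHABET, RANK, SENTINEL and complement_key, and the reading that the "" entry contributes a closing '$' at the end of every string. Symbols are represented by their ranks 1..n, and '$' by n+1, so no real character has to play the role of the special symbol.

```python
import string

# Assumption: a toy collation whose symbols are the uppercase ASCII letters,
# ordered A < B < ... < Z. Real collations are far more complex than this.
ALPHABET = string.ascii_uppercase
RANK = {ch: i + 1 for i, ch in enumerate(ALPHABET)}  # S1..Sn -> 1..n
N = len(ALPHABET)
SENTINEL = N + 1  # stands in for '$': greater than every real symbol


def complement_key(s):
    """Map each Sk to (S(n-k+1), '$') and close the string with a final '$'."""
    key = []
    for ch in s:
        key.append(N - RANK[ch] + 1)  # rank of the mirrored symbol
        key.append(SENTINEL)          # the '$' that follows each mapped symbol
    key.append(SENTINEL)              # the "" : '$' entry, applied at end of string
    return tuple(key)


words = ["Z", "ZZ", "AB", "A", "", "ZA"]

# Ascending order of the complements...
ascending_by_complement = sorted(words, key=complement_key)

# ...compared with an ordinary descending sort of the original strings.
descending = sorted(words, reverse=True)
```

Under these assumptions, sorting 'Z' and 'ZZ' ascending by complement_key puts 'ZZ' first, because the key for 'Z' has the '$' stand-in at the position where the key for 'ZZ' still has an ordinary symbol. No knowledge of the longest input string is needed.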

Of course, adding a new symbol to a collation actually creates a new collation, and this may be the most straightforward way of inverting strings.

We may not even need to create a new collation: we could simply adopt a convention that a collation named "Inverted" || {Real-Collation-Name} produces the negation of the comparison results produced by the {Real-Collation-Name} collation. As I mentioned before, this is the same as "decorating a collation".
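
The "decorating" idea can be sketched in a few lines, again in Python as a stand-in for XPath. The names codepoint_compare and inverted are hypothetical, and plain codepoint comparison is only assumed here in place of a real named collation; the point is just that the inverted collation negates whatever the wrapped comparison returns.

```python
from functools import cmp_to_key


def codepoint_compare(a, b):
    """Stand-in for a real collation: returns -1, 0 or 1 by codepoint order."""
    return (a > b) - (a < b)


def inverted(compare):
    """Decorate a comparison so that it yields the negated result."""
    def compare_inverted(a, b):
        return -compare(a, b)
    return compare_inverted


# An ordinary ascending sort under the inverted comparison
# behaves like a descending sort under the original one.
result = sorted(["Z", "ZZ", "A"], key=cmp_to_key(inverted(codepoint_compare)))
```

With this sketch, 'ZZ' sorts before 'Z' without any ordering flag being passed to the sort itself.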

This is one more way to get rid of the $orders parameter in our current functions.

Thanks,
Dimitre

On Thu, Mar 14, 2024 at 3:24 AM Norm Tovey-Walsh <norm@saxonica.com> wrote:
Dimitre Novatchev <dnovatchev@gmail.com> writes:
> This function can easily handle strings - produce a "string complement" in the value space for a particular collation.
>
> For a simplified example, revert("abc") would produce "zyx". This is doable and really valuable.

In what sense is “zyx” the complement of “abc”? Over what set of codepoints and in what collation?

I am very skeptical that such a function is well defined across all collations and will always produce a single, correct result in all cases.

Can you provide a detailed description of how this would work?

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica

Received on Thursday, 14 March 2024 16:23:33 UTC