Re: Action Item QT4CG-004-02: DN to make a proposal for deep-equal-safe for future discussion ( Re: Draft minutes for QT4CG meeting 004, 2022-09-27)

Comparing two document trees by content is a fairly rare requirement, and nearly all the use cases I know of are (a) to compare expected test results, or (b) to see if anything has changed. For those requirements, the fact that deep-equal isn't transitive and can fail isn't a big deal. Anything that requires repeated comparisons (such as sorting, searching for duplicates, or building a map) is likely to be very inefficient and the user would be better off meeting the requirement with document signatures or similar. I'd like to see the use case, and to form a view as to whether a document-signature() function would be a more useful way to meet the requirement...
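
To make the document-signature idea concrete, here is a minimal sketch in Python. It assumes that a signature is simply a hash of a canonical serialization; the name document_signature is hypothetical, and the whitespace and comment policy (strip_text=True, comments dropped) is an arbitrary choice for illustration, not a proposal for what the function should do.

```python
import hashlib
import xml.etree.ElementTree as ET


def document_signature(xml_text: str) -> str:
    """Hypothetical helper: SHA-256 of the C14N 2.0 form of an XML document."""
    # canonicalize() normalizes attribute order, quoting and empty-element
    # syntax; strip_text=True is an assumed whitespace policy.
    canonical = ET.canonicalize(xml_data=xml_text, strip_text=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


docs = [
    '<a x="1" y="2"><b/></a>',
    "<a y='2' x='1'><b></b></a>",  # same content, different surface syntax
    '<a x="1" y="3"><b/></a>',     # different attribute value
]

# Group documents by signature: each signature is computed once per
# document, instead of comparing every pair of trees.
groups = {}
for doc in docs:
    groups.setdefault(document_signature(doc), []).append(doc)
```

With signatures, sorting, duplicate detection, or map construction needs one pass over each document rather than repeated pairwise tree comparisons.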

Michael Kay
Saxonica

> On 28 Sep 2022, at 15:04, Dimitre Novatchev <dnovatchev@gmail.com> wrote:
> 
> 
> 
> On Tue, Sep 27, 2022 at 11:41 PM Michael Kay <mike@saxonica.com> wrote:
> A couple of comments:
> 
> (a) I have proposed (somewhere) changing the semantics of numeric comparisons using eq to convert both values to decimal rather than to double (as op:same-key does). But of course NaN=NaN would remain false. This function could take advantage of this.
> 
> Yes, certainly. Where?
> 
> (b) The function inherits most of the weaknesses of fn:deep-equal. For example it's clearly a design mistake that comments and processing instructions are ignored without merging their adjacent sibling text nodes; the rules for comparing typed and untyped content are also pretty unusable, as is the treatment of whitespace. It's also unfortunate that fn:deep-equal gives a different result from serializing into XML canonical form and comparing the serializations. It's hard to know whether this matters without considering use cases for this new function; but if we're defining a new function, then we ought to fix the known faults in the old one.
> 
> The purpose of this function is not to correct design mistakes in fn:deep-equal, but just to provide a function that can be used for comparisons (such as comparing the keys of maps, even if maps could have any sequence as a key), that returns false() instead of raising errors, and that is context-free and transitive. Another use case is as a possible default value for the comparer argument of any function that needs or benefits from a comparer.
>  
> 
> In practice I don't think it's possible to define a set of rules for comparing node trees that satisfies a wide range of use cases without parameterising it. Even the rules for canonical XML are parameterized (IIRC) with regard to their handling of namespaces and whitespace.
> 
> Yes, and we can choose a set of default values for these parameters to be used in a default-comparer-function. These two ideas are complementary to each other, not in conflict with each other.
>  
> 
> So it boils down to: what are the use cases that this new function is designed for?
> 
> As mentioned above. And any use case for deep-equal with options (proposed separately and independently, and included by you in the PDF checklist document) is also a use case for providing this function as a default comparer.
>  
> 
> Michael Kay
> 
> 
> Thanks,
> Dimitre
>  
> 
> 
>> On 28 Sep 2022, at 04:17, Dimitre Novatchev <dnovatchev@gmail.com> wrote:
>> 
>> 
>> 
>> On Tue, Sep 27, 2022 at 9:42 AM Norm Tovey-Walsh <norm@saxonica.com> wrote:
>> 
>> 
>> Draft Minutes
>> 
>> Summary of new and continuing actions [0/7]
>> 
>>      * [ ] QT4CG-002-01: NW to incorporate email feedback and produce new
>>        versions of the process documents.
>>      * [ ] QT4CG-003-03: NW to tweak the CSS for function signatures to avoid
>>        line breaks on - characters.
>>      * [ ] QT4CG-002-10: BTW to coordinate some ideas about improving
>>        diversity in the group
>>      * [ ] QT4CG-004-01: MK (with DN and RD) to draft a new proposal for
>>        variadic functions
>>      * [ ] QT4CG-004-02: DN to make a proposal for deep-equal-safe for future
>>        discussion
>>      * [ ] QT4CG-004-03: MK to draft a pull request implementing
>>        fn:intersperse
>>      * [ ] QT4CG-004-04: DN to open an issue for the inverse of
>>        fn:intersperse
>> 
>> 
>> The description of the function fn:deep-equal-safe() is in a pdf file that can be found here:
>> 
>> https://github.com/dnovatchev/FXSL-XSLT2/blob/master/fn-deep-equal-safe.pdf
>> 
>> Note: this document is essentially a compilation from the FO 3.1 of:
>>       op:same-key (https://www.w3.org/TR/xpath-functions-31/#func-same-key), and
>>       fn:deep-equal (https://www.w3.org/TR/xpath-functions-31/#func-deep-equal)
>> 
>> Special care was taken to replace the fn:deep-equal semantics that result either in raising an error, or in possible intransitivity or context-dependency. All such behavior has been replaced with the corresponding behavior from op:same-key, which the 3.1 Spec claims to be "deterministic, context-independent, and focus-independent", etc.
>> 
>> More specifically, to achieve this:
>> 
>> No errors are raised, instead false() is returned
>> 
>> Strings are compared without any dependency on collations (fn:codepoint-equal is used in such comparisons)
>> 
>> eq is not used; instead, every instance of xs:double, xs:float and xs:decimal is represented exactly as a decimal number, provided enough digits are available both before and after the decimal point. Unlike the eq relation, which converts both operands to xs:double values, possibly losing precision in the process, this comparison is transitive.
>> 
>> fn:deep-equal is used in comparing values having a variety of date, time, year, month, and day types so that, unlike when using eq, no error is raised when comparing values of different types; false() is simply returned. Also, unlike when using the eq operator, this comparison has no dependency on the implicit time zone, meaning no dependency on this aspect of the dynamic context.
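
The numeric rule in the list above can be illustrated with a small Python sketch. It is not the proposed specification: Python float stands in for xs:double and decimal.Decimal for xs:decimal, the names eq_like and exact_equal are hypothetical, and NaN and infinities are deliberately left out.

```python
from decimal import Decimal


def eq_like(a, b) -> bool:
    """Assumed model of eq: if either operand is a double, compare as doubles."""
    if isinstance(a, float) or isinstance(b, float):
        return float(a) == float(b)  # may round a decimal to the nearest double
    return a == b                    # decimal vs. decimal is exact


def exact_equal(a, b) -> bool:
    """Compare by exact decimal value; Decimal(float) converts without rounding."""
    return Decimal(a) == Decimal(b)


x = Decimal(2**53 + 1)  # not representable as a double
y = float(2**53)        # the double nearest to x
z = Decimal(2**53)

# Under the eq-like model, x = y and y = z, yet x != z: not transitive.
intransitive = eq_like(x, y) and eq_like(y, z) and not eq_like(x, z)

# With exact decimal comparison, x and y are simply different values.
exact_results = (exact_equal(x, y), exact_equal(y, z), exact_equal(x, z))
```

Here exact_equal separates x from y because the double is converted to its exact decimal value rather than the decimal being rounded to a double.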
>> 
>> The goal of this function description is to serve as a starting point for discussion about possible options for fn:deep-equal.
>> 
>> Any comments will be appreciated.
>> 
>> Thanks,
>> Dimitre
> 
> 
> 
> -- 
> Cheers,
> Dimitre Novatchev

Received on Wednesday, 28 September 2022 14:22:46 UTC