Re: QT4CG meeting 069 draft minutes, 12 March 2024

Hi again,
I redefined the solution for producing all needed sort-keys with a single
key() function, as shown below.

The new solution is simple and straightforward: unlike the previous one, it
doesn't need any tricks.
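
For comparison, the trick in the previous version (quoted further down) was
to left-pad the salary with zeros before concatenating it to the department,
so that string comparison would order salaries numerically. Here is a minimal
sketch of why that was needed, written in Python only so that it can be run
anywhere. The helper name padded_key is illustrative, and the width of 7 is
taken from the '0000000' literal in the quoted code.

```python
def padded_key(dept: str, salary: int, width: int = 7) -> str:
    """Concatenate dept and a zero-padded salary, as the earlier string key did."""
    return dept + str(salary).zfill(width)

# Plain concatenation sorts 100000 before 50000, because "1" < "5" as characters.
naive = sorted([("Sales", 50000), ("Sales", 100000)],
               key=lambda e: e[0] + str(e[1]))

# Zero-padding restores numeric order inside the string key.
padded = sorted([("Sales", 100000), ("Sales", 50000)],
                key=lambda e: padded_key(*e))

# A composite key (a tuple here, an array in the XQuery code) is compared
# member by member, so no padding is needed.
composite = sorted([("Sales", 100000), ("Sales", 50000)],
                   key=lambda e: (e[0], e[1]))
```

The composite key also keeps each member's own type, so the salary is
compared as a number and not as text.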

The main idea is that the single key() function returns an array, whose
members are all the sort-keys to be used in the sort.

Here is the definition and implementation of *fn:ranks*:

let $ranks := function(
                $input as item()*,
                $key as function(item()) as array(*),
                $distinct-ranks as xs:boolean,
                $collation-input as xs:string?,
                $collation-keys as xs:string?
                ) as array(*)*
 {
  (: distinct-values#2 if $distinct-ranks is true, otherwise a pass-through :)
  let $make-distinct := ( (distinct-values#2)[$distinct-ranks],
                          fn($x, $y){$x} )[1]
     , $compare-keys := fn($k1, $k2, $collation-keys)
                        {deep-equal($k1, $k2, $collation-keys)}
     (: returns -1 if $arr1 sorts first, otherwise 1 :)
     , $sort-comparator := fn($arr1 as array(*), $arr2 as array(*),
                              $collation-keys)
                           {
                             let $sorted := array:sort(array{$arr1, $arr2},
                                                       $collation-keys)
                              return
                                if(deep-equal($arr1, $sorted(1))) then -1
                                  else 1
                           }
     (: keeps the first occurrence of each key array :)
     , $make-distinct-arrays := fn($input as array(*)*, $collation-keys)
       {
         fold-left($input, (),
                      fn($accum, $inItem)
                      {
                        $accum,
                        if(empty(filter($accum,
                                        fn($accIt){deep-equal($accIt, $inItem)} )))
                          then $inItem else ()
                      }
                  )
       }
    return
      (: one result array per distinct key array, in sorted key order :)
      for $v in $make-distinct-arrays(
                  sort-with($input ! $key(.),
                            $sort-comparator(?, ?, $collation-keys)),
                  $collation-keys)
        return
            array{$make-distinct(
                    $input[$compare-keys($key(.), $v, $collation-keys)],
                    $collation-input)}
 }
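
The steps of the function above (sort the key arrays with a comparator,
drop repeated keys with a left fold, then collect the input items for each
remaining key) can be sketched outside XQuery as follows. This is an
illustrative Python analogue, not the proposal itself: tuples stand in for
the key arrays, collations are left out, and the names ranks, compare and
dedupe are assumptions made for this sketch.

```python
from functools import cmp_to_key, reduce

def ranks(items, key, distinct_ranks=True):
    """Return one list of items per distinct composite key, in key order."""
    def compare(k1, k2):
        if k1 == k2:
            return 0  # the XQuery comparator returns -1 for ties; 0 is the usual convention
        # Sort the pair and see which key comes out first.
        return -1 if sorted([k1, k2])[0] == k1 else 1

    def dedupe(keys):
        # Left fold that keeps only the first occurrence of each key.
        return reduce(lambda acc, k: acc if k in acc else acc + [k], keys, [])

    sorted_keys = sorted((key(i) for i in items), key=cmp_to_key(compare))
    result = []
    for k in dedupe(sorted_keys):
        members = [i for i in items if key(i) == k]  # items whose key equals k
        if distinct_ranks:
            members = list(dict.fromkeys(members))   # stand-in for distinct-values
        result.append(members)
    return result

employees = {
    "John Smith": {"dept": "Sales", "salary": 50000},
    "Erin Carter": {"dept": "Computing", "salary": 120000},
    "Ryan Gosling": {"dept": "Sales", "salary": 100000},
    "Ann Gould": {"dept": "Computing", "salary": 150000},
    "Pete Lagard": {"dept": "Sales", "salary": 50000},
    "Jim Carter": {"dept": "Sales", "salary": 80000},
    "Greg Wilson": {"dept": "Computing", "salary": 120000},
}

by_dept = ranks(list(employees), lambda n: (employees[n]["dept"],))
by_dept_and_salary = ranks(
    list(employees), lambda n: (employees[n]["dept"], employees[n]["salary"])
)
```

The order of names inside one rank follows the input order, so it may differ
from an XQuery processor, where the order of map:keys is
implementation-dependent.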


And here are two calls to *fn:ranks*. In the first call the key() function
returns only the department of an employee; in the second call the key()
function is defined to return an array of both the department and the
salary of the employee:


        return
          let $employees := map{
            "John Smith": map{ "dept": "Sales", "salary": 50000},
            "Erin Carter": map{ "dept": "Computing", "salary": 120000},
            "Ryan Gosling": map{ "dept": "Sales", "salary": 100000},
            "Ann Gould": map{ "dept": "Computing", "salary": 150000},
            "Pete Lagard": map{ "dept": "Sales", "salary": 50000},
            "Jim Carter": map{ "dept": "Sales", "salary": 80000},
            "Greg Wilson": map{ "dept": "Computing", "salary": 120000}
          }
            return
            (
              $ranks(map:keys($employees),
                     fn($emp){[$employees($emp)("dept")]},
                     true(), (), ()),

              "===============================================================================",

              $ranks(map:keys($employees),
                     fn($emp){[$employees($emp)("dept"),
                               $employees($emp)("salary")]},
                     true(), (), ())
            )

And the results:


["Ann Gould","Greg Wilson","Erin Carter"]
["John Smith","Ryan Gosling","Pete Lagard","Jim Carter"]
"==============================================================================="
["Greg Wilson","Erin Carter"]
["Ann Gould"]
["John Smith","Pete Lagard"]
["Jim Carter"]
["Ryan Gosling"]
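
In both calls $distinct-ranks is true(), but because map:keys never returns
duplicate keys the flag makes no visible difference in these results. Here
is a small sketch of what it would change for an input that does contain
repeated items. It is again in Python as a stand-in, and it uses a plain
sort followed by grouping rather than the comparator approach; the name
ranks_sketch and the sample words are illustrative only.

```python
from itertools import groupby

def ranks_sketch(items, key, distinct_ranks=True):
    """Sort by key, then emit one list per run of equal keys."""
    ordered = sorted(items, key=key)  # stable: ties keep their input order
    out = []
    for _, run in groupby(ordered, key=key):
        members = list(run)
        if distinct_ranks:
            members = list(dict.fromkeys(members))  # keep first occurrence only
        out.append(members)
    return out

words = ["pear", "fig", "pear", "kiwi", "fig"]

with_distinct = ranks_sketch(words, key=len, distinct_ranks=True)
# [['fig'], ['pear', 'kiwi']]

without_distinct = ranks_sketch(words, key=len, distinct_ranks=False)
# [['fig', 'fig'], ['pear', 'pear', 'kiwi']]
```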




Thanks,
Dimitre

On Tue, Mar 12, 2024 at 12:05 PM Dimitre Novatchev <dnovatchev@gmail.com>
wrote:

> >  DN agrees to demonstrate a single function that can take the place of
> >   several.
>
> Yes, any sequence of functions can be replaced by a single function.
>
> Here is one such example:
>
> We are given a company's employees; each employee has a name, a
> department and a salary.
>
> We will rank the employees first by department alone, then by both
> department and salary. The latter is done with a single function, as shown
> in the second call to *fn:ranks* below:
>
>   let $employees := map{
>     "John Smith": map{ "dept": "Sales", "salary": 50000},
>     "Erin Carter": map{ "dept": "Computing", "salary": 120000},
>     "Ryan Gosling": map{ "dept": "Sales", "salary": 100000},
>     "Ann Gould": map{ "dept": "Computing", "salary": 150000},
>     "Pete Lagard": map{ "dept": "Sales", "salary": 50000},
>     "Jim Carter": map{ "dept": "Sales", "salary": 80000},
>     "Greg Wilson": map{ "dept": "Computing", "salary": 120000}
>   }
>     return
>     (
>       ranks(map:keys($employees), fn($emp){$employees($emp)("dept")}),
>
> "===============================================================================",
>       ranks(map:keys($employees),
>              fn($emp){$employees($emp)("dept")
>                     || (let $sal := $employees($emp)("salary"),
>                            $salDigits := string-length(string($sal))
>                          return substring('0000000', $salDigits + 1)
>                                 || string($sal)  )})
>     )
>
> Produces:
>
> ["Ann Gould","Greg Wilson","Erin Carter"]
> ["John Smith","Ryan Gosling","Pete Lagard","Jim Carter"]
>
> "==============================================================================="
> ["Greg Wilson","Erin Carter"]
> ["Ann Gould"]
> ["John Smith","Pete Lagard"]
> ["Jim Carter"]
> ["Ryan Gosling"]
>
>
> On Tue, Mar 12, 2024 at 10:19 AM Norm Tovey-Walsh <norm@saxonica.com>
> wrote:
>
>> Hello folks,
>>
>> Here are the draft minutes from today’s meeting:
>>
>>    https://qt4cg.org/meeting/minutes/2024/03-12.html
>>
>> QT4 CG Meeting 069 Minutes 2024-03-12
>>
>> Table of Contents
>>
>>      * [1]Draft Minutes
>>      * [2]Summary of new and continuing actions [0/6]
>>      * [3]1. Administrivia
>>           + [4]1.1. Roll call [11/13]
>>           + [5]1.2. Accept the agenda
>>                o [6]1.2.1. Status so far...
>>           + [7]1.3. Approve minutes of the previous meeting
>>           + [8]1.4. Next meeting
>>           + [9]1.5. Review of open action items [8/12]
>>           + [10]1.6. Review of open pull requests and issues
>>                o [11]1.6.1. Blocked
>>                o [12]1.6.2. Merge without discussion
>>                o [13]1.6.3. Close without action
>>                o [14]1.6.4. Substantive PRs
>>      * [15]2. Technical Agenda
>>           + [16]2.1. Brief demo
>>           + [17]2.2. Diversion between the spec and test suite
>>           + [18]2.3. PR #1062/#1027: fn:ranks
>>           + [19]2.4. PR #1066: 1052 Simplify the results of parse-csv
>>           + [20]2.5. PR #1059: 1019 XQFO: Unknown option parameters
>>      * [21]3. Any other business
>>      * [22]4. Adjourned
>>
>>    [23]Meeting index / [24]QT4CG.org / [25]Dashboard / [26]GH Issues /
>>    [27]GH Pull Requests
>>
>> Draft Minutes
>>
>> Summary of new and continuing actions [0/6]
>>
>>      * [ ] QT4CG-052-02: NW to consider how to schedule an "editor's
>>        meeting"
>>      * [ ] QT4CG-063-04: NW to try to add test review to the editorial
>>        meeting.
>>      * [ ] QT4CG-063-06: MK to consider refactoring the declare item type
>>        syntax to something like declare record
>>      * [ ] QT4CG-064-08: NW to open an issue to try to resolve $search to
>>        $target consistently.
>>      * [ ] QT4CG-069-01: MK to list the remaining issues that need
>>        discussion.
>>      * [ ] QT4CG-069-02: NW to coordinate with MK to use the introspection
>>        features on the test suite.
>>
>> 1. Administrivia
>>
>> 1.1. Roll call [11/13]
>>
>>    Regrets SF.
>>      * [X] Reece Dunn (RD)
>>      * [ ] Sasha Firsov (SF) [-:30]
>>      * [X] Christian Grün (CG)
>>      * [X] Joel Kalvesmaki (JK)
>>      * [X] Michael Kay (MK)
>>      * [X] Juri Leino (JLO)
>>      * [X] John Lumley (JLY)
>>      * [X] Dimitre Novatchev (DN)
>>      * [X] Wendell Piez (WP)
>>      * [X] Ed Porter (EP)
>>      * [ ] Adam Retter (AR)
>>      * [X] C. M. Sperberg-McQueen (MSM)
>>      * [X] Norm Tovey-Walsh (NW). Scribe. Chair.
>>
>> 1.2. Accept the agenda
>>
>>    Proposal: Accept [28]the agenda.
>>      * CG: I'd like to talk about divergence between the spec and the test
>>        suite.
>>
>>    Accepted, with that amendment.
>>
>> 1.2.1. Status so far...
>>
>>    issues-open-2024-03-12.png
>>
>>    Figure 1: "Burn down" chart on open issues
>>
>>    issues-by-spec-2024-03-12.png
>>
>>    Figure 2: Open issues by specification
>>
>>    issues-by-type-2024-03-12.png
>>
>>    Figure 3: Open issues by type
>>
>> 1.3. Approve minutes of the previous meeting
>>
>>    Proposal: Accept [29]the minutes of the previous meeting.
>>
>>    Accepted.
>>
>> 1.4. Next meeting
>>
>>    The next meeting [30]is scheduled for Tuesday, 19 March 2024.
>>
>>    Any regrets for the next meeting?
>>
>>    None heard.
>>
>> 1.5. Review of open action items [8/12]
>>
>>      * [ ] QT4CG-052-02: NW to consider how to schedule an "editor's
>>        meeting"
>>      * [ ] QT4CG-063-04: NW to try to add test review to the editorial
>>        meeting.
>>      * [ ] QT4CG-063-06: MK to consider refactoring the declare item type
>>        syntax to something like declare record
>>      * [ ] QT4CG-064-08: NW to open an issue to try to resolve $search to
>>        $target consistently.
>>
>> 1.6. Review of open pull requests and issues
>>
>> 1.6.1. Blocked
>>
>>    The following PRs are open but have merge conflicts or comments which
>>    suggest they aren't ready for action.
>>      * PR [31]#956: 850-partial Editorial improvements to parse-html()
>>      * PR [32]#529: 528 fn:elements-to-maps
>>
>> 1.6.2. Merge without discussion
>>
>>    The following PRs are editorial, small, or otherwise appeared to be
>>    uncontroversial when the agenda was prepared. The chairs propose that
>>    these can be merged without discussion. If you think discussion is
>>    necessary, please say so.
>>      * PR [33]#1058: 1037 fn:json-to-xml: 'number-parser' option
>>
>>    Proposal: accept without discussion.
>>
>>    Accepted.
>>
>> 1.6.3. Close without action
>>
>>    It has been proposed that the following issues be closed without
>>    action. If you think discussion is necessary, please say so.
>>      * Issue [34]#961: Simulating Objects: Performance
>>      * Issue [35]#960: Should ??KS flatten the results
>>      * Issue [36]#829: fn:boolean: EBV support for more item types
>>      * Issue [37]#825: array:members-at
>>      * Issue [38]#757: Function families
>>      * Issue [39]#314: Basic Operations on Maps and Arrays
>>      * Issue [40]#295: Extend support for self-reference in record types
>>      * Issue [41]#274: What would it take/would it be possible to build a
>>        module repository for QT?
>>      * Issue [42]#262: Navigation in deep-structured arrays
>>      * Issue [43]#220: Encapsulation
>>
>>    Proposal: close without further action.
>>      * MK: I proposed closing some of these because the discussion hadn't
>>        led to any clear course of action. Some have been overtaken by
>>        events. Some have been implemented.
>>      * NW: I think it makes sense to keep the list tidy; we can open them
>>        again.
>>
>>    Accepted.
>>
>>    Some discussion of the issue of flattening sequences. DN is concerned
>>    that flattening causes data loss and we should do something about that.
>>    The problem will continue to exist even if we close the issue!
>>
>> 1.6.4. Substantive PRs
>>
>>    The following substantive PRs were open when this agenda was prepared.
>>      * PR [44]#1068: 73 fn:graphemes
>>      * PR [45]#1066: 1052 Simplify the results of parse-csv
>>      * PR [46]#1062: 150bis - revised proposal for fn:ranks
>>      * PR [47]#1059: 1019 XQFO: Unknown option parameters
>>      * PR [48]#1027: 150 fn:ranks
>>      * PR [49]#832: 77 Add map:deep-update and array:deep-update
>>
>> 2. Technical Agenda
>>
>> 2.1. Brief demo
>>
>>    SF had to give regrets, we'll postpone this to next week.
>>
>> 2.2. Diversion between the spec and test suite
>>
>>      * CG: We have some features that have been added to the spec but not
>>        agreed.
>>
>>    ACTION QT4CG-069-01: MK to list the remaining issues that need
>>    discussion.
>>      * CG: In the beginning, the test suite was pretty easy to navigate.
>>        But now we have lots of tests for things that aren't in the
>>        specification. I have a growing list of things that I need to add
>>        to the test suite.
>>           + ... Before adding more features, it would be nice to tidy up
>>             the current test suite.
>>      * MK: There's a mechanism, the "covers 4.0 attribute" that we haven't
>>        been using as diligently as we might.
>>           + ... In theory the test suite has a list of features and tests
>>             can be tagged against those features.
>>           + ... Ideally, those tags should be PR numbers and we should
>>             change the tagging of tests to identify the PR number that
>>             they're associated with.
>>
>>    We can use PR tags to identify missing tests, accepted tests, etc.
>>      * MK: Incorrect tests we should manage with issues.
>>      * JLY: The one I encountered this morning is that there are tests for
>>        things about map keys that aren't in the spec.
>>      * NW: How do we make progress?
>>      * MK: There are introspective tests that test the test suite against
>>        the changes. We can try modifying the list of changes to match the
>>        PR numbers.
>>
>>    ACTION QT4CG-069-02: NW to coordinate with MK to use the introspection
>>    features on the test suite.
>>      * CG: For features that will probably be added, we should use PRs.
>>
>> 2.3. PR #1062/#1027: fn:ranks
>>
>>    See PR [50]#1062: 150bis - revised proposal for fn:ranks and PR
>>    [51]#1027: 150 fn:ranks
>>      * MK: My PR was an attempt to implement the things that I understood
>>        or that seemed uncontroversial.
>>           + ... I was saying "this is what I think the function should
>>             do."
>>
>>    Some discussion of how to proceed. DN proposes we review MK's draft.
>>      * MK reviews his draft (PR #1062).
>>           + ... I understood this to be essentially a group sort.
>>           + ... It's a sort followed by a partitioning, or vice-versa
>>           + ... The signature takes identical parameters to fn:sort but
>>             instead of delivering a list of items, it returns a list of
>>             arrays of items.
>>           + ... It doesn't allow you to do the partitioning independently
>>             from what the sort is doing, as the other proposal does.
>>      * RD: With DN's proposal, what additional flexibility would we get?
>>
>>    DN comments on MK's proposal.
>>      * DN: I think op:same-sort-keys() is a nice addition, but I don't
>>        think it's defined anywhere.
>>           + ... The order of arguments is problematic because it requires
>>             an empty () collation to be provided.
>>           + ... In the fifth example, we use the argument name keys but
>>             the argument is a single function. That's very confusing. What
>>             we need is a ranking function. The name key is unsatisfying.
>>      * DN: I'm also concerned about the fact that in MK's proposal the
>>        function argument isn't a single function, it's a sequence of
>>        functions!
>>
>>    DN switches to present his proposal, PR #1027.
>>      * DN: My function has arguments that are easier to use.
>>           + ... This function was borrowed from SQL and they don't care
>>             about the fact that items can occur more than once because
>>             they deal with sets. But we don't.
>>           + ... This is why the $distinct-ranks parameter is needed and
>>             defaults to true().
>>           + ... The collation only has to be used when it's required.
>>      * DN highlights the difference that $distinct-ranks makes.
>>      * DN: MK wants to use the same function arguments as fn:sort but I
>>        think that's unnecessary.
>>      * NW: How does the sequence of functions come into play?
>>
>>    DN makes a passionate argument for simplicity on behalf of the users.
>>      * RD: I think the sequence of functions is to support sorting by
>>        author then title, this is the reason fn:sort has multiple
>>        functions.
>>           + ... In fn:ranks if you wanted to sort by string-length and
>>             whether the length is odd or even, you'd need two functions.
>>             That's why you have multiple functions.
>>
>>    Some discussion of whether you can write a single function to do that.
>>      * RD: The function you pass isn't just a comparison function, it's
>>        used to select the keys.
>>
>>    Further discussion of whether or not it's even possible to write a
>>    single function for this purpose.
>>      * CG: Can you give an example, please, it's not clear.
>>      * JLO: Comparing both proposals, I see that one thing that bugged me
>>        was having to provide the empty sequence as the second argument to
>>        support.
>>           + ... If it's so problematic, creating a wrapper function isn't
>>             too problematic.
>>           + ... I do like functions in our specification to behave the
>>             same way.
>>           + ... If fn:sort and fn:ranks both need the collation, I would
>>             like it to be in the same place.
>>      * JLO: In DN's proposal, why are there two collations?
>>      * DN: The $collation-input is needed if the inputs are strings and
>>        $distinct-ranks is true. The collation is needed to make the input
>>        strings distinct.
>>
>>    Some discussion of the difference between fn:sort and fn:sort-with.
>>      * JLO: Can we get rid of all the collations that way?
>>      * CG: Did you consider comparator functions?
>>      * DN: I think we need them to make the strings unique.
>>      * CG: But not if you use comparator functions.
>>      * RD: Isn't one of the disadvantages of a comparator function that
>>        you can't hash the returned keys to avoid computing them every
>>        time? Hashing makes it easier to build the ranked data
>>        structure.
>>      * CG: You can cache those in the comparator case; the optimizations
>>        are different but it can be done.
>>
>>    DN agrees to demonstrate a single function that can take the place of
>>    several.
>>
>> 2.4. PR #1066: 1052 Simplify the results of parse-csv
>>
>>    See PR [52]#1066
>>      * MK: I don't think we can review the proposal this week.
>>      * NW: I'll make sure this is on the top of the agenda next week.
>>
>> 2.5. PR #1059: 1019 XQFO: Unknown option parameters
>>
>>    See PR [53]#1059
>>      * CG reviews his PR.
>>           + ... The fact that unknown options are ignored means that typos
>>             aren't detected.
>>           + ... One question is what we do about vendor extension options.
>>           + ... I think it would be best to reject any option that isn't
>>             known to the implementation.
>>           + ... Do we say you MUST raise an error or SHOULD raise an
>>             error.
>>      * MK: I think there are two issues: backwards compatibility. We'll
>>        find stylesheets that use misspelled names that didn't previously
>>        give an error. And vendor extensions: we may find users have
>>        deliberately used option names that they know are known to only one
>>        processor.
>>      * JLY: This is a case where it may be permitted to raise an error,
>>        but it should be user-configurable. There may be legitimate reasons
>>        to want to use options that aren't recognized.
>>      * MSM: What JL said.
>>      * DN: I think CG is right, it is always better to be notified about
>>        errors. What JL said also applies. But errors should be raised by
>>        default.
>>      * WP: I can see the value; apart from the question of "lint"
>>        checking, it would be nice if a common option could be provided,
>>        that could be useful.
>>      * JLO: If I understand WP right, it would be a "can be raised" but you'd define
>>        the error.
>>      * WP: There are operational advantages, but out on the edges, there
>>        may be cases where you want the current behavior.
>>      * RD: One of the challenges is that if you want to take advantage of
>>        vendor extensions, there's currently no mechanism to detect whether
>>        the version you're on supports a specific property.
>>           + ... I wonder if we could take advantage of records and have a
>>             "does this record support this property" check. Then you could
>>             check on the options for the function. That would provide a
>>             mechanism for validating incorrect parameters.
>>      * WP: So in the code, you could explicitly validate?
>>      * RD: Yes. You could say "if format in record type, then create a
>>        record with a format key." You could build the map up like that.
>>        That would let things be more extensible. You wouldn't have to say
>>        "is there a vendor function and is the vendor version greater than
>>        some value", etc.
>>
>>    No obvious consensus has formed, we'll come back to this next week.
>>
>> 3. Any other business
>>
>>    None heard.
>>
>> 4. Adjourned
>>
>> References
>>
>>    1. https://qt4cg.org/meeting/minutes/2024/03-12.html#minutes
>>    2. https://qt4cg.org/meeting/minutes/2024/03-12.html#new-actions
>>    3. https://qt4cg.org/meeting/minutes/2024/03-12.html#administrivia
>>    4. https://qt4cg.org/meeting/minutes/2024/03-12.html#roll-call
>>    5. https://qt4cg.org/meeting/minutes/2024/03-12.html#agenda
>>    6. https://qt4cg.org/meeting/minutes/2024/03-12.html#so-far
>>    7. https://qt4cg.org/meeting/minutes/2024/03-12.html#approve-minutes
>>    8. https://qt4cg.org/meeting/minutes/2024/03-12.html#next-meeting
>>    9. https://qt4cg.org/meeting/minutes/2024/03-12.html#open-actions
>>   10. https://qt4cg.org/meeting/minutes/2024/03-12.html#open-pull-requests
>>   11. https://qt4cg.org/meeting/minutes/2024/03-12.html#blocked
>>   12. https://qt4cg.org/meeting/minutes/2024/03-12.html#merge-without-discussion
>>   13. https://qt4cg.org/meeting/minutes/2024/03-12.html#close-without-action
>>   14. https://qt4cg.org/meeting/minutes/2024/03-12.html#substantive
>>   15. https://qt4cg.org/meeting/minutes/2024/03-12.html#technical-agenda
>>   16. https://qt4cg.org/meeting/minutes/2024/03-12.html#demo
>>   17. https://qt4cg.org/meeting/minutes/2024/03-12.html#test-suite
>>   18. https://qt4cg.org/meeting/minutes/2024/03-12.html#pr-1062
>>   19. https://qt4cg.org/meeting/minutes/2024/03-12.html#pr-1066
>>   20. https://qt4cg.org/meeting/minutes/2024/03-12.html#pr-1059
>>   21. https://qt4cg.org/meeting/minutes/2024/03-12.html#any-other-business
>>   22. https://qt4cg.org/meeting/minutes/2024/03-12.html#adjourned
>>   23. https://qt4cg.org/meeting/minutes/
>>   24. https://qt4cg.org/
>>   25. https://qt4cg.org/dashboard
>>   26. https://github.com/qt4cg/qtspecs/issues
>>   27. https://github.com/qt4cg/qtspecs/pulls
>>   28. https://qt4cg.org/meeting/agenda/2024/03-12.html
>>   29. https://qt4cg.org/meeting/minutes/2024/03-05.html
>>   30. https://qt4cg.org/meeting/agenda/2024/03-19.html
>>   31. https://qt4cg.org/dashboard/#pr-956
>>   32. https://qt4cg.org/dashboard/#pr-529
>>   33. https://qt4cg.org/dashboard/#pr-1058
>>   34. https://github.com/qt4cg/qtspecs/issues/961
>>   35. https://github.com/qt4cg/qtspecs/issues/960
>>   36. https://github.com/qt4cg/qtspecs/issues/829
>>   37. https://github.com/qt4cg/qtspecs/issues/825
>>   38. https://github.com/qt4cg/qtspecs/issues/757
>>   39. https://github.com/qt4cg/qtspecs/issues/314
>>   40. https://github.com/qt4cg/qtspecs/issues/295
>>   41. https://github.com/qt4cg/qtspecs/issues/274
>>   42. https://github.com/qt4cg/qtspecs/issues/262
>>   43. https://github.com/qt4cg/qtspecs/issues/220
>>   44. https://qt4cg.org/dashboard/#pr-1068
>>   45. https://qt4cg.org/dashboard/#pr-1066
>>   46. https://qt4cg.org/dashboard/#pr-1062
>>   47. https://qt4cg.org/dashboard/#pr-1059
>>   48. https://qt4cg.org/dashboard/#pr-1027
>>   49. https://qt4cg.org/dashboard/#pr-832
>>   50. https://qt4cg.org/dashboard/#pr-1062
>>   51. https://qt4cg.org/dashboard/#pr-1027
>>   52. https://qt4cg.org/dashboard/#pr-1066
>>   53. https://qt4cg.org/dashboard/#pr-1059
>>
>>                                         Be seeing you,
>>                                           norm
>>
>> --
>> Norm Tovey-Walsh
>> Saxonica
>>
>
>
>

Received on Wednesday, 13 March 2024 20:18:50 UTC