QT4 CG Draft Minutes 032, 25 April 2023 from Norm Tovey-Walsh on 2023-04-25 (public-xslt-40@w3.org from April 2023)

From: Norm Tovey-Walsh <norm@saxonica.com>
Date: Tue, 25 Apr 2023 17:22:56 +0100
To: public-xslt-40@w3.org
Message-ID: <m2pm7s7wth.fsf@saxonica.com>
Hello,

Here are the draft minutes from today’s call.

   https://qt4cg.org/meeting/minutes/2023/04-25.html

QT4 CG Meeting 032 Minutes 2023-04-25

Table of Contents

     * [1]Draft Minutes
     * [2]Summary of new and continuing actions [0/12]
     * [3]1. Administrivia
          + [4]1.1. Roll call [10/13]
          + [5]1.2. Accept the agenda
               o [6]1.2.1. Status so far...
          + [7]1.3. Approve minutes of the previous meeting
          + [8]1.4. Next meeting
          + [9]1.5. Review of open action items [2/8]
     * [10]2. Technical Agenda
          + [11]2.1. PR #433: Allow hex and binary literals and allow
            underscores
          + [12]2.2. PR #434: Functions to parse and format hex integers
          + [13]2.3. Issue #359: fn:void: Absorb result of evaluated
            argument
     * [14]3. Adjourned

Draft Minutes

Summary of new and continuing actions [0/12]

     * [ ] QT4CG-002-10: BTW to coordinate some ideas about improving
       diversity in the group
     * [ ] QT4CG-016-08: RD to clarify how namespace comparisons are
       performed.
     * [ ] QT4CG-026-01: MK to write a summary paper that outlines the
       decisions we need to make on "value sequences"
          + This is related to PR #368: Issue 129 - Context item
            generalized to context value and subsequent discussion.
     * [ ] QT4CG-029-01: RD+DN to draft spec prose for the "divide and
       conquer" approach outlined in issue #399
     * [ ] QT4CG-029-07: NW to open the next discussion of #397 with a
       demo from DN See PR [15]#449
     * [ ] QT4CG-031-03: CG to draft a PR to address issue #410
     * [ ] QT4CG-032-01: NW to make sure open PRs are on the agendas in
       future
     * [ ] QT4CG-032-02: MK to adjust the grammar in #433 per CG's
       suggestion.
     * [ ] QT4CG-032-03: MK to change 32 to 36 in 4.5.2 fn:parse-integer
     * [ ] QT4CG-032-04: JK to suggest an example for base 26.
     * [ ] QT4CG-032-05: MK to check that the terminology in format number
       isn't too biased towards decimal
     * [ ] QT4CG-032-06: MK to add a compatibility note about the use of
       ^ in format number.

1. Administrivia

1.1. Roll call [10/13]

   Regrets BTW.
     * [ ] Anthony (Tony) Bufort (AB)
     * [X] Reece Dunn (RD)
     * [X] Sasha Firsov (SF)
     * [X] Christian Grün (CG)
     * [X] Joel Kalvesmaki (JK) [0:06-]
     * [X] Michael Kay (MK)
     * [X] John Lumley (JL)
     * [X] Dimitre Novatchev (DN)
     * [X] Ed Porter (EP)
     * [X] C. M. Sperberg-McQueen (MSM)
     * [ ] Bethan Tovey-Walsh (BTW)
     * [X] Norm Tovey-Walsh (NW). Scribe. Chair.

1.2. Accept the agenda

   Proposal: Accept [16]the agenda.

   Accepted.

1.2.1. Status so far...

   issues-open-2023-04-25.png

   Figure 1: "Burn down" chart on open issues

   issues-by-spec-2023-04-25.png

   Figure 2: Open issues by specification

   issues-by-type-2023-04-25.png

   Figure 3: Open issues by type

1.3. Approve minutes of the previous meeting

   Proposal: Accept [17]the minutes of the previous meeting.

   Accepted.

1.4. Next meeting

   The next meeting [18]is scheduled for Tuesday, 2 May 2023.

   No regrets heard.

   ACTION QT4CG-032-01: NW to make sure open PRs are on the agendas in
   future

1.5. Review of open action items [2/8]

     * [ ] QT4CG-002-10: BTW to coordinate some ideas about improving
       diversity in the group
     * [ ] QT4CG-016-08: RD to clarify how namespace comparisons are
       performed.
     * [ ] QT4CG-026-01: MK to write a summary paper that outlines the
       decisions we need to make on "value sequences"
          + This is related to PR #368: Issue 129 - Context item
            generalized to context value and subsequent discussion.
     * [ ] QT4CG-029-01: RD+DN to draft spec prose for the "divide and
       conquer" approach outlined in issue #399
     * [ ] QT4CG-029-07: NW to open the next discussion of #397 with a
       demo from DN
     * [X] QT4CG-031-01: MK to update map:of to have more complete
       examples See PR [19]#449
     * [X] QT4CG-031-02: MK to make the map options into definitions. See
       PR [20]#449
     * [ ] QT4CG-031-03: CG to draft a PR to address issue #410

2. Technical Agenda

   Once again, this week's agenda mostly continues where we left off last
   week. I've moved a couple of hopefully easy PRs to the top of the list.

2.1. PR #433: Allow hex and binary literals and allow underscores

   See PR [21]#433

   MK walks us through the issue.
     * MK: The change is quite modest, but there are a few small
       questions.
          + ... The change is essentially in literals; we introduce two
            new forms of numeric literals.
          + ... Change for digits production is to allow underscores.
          + ... What rules should we apply? Underscores at the beginning
            or end or adjacent?
     * CG: I gave some examples for Java and JavaScript
          + ... (Looking at a comment on the PR)
     * NW: I think we should avoid them at the beginning.
     * MK: I like CG's version of the grammar.
     * MSM: The only case I can imagine for wanting an underscore at the
       end is if I'm aligning several lines of numeric constants.
     * DN: What's the purpose of using underscore?
     * MK: When you have long numbers, like one trillion, it helps you
       count the number of digits.
     * MSM: Because you can't use "," or "." without confusing half of the
       world.
     * RD: Different languages can use different symbols; C++ uses '.
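
   The underscore question above can be made concrete with a small
   sketch. The rule used here (underscores only between two digits, so
   never leading, trailing, or adjacent) is an assumption for
   illustration; the minutes do not record the grammar CG proposed, and
   the names DIGITS, is_valid_digits, and digits_value are hypothetical.

```python
import re

# Hypothetical rule, not taken from PR #433: underscores may appear only
# between two digits (never leading, trailing, or doubled).
DIGITS = re.compile(r"[0-9]+(?:_[0-9]+)*")


def is_valid_digits(text: str) -> bool:
    """Return True if text matches the assumed digits rule."""
    return DIGITS.fullmatch(text) is not None


def digits_value(text: str) -> int:
    """Drop the separators and read the remaining decimal digits."""
    if not is_valid_digits(text):
        raise ValueError(f"invalid numeric literal: {text!r}")
    return int(text.replace("_", ""))


if __name__ == "__main__":
    # One trillion is easier to read with separators.
    for candidate in ["1_000_000_000_000", "_1000", "1000_", "1__000"]:
        print(candidate, is_valid_digits(candidate))
```

   Under this assumed rule only the first candidate is accepted. MSM's
   alignment use case would need the trailing form to be allowed.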

   ACTION QT4CG-032-02: MK to adjust the grammar in #433 per CG's
   suggestion.

   Proposal: Accept this PR

   Accepted.

2.2. PR #434: Functions to parse and format hex integers

   See PR [22]#434

   MK reviews the PR.
     * MK: This primarily changes the F&O spec.
          + ... The first change is for parsing integers in different
            radixes.
          + ... We have fn:parse-integer with a radix that defaults to 10.
          + ... It doesn't express formally what the result is; we just
            assume it's obvious.
     * RD: We're not using this to potentially be able to parse things
       like Roman numerals and so on.
     * MK: No, and it's ASCII digits only.
     * RD: Counter styles has a good discussion of this:
       [23]https://www.w3.org/TR/css-counter-styles-3/
     * MSM: Is dotted uppercase I the equivalent of uppercase I?

   Some discussion of how much precision we need to apply. If it's not
   obvious, then we probably ought to spell it out in horrid detail.
     * DN: I was expecting to see examples with underscores in them. If
       users will type them, it will be convenient to allow them here. I'm
       guessing that there's no limit on the size of the integer or the
       length of the strings.
     * MK: On the first point, I thought the use case is for reading
       documents in a variety of formats. If the format does allow
       underscores, it's easier to strip them out.
          + ... I deliberately chose not to on the grounds that a typical
            input document that you're reading (a color attribute in a CSS
            document) is going to require some preprocessing and it's easy
            enough to strip them out.
          + ... On the second point, there's an error condition if it's
            too big.

   Some discussion of how many digits are required/allowed.

   ACTION QT4CG-032-03: MK to change 32 to 36 in 4.5.2 fn:parse-integer
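
   As a rough illustration of the behaviour discussed so far (a radix
   that defaults to 10, ASCII digits only, an upper bound of 36), here
   is a minimal sketch. It is not the specification text. The function
   name parse_integer, the case-insensitive treatment of letters, and
   the use of ValueError are assumptions of this sketch. The "too big"
   error condition MK mentions is not modelled, because Python integers
   are unbounded.

```python
import string

# ASCII digits followed by ASCII letters: the 36 symbols needed for radix 36.
ALPHABET = string.digits + string.ascii_lowercase


def parse_integer(value: str, radix: int = 10) -> int:
    """Parse ASCII digits in the given radix (2 to 36), defaulting to 10.

    Assumptions of this sketch: letters are case-insensitive, and no sign,
    whitespace, or separator handling is attempted.
    """
    if not 2 <= radix <= 36:
        raise ValueError(f"radix must be between 2 and 36, got {radix}")
    if not value:
        raise ValueError("empty string")
    result = 0
    for char in value:
        # Only ASCII characters count as digits; anything else is rejected
        # before case folding so look-alike characters cannot slip through.
        digit = ALPHABET.find(char.lower()) if char.isascii() else -1
        if digit < 0 or digit >= radix:
            raise ValueError(f"{char!r} is not a digit in radix {radix}")
        result = result * radix + digit
    return result
```

   In this sketch the dotted uppercase I (U+0130) that MSM asks about is
   rejected, as is any underscore, because neither is an ASCII digit or
   letter.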
     * DN: Maybe it could be an alternative to have a second argument to
       allow underscores. Otherwise it's a little bit confusing.
     * MK: I do think the use cases are very different.
     * NW: I don't see how throwing away underscores would be more
       difficult.
     * MK: I guess we could.
     * JK: I think we should support underscores here. One thing I'm going
       to be doing is parsing XSLT. The one thing I think is missing is
       base64binary, but obviously you can just cast that and manipulate
       it. It would also be helpful to have examples that demonstrate the
       base 26 case which usually only uses letters.
     * MK: Where is it used?
     * JK: In our workshop, we use aaa for 0, etc. And we have files with
       these extensions.
     * RD: Excel indexes are also done this way.

   ACTION QT4CG-032-04: JK to suggest an example for base 26.
     * JL: I tend to agree with MK that keeping this just acting on the
       digits is the best thing. There are all sorts of separators that
       could be used. I suggest we add a note that specifically says that
       we anticipate that other separators will be stripped out.
     * NW: That also sounds reasonable to me.
     * SF: We could have an optional parameter that specifies ignored
       characters. You could ignore hash for example or apostrophes, etc.
          + ... It would also be nice to be able to specify the characters
            in the alphabet. That would make for a wider use case.
     * MK: My feeling is that the design principle is that functions
       should do one thing and do it well. You can handle both of those
       use cases by combining this function with translate. Rather than
       having one function do too much.
     * RD: I was going to say what SF said.
     * SF: An example that uses translate would be good.
     * MK: Yes.
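
   The combination MK describes (strip or remap characters first, then
   parse) might look like the following sketch. Python's str.translate
   stands in for fn:translate and the built-in int() stands in for the
   parsing function. The separator set, the helper names, and the
   reading of JK's letters-only base 26 as a=0 through z=25 are all
   assumptions made for illustration.

```python
# Separators to discard before parsing; this set is an arbitrary example.
STRIP_SEPARATORS = str.maketrans("", "", "_',# ")

# Map a letters-only base 26 alphabet (a=0 ... z=25) onto the conventional
# digits 0-9 followed by a-p, which is what int(text, 26) expects.
LETTERS = "abcdefghijklmnopqrstuvwxyz"
CONVENTIONAL = "0123456789abcdefghijklmnop"
LETTERS_TO_DIGITS = str.maketrans(LETTERS, CONVENTIONAL)


def parse_with_separators(text: str, radix: int = 10) -> int:
    """Remove the assumed separator characters, then parse what is left."""
    return int(text.translate(STRIP_SEPARATORS), radix)


def parse_base26_letters(text: str) -> int:
    """Parse a letters-only base 26 string such as 'aaa' (zero)."""
    if not (text.isascii() and text.isalpha()):
        raise ValueError(f"expected ASCII letters only, got {text!r}")
    # Each letter is remapped independently, so 'a' -> '0' never cascades.
    return int(text.lower().translate(LETTERS_TO_DIGITS), 26)
```

   With these definitions, parse_with_separators("#ff_ff", 16) gives
   65535 and parse_base26_letters("aba") gives 26. Spreadsheet-style
   column names are a different (bijective) numbering and are not
   covered by this sketch.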

   MK moves on to formatting numbers.
     * MK: I've extended the fn:format-integer to accept the radix in the
       format string.
          + ... The picture string begins with (base)^, so "2^" for
            binary, "16^" for hex.
          + ... That's almost backwards compatible except in some very
            rare cases.
          + ... It requires a fair bit of generalization in the text that
            may not be perfect yet.
     * RD: What are the compatibility issues?
     * MK: If you used the "^" as the grouping separator, you're allowed
       to repeat the digits. So "16^16" would mean output the number with
       the grouping character "^" every two characters.
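
   To make the "(base)^" prefix and its compatibility wrinkle concrete,
   here is a minimal sketch. It only splits off the radix prefix and
   renders a number in that radix; it does not implement picture
   strings. The names to_radix and split_picture, the lowercase output,
   and the 2 to 36 range check are assumptions of the sketch, not
   details taken from the PR.

```python
import string
from typing import Tuple

ALPHABET = string.digits + string.ascii_lowercase


def to_radix(number: int, radix: int) -> str:
    """Render an integer using digits 0-9 then a-z in the given radix."""
    if not 2 <= radix <= 36:
        raise ValueError(f"radix must be between 2 and 36, got {radix}")
    if number < 0:
        return "-" + to_radix(-number, radix)
    digits = []
    while True:
        number, remainder = divmod(number, radix)
        digits.append(ALPHABET[remainder])
        if number == 0:
            break
    return "".join(reversed(digits))


def split_picture(picture: str) -> Tuple[int, str]:
    """Split an assumed "(base)^rest" picture into (radix, rest)."""
    prefix, caret, rest = picture.partition("^")
    if caret and prefix.isascii() and prefix.isdigit() and 2 <= int(prefix) <= 36:
        return int(prefix), rest
    # No recognisable radix prefix: treat the whole picture as decimal.
    return 10, picture
```

   Here split_picture("16^16") yields (16, "16"), which is the new
   reading. Under the earlier rules the same string would have been a
   decimal picture using "^" as a grouping separator, which is the
   incompatibility MK describes.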

   Some discussion about whether there's a note about the incompatibility
   or not.

   ACTION QT4CG-032-05: MK to check that the terminology in format number
   isn't too biased towards decimal

   ACTION QT4CG-032-06: MK to add a compatibility note about the use of ^
   in format number.

   Proposal: Accept this PR

   Accepted.

2.3. Issue #359: fn:void: Absorb result of evaluated argument

   See Issue [24]#359
     * CG: If you have functions that have side effects or if you aren't
       interested in the results, then you have a case where you aren't
       interested in the result, only in whether the code is executed.
          + ... People often try to circumvent the problem that functions
            always return something.
          + (There are examples in the issue.)
          + ... One solution is to have a fn:void() function that
            evaluates code and always returns an empty sequence.
          + ... It's not really settled what happens with
            non-deterministic code or side-effects.
          + ... My suggestion is that we enforce the evaluation of the
            argument whenever it's non-deterministic, but an
            implementation might otherwise be free not to evaluate it.
          + ... But I'm not sure.
     * DN: I think that it's a good proposal. I have a number of comments,
       but my main concern is that this function fn:void seems a little
       confusing. What's wanted here is eager evaluation of the argument.
       So we could call it fn:eager. This is a very special function, the
       optimizer must not discard it! Using the determinism of the
       argument may be difficult.
     * CG: I think your use cases are interesting, but I definitely think
       it's a different use case. Here it's really about ignoring the
       result. I think we should talk about eager and lazy functions in a
       different context.
     * DN: Okay, but we need to be clear about evaluation of the
       arguments.
     * RD: Eager evaluation vs lazy evaluation is different than what this
       is proposing. I pointed this out in the examples on the issue. For
       a database query, for example, you always want the result whether
       it was eager or lazy, but this is explicitly about discarding the
       result.
          + ... Consider 1 => fn:void(), 2 => fn:eager(); it should return
            2 not (1,2).
          + ... In a nested loop, lazy evaluation could evaluate each item
            multiple times.
     * JL: I think DN is right that this function has to be treated
       specially by the compilers. Its signature says it produces a
       constant result, so the optimizer could easily just replace all the
       calls.
     * MSM: I'm having trouble getting my head around the proposal and the
       discussion.
          + ... First, I think it's true that lazy vs eager is not the
            same as I care about the result or discard it. The use case
            that makes sense to me is that I have some code that I want to
            evaluate for the side effects.
          + ... The second thing that bothers me is when you're writing a
            script that calls programs, you must always check the return
            code. Because if you don't care about the result, why are you
            calling the program in the first place.
     * CG: In our cases, we have users that use modules that are written
       by other people. Some raise errors, some return errors. Sometimes
       all you care about is whether the code parses or not. Or sometimes
       you might get a jobid and maybe you don't care about it.
     * MSM: Ok. Earlier, you talked about wrapping a function call in void
       in order to comment it out. But when I do that, I expect them not
       to be executed. But that's not quite the same here.
     * CG: Right. That's something we should clarify. I think it
       depends a lot on the implementations. For BaseX, we only evaluate
       non-deterministic code. But we could say always for fn:void.
     * DN: I wanted to say what MSM said. It seems bad to ignore the
       return code.
          + ... This is a special case of a special case. I think that we
            have a general case that is more important: that's eager vs.
            lazy evaluation.
     * RD: I wonder if we've got a slight difference in terminology here.
       If I understand DN and MSM correctly, "eager" means that at the
       point when the function is called the expression gets evaluated to
       some value. Whereas if you don't use that, the processor is able to
       defer the evaluation to some point in the future. What I'm
       referring to is the value is computed but is returning a generator
       function that generates the values on demand. That being "lazy"
       evaluation and "eager" computes them all.
     * DN: When we have "eager" we don't care internally how it works.
     * CG: Do you always get the result without eager?
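
   The distinction between discarding a result and forcing evaluation
   can be sketched outside XPath. In the following illustration, a
   Python generator plays the role of RD's "generator function that
   generates the values on demand", and void is a hypothetical stand-in
   for the proposed fn:void that always evaluates its argument. Whether
   fn:void should always evaluate is exactly what the discussion leaves
   open, so that choice is an assumption here, as are the names
   write_record and lazy_writes.

```python
from typing import Callable, Iterator, List

log: List[str] = []


def write_record(name: str) -> str:
    """Hypothetical side-effecting function: records a write, returns an id."""
    log.append(name)
    return f"job-{len(log)}"


def void(thunk: Callable[[], object]) -> tuple:
    """Force evaluation of thunk, then discard its result."""
    result = thunk()
    # If the result is a lazy iterator, drain it so deferred effects happen too.
    if isinstance(result, Iterator):
        for _ in result:
            pass
    return ()


def lazy_writes(names: List[str]) -> Iterator[str]:
    """Generator: nothing is written until a caller asks for values."""
    for name in names:
        yield write_record(name)
```

   Calling lazy_writes(["a", "b"]) on its own leaves log empty; wrapping
   it as void(lambda: lazy_writes(["a", "b"])) performs both writes and
   returns an empty tuple. RD's example corresponds to
   (*void(lambda: 1), 2), which is (2,) rather than (1, 2).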

   Some discussion of what happens to return values.
     * MK: This is all about forcing evaluation of things that have side
       effects. We have a long history of trying to do that with
       xsl:result-document. It proved fairly troublesome over the years
       because we don't really have order-of-execution semantics to
       underpin it. It's been made workable mainly by constraining the
       places where you can use it.
          + ... Assigning an expression to variable that isn't used isn't
            going to be solved by calling fn:void on the right hand side.
          + ... In XSLT, we only allow it at the top level so it's
            implicitly a sequential evaluation.
          + ... I don't think we can use fn:void() anywhere in an
            expression without having a lot more work on the semantics.
     * CG: I assume that functions like file:write are implemented
       differently in Saxon and BaseX. The challenge is the same but the
       solutions are different.
     * MK: If the user puts them where the optimizer can muck with them,
       then they may not get the results they expect. If you put
       file:write inside a predicate or sort key, it's going to be pretty
       unpredictable what happens.
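
   MK's warning about side effects inside a predicate can be shown with
   a small analogy. The two functions below answer the same question
   ("does any item satisfy the predicate?") and return the same value,
   but they invoke a side-effecting predicate a different number of
   times. This is a Python analogy only, with hypothetical names; it
   makes no claim about how Saxon or BaseX actually evaluate predicates.

```python
from typing import Callable, Iterable, List


def make_writer(log: List[int]) -> Callable[[int], bool]:
    """Return a predicate that records each item it tests (a side effect)."""
    def writes_and_tests(item: int) -> bool:
        log.append(item)
        return item > 1
    return writes_and_tests


def exists_short_circuit(items: Iterable[int],
                         predicate: Callable[[int], bool]) -> bool:
    # Stops at the first match, as an optimizer would be free to do.
    return any(predicate(item) for item in items)


def exists_exhaustive(items: Iterable[int],
                      predicate: Callable[[int], bool]) -> bool:
    # Tests every item before answering.
    return len([item for item in items if predicate(item)]) > 0
```

   For the items [1, 2, 3], the short-circuit version records two writes
   and the exhaustive version records three, even though both return
   the same answer.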

3. Adjourned

   None heard.

References

   1. https://qt4cg.org/meeting/minutes/2023/04-25.html#minutes
   2. https://qt4cg.org/meeting/minutes/2023/04-25.html#new-actions
   3. https://qt4cg.org/meeting/minutes/2023/04-25.html#administrivia
   4. https://qt4cg.org/meeting/minutes/2023/04-25.html#roll-call
   5. https://qt4cg.org/meeting/minutes/2023/04-25.html#agenda
   6. https://qt4cg.org/meeting/minutes/2023/04-25.html#h-C1590AE6-AA6D-49E9-A040-5006E92C0784
   7. https://qt4cg.org/meeting/minutes/2023/04-25.html#approve-minutes
   8. https://qt4cg.org/meeting/minutes/2023/04-25.html#next-meeting
   9. https://qt4cg.org/meeting/minutes/2023/04-25.html#open-actions
  10. https://qt4cg.org/meeting/minutes/2023/04-25.html#technical-agenda
  11. https://qt4cg.org/meeting/minutes/2023/04-25.html#pr-433
  12. https://qt4cg.org/meeting/minutes/2023/04-25.html#pr-434
  13. https://qt4cg.org/meeting/minutes/2023/04-25.html#iss-359
  14. https://qt4cg.org/meeting/minutes/2023/04-25.html#adjourned
  15. https://qt4cg.org/dashboard/#pr-449
  16. https://qt4cg.org/meeting/agenda/2023/04-25.html
  17. https://qt4cg.org/meeting/minutes/2023/04-18.html
  18. https://qt4cg.org/meeting/agenda/2023/05-02.html
  19. https://qt4cg.org/dashboard/#pr-449
  20. https://qt4cg.org/dashboard/#pr-449
  21. https://qt4cg.org/dashboard/#pr-433
  22. https://qt4cg.org/dashboard/#pr-434
  23. https://www.w3.org/TR/css-counter-styles-3/
  24. https://github.com/qt4cg/qtspecs/issues/359

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica
Received on Tuesday, 25 April 2023 16:26:12 UTC