QT4CG meeting 142 draft minutes, 18 November 2025

Hi folks,

Here are the draft minutes from today’s meeting:

   https://qt4cg.org/meeting/minutes/2025/11-18.html

QT4 CG Meeting 142 Minutes 2025-11-18

   [1]Meeting index / [2]QT4CG.org / [3]Dashboard / [4]GH Issues / [5]GH
   Pull Requests

Table of Contents

     * [6]Draft Minutes
     * [7]Summary of new and continuing actions [0/4]
     * [8]1. Administrivia
          + [9]1.1. Roll call [9/10]
          + [10]1.2. Accept the agenda
          + [11]1.3. Approve minutes of the previous meeting
          + [12]1.4. Next meeting
          + [13]1.5. Review of open action items [0/2]
          + [14]1.6. Review of open pull requests and issues
               o [15]1.6.1. Blocked
               o [16]1.6.2. Merge without discussion
               o [17]1.6.3. Close without action
     * [18]2. Technical agenda
          + [19]2.1. PR #2246: 2233 Expand xsl:analyze-string; introduce
            fn:regex-groups()
          + [20]2.2. PR #2295: 2294 Clarify semantics of `element(N,
            xs:anyType)`
          + [21]2.3. PR #2289: 2195 (partial) Editorial notes
            (incremental)
          + [22]2.4. PR #2286: 2279 fn:string-length#1,
            fn:normalize-space#1: accept xs:anyAtomicType
          + [23]2.5. PR #2285: 2198 Add pi-for-cdata parameter
          + [24]2.6. PR #2282: 2278 Add function bin:infer-encoding;
            simplify bin:decode-string
     * [25]3. Any other business

Draft Minutes

Summary of new and continuing actions [0/4]

     * [ ] QT4CG-140-02: MK to add a note about dealing with binary in
       parse-csv and parse-json
     * [ ] QT4CG-141-01: MK to follow up on a comment by JWL on #2269
     * [ ] QT4CG-142-01: MK to review the "Captured Groups within
       Lookahead" example.
     * [ ] QT4CG-142-02: MK to add explanatory note about the difference
       between typed and untyped values in string-length

1. Administrivia

1.1. Roll call [9/10]

   Regrets: EP
     * [X] David J Birnbaum (DB)
     * [X] Reece Dunn (RD)
     * [X] Christian Grün (CG)
     * [X] Joel Kalvesmaki (JK)
     * [X] Michael Kay (MK)
     * [X] Juri Leino (JLO)
     * [X] John Lumley (JWL)
     * [X] Wendell Piez (WP)
     * [ ] Ed Porter (EP)
     * [X] Norm Tovey-Walsh (NW) Scribe. Chair.

1.2. Accept the agenda

   Proposal: Accept [26]the agenda.

   Accepted.

1.3. Approve minutes of the previous meeting

   Proposal: Accept [27]the minutes of the previous meeting.

   Accepted.

1.4. Next meeting

   The next meeting is planned for 25 November 2025.

   Regrets: EP, JLO

1.5. Review of open action items [0/2]

     * [ ] QT4CG-140-02: MK to add a note about dealing with binary in
       parse-csv and parse-json
     * [ ] QT4CG-141-01: MK to follow up on a comment by JWL on #2269

1.6. Review of open pull requests and issues

   This section summarizes all of the issues and pull requests that need
   to be resolved before we can finish. See [28]Technical Agenda below for
   the focus of this meeting.

1.6.1. Blocked

   The following PRs are open but have merge conflicts or comments which
   suggest they aren't ready for action.
     * PR [29]#2256: 2216 All atomic types become ordered
     * PR [30]#2247: Deferred Evaluation in XPath - the f:generator record
     * PR [31]#2160: 2073 data model changes for JNodes and Sequences
     * PR [32]#2124: 573 Functions to Construct Trees
     * PR [33]#2071: 77c deep update
     * PR [34]#2019: 1776: XSLT template rules for maps and array

1.6.2. Merge without discussion

   The following PRs are editorial, small, or otherwise appeared to be
   uncontroversial when the agenda was prepared. The chairs propose that
   these can be merged without discussion. If you think discussion is
   necessary, please say so.
     * PR [35]#2293: Updated RELAX NG grammar for XSLT 4.0 stylesheets
     * PR [36]#2290: Updated schema for XSLT 4.0 stylesheets

   Proposal: accept without discussion.

   Accepted.

1.6.3. Close without action

   It has been proposed that the following issues be closed without
   action. If you think discussion is necessary, please say so.
     * Issue [37]#2252: Dynamic XPath Evaluation the functional way
     * Issue [38]#1618: Adaptive serialization: doubles
     * JWL: Discussion of #2252 just sort of tailed off...
     * MK: Yes, I thought we should drop it not because it isn't doable,
       but to limit our ambitions.
     * JLO: I think it would be good to have such a function because
       almost all XQuery implementations I know already have that
       function.
     * CG: I have concerns that it may be too implementation-specific as
       it revolves around optimizing code. I think there were two
       proposals in this discussion.
          + One was a compiled instance that you could reuse,
          + The other was to compile a string into a function item.
     * MK: Let's leave #2252 open for the moment then.
     * CG: I think there have been other discussions about evaluating
       XPath
          + This is more about optimizing
     * MK: This is primarily about the functionality.

   Proposal: close #1618 with no further action.

   Accepted.

2. Technical agenda

2.1. PR #2246: 2233 Expand xsl:analyze-string; introduce fn:regex-groups()

   See PR [39]#2246
     * MK: Now that regular expressions can match a zero length string,
       you can find all the word boundaries in an input string. The
       fn:regex-group function turns out to be somewhat inadequate; it
       returns the string but not where it found it.
          + ... This proposes a fn:regex-groups function that returns the
            string and the matches.
          + ... There are now "string segments" that are a combination of
            a substring and its position in the input.
          + ... We now operate on a non-overlapping sequence of segments.
          + ... Within a matching substring, you can access the groups.
          + ... This distinguishes between matches on zero length strings
            and failure to match
          + ... The fn:regex-group function is defined in terms of the new
            function, for backwards compatibility.
     * JWL: I'd be interested in a comparison between this and the
       fn:analyze-string function that produces the same sort of output.
          + ... Is there something to be said here about where one might
            be preferable?
     * MK: I think we enhanced fn:analyze-string function to account for
       zero-length string matches.
          + ... But I'm not sure it does it quite as well.
     * JWL: In one case you get a map and in another you get elements.
       Might be worth saying something here.
     * JK: I think this is really nice. The definition says that the
       string segments are non-overlapping. What about look-ahead groups?
     * MK: The groups can overlap, but the matched segments don't.
     * JK: In the example, there are two cyan colored ones. In the first
       one, I don't understand the select.
     * MK: Yes...that has clearly gone wrong somewhere.

   ACTION QT4CG-142-01: MK to review the "Captured Groups within
   Lookahead" example.

   Proposal: Accept this PR.

   Accepted.
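
   Illustration (not part of the PR): the distinction MK draws between a
   match on a zero-length string and a failure to match can be sketched
   with Python's re module. The segments helper and its record layout are
   hypothetical, offsets are 0-based as in Python, and nothing here
   reflects the actual signature of fn:regex-groups.

```python
import re

def segments(pattern, text):
    """Return one record per match: the matched substring, its start
    offset, and one entry per capturing group. A group that did not
    take part in the match is reported as None, while a group that
    matched the empty string is reported as ("", offset)."""
    result = []
    for m in re.finditer(pattern, text):
        groups = []
        for i in range(1, m.re.groups + 1):
            if m.start(i) == -1:          # group did not participate
                groups.append(None)
            else:                          # group matched, possibly ""
                groups.append((m.group(i), m.start(i)))
        result.append({"value": m.group(0), "start": m.start(), "groups": groups})
    return result

# Every word boundary is a zero-length match with a distinct position.
boundaries = [s["start"] for s in segments(r"\b", "hello world")]

# Group 1 matches the empty string; group 2 does not match at all.
first = segments(r"(a*)(b)?", "c")[0]
```

   Returning the string alone would make the two groups in the second
   call indistinguishable; carrying the position keeps them apart.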

2.2. PR #2295: 2294 Clarify semantics of `element(N, xs:anyType)`

   See PR [40]#2295
     * MK: This is purely editorial and it's a bug fix, because anyType is
       a supertype of untyped.
          + And an attempt to clarify a few things.
     * JLO: What is the test of xs:anyType (without the question mark)?
     * MK: That matches anything that hasn't been nilled.
     * JLO: What would xs:untyped? mean?
     * MK: It's allowed by the grammar but it doesn't affect the meaning
       because an untyped thing can never be nilled.

   Proposal: Accept this PR.

   Accepted.

2.3. PR #2289: 2195 (partial) Editorial notes (incremental)

   See PR [41]#2289
     * MK: In Safari the arrows look horrible. The arrows are very
       different.
     * CG: I'll take another look at the arrows.
     * CG: But there are some other changes:
          + There are some changes to the summary of changes text.
     * MK: That's all fine.
     * CG: Some of the examples in the binary module didn't work for me so
       I made a few changes.
          + ... Mostly it's about changing formatting.
     * CG: I think the code snippet for bin:shift is still buggy, but I
       can try to fix it.
     * JWL: If the arrows are sufficiently variable across browsers,
       should we make them images?

2.4. PR #2286: 2279 fn:string-length#1, fn:normalize-space#1: accept
xs:anyAtomicType

   See PR [42]#2286
     * CG: There has been a lot of discussion. Liam noted that
       string-length() and string-length(.) do different things.
          + ... Without an argument, the context item is "stringified".
          + ... But with an argument, it can only take strings.
     * CG: But because of typed nodes, this is not as easy as I thought.
     * CG: We could make the item type xs:anyAtomicType instead of
       xs:string.
          + ... With this change, string length and normalize space would
            be similar to other functions that accept the context item as
            the first item.
     * JLO: I'm in favor. I think there's a slight chance of misalignment
       because . could be an array. Don't they get atomized?
          + ... But xs:anyAtomicType doesn't allow that.
     * CG: Arrays are going to be atomized. If that produces more than one
       string, then you'll get an error. I don't think that changes.
     * JLO: I can pass that in even with xs:anyAtomicType?
     * MK: Yes, I think so.
     * JLO: Why is that in string length?
     * MK: That's for typed nodes. If you pass in a name surrounded by
       spaces, the current spec (since XPath 2.0) says that the string
       length is the length of the string value of the node not the typed
       value. Those can be different.
          + ... We're retaining that compatibility.

   Proposal: Accept this PR.

   Accepted.

   ACTION QT4CG-142-02: MK to add explanatory note about the difference
   between typed and untyped values in string-length
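
   Illustration (not from the PR): MK's point about a name surrounded by
   spaces can be sketched as follows. The collapse_whitespace helper is
   hypothetical; it mimics the whitespace collapsing that XSD applies to
   types such as xs:NCName, and it is only a stand-in for real schema
   validation.

```python
import re

def collapse_whitespace(value: str) -> str:
    """XSD-style collapse: treat tab, LF and CR as spaces, squeeze
    runs of spaces to one, and trim both ends."""
    return re.sub(r"[ \t\n\r]+", " ", value).strip(" ")

string_value = "  my-name  "                      # node content as written
typed_value = collapse_whitespace(string_value)   # value after collapsing

string_value_length = len(string_value)   # what string-length reports today
typed_value_length = len(typed_value)     # what atomizing first would give
```

   The two lengths differ, which is why the PR keeps the existing rule
   that the string value of the node is measured.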

2.5. PR #2285: 2198 Add pi-for-cdata parameter

   See PR [43]#2285
     * MK: This is a PR that just changes serialization, before going
       through all the other specs.
     * MK: This adds a new PI for CDATA section parameter that names a PI.
          + ... The requirement was someone who wanted to generate only
            CDATA sections where they're needed.
          + ... I generalized that requirement to generate an arbitrary
            CDATA section anywhere a PI can occur.
     * NW: But putting data in a PI causes it to be lost by most
       down-stream processing.
          + ... I strongly object.
     * JK: I don't understand the name of the option. It sounds like a
       campaign slogan.
     * MK: It was intended to be an abbreviation for this: the name of a
       processing instruction that you use for inserting CDATA.
     * CG: When I first saw this issue, I had other use cases in mind.
          + ... People want to have all text that uses special characters
            encoded.
          + ... I'd prefer a solution that lets you do that dynamically.
          + ... So maybe we have two use cases here, one where I want text
             explicitly escaped and another where I want that done more
             automatically.
     * WP: This is cool in that it addresses a real requirement. I share
       NW's hesitation. I think there are a couple of other possibilities.
       NW's suggestion of a preceding PI would work. The other thing I
       wonder is, maybe there's a way to flag an element with an attribute
       in a namespace.
     * MK: You could maybe do something like we did for disable output
       escaping.
     * WP: I think there are a couple of options. I think NW's concerns
       are well founded.
     * MK: Let's try that one, putting an attribute on xsl:text that says
       CDATA section.
     * RD: It probably makes sense to have that be an XPath expression so
       you can call a function that makes that decision.
     * MK: Interesting.
     * JWL: RD's suggestion would involve evaluating the function against
       the result.
     * RD: Yes, you'd get the string from the xsl:text or xsl:value-of and
       then pass that to the XPath expression.
     * JWL: The default value would be true() but you could put in a
       function.
     * MK: I think I can run with this.
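
   Illustration (a sketch only, not proposed syntax): RD's idea of
   letting an expression decide per text node, with JWL's default of
   true(), might behave roughly like this. The serialize_text and
   needs_cdata names are hypothetical.

```python
from typing import Callable
from xml.sax.saxutils import escape

def serialize_text(text: str,
                   use_cdata: Callable[[str], bool] = lambda _t: True) -> str:
    """Serialize one text node. The predicate receives the text and
    decides whether to emit a CDATA section or ordinary escaping."""
    if use_cdata(text):
        # "]]>" cannot appear inside CDATA, so split it across two sections.
        safe = text.replace("]]>", "]]]]><![CDATA[>")
        return "<![CDATA[" + safe + "]]>"
    return escape(text)

def needs_cdata(text: str) -> bool:
    """Example predicate: use CDATA only where escaping would be needed."""
    return "<" in text or "&" in text

wrapped = serialize_text("a < b", needs_cdata)
plain = serialize_text("plain", needs_cdata)
```

   The example predicate covers the original requirement of generating
   CDATA sections only where they are needed.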

2.6. PR #2282: 2278 Add function bin:infer-encoding; simplify
bin:decode-string

   See PR [44]#2282
     * MK: I was becoming unhappy with the complexity of bin:decode-string.
       The interaction of skipping byte order marks and specifying offsets
       was getting very confusing.
          + ... I wondered if we could get a cleaner design?
     * MK: What I've proposed is a function bin:infer-encoding that returns
       an encoding and an offset where the real data starts.
          + ... So it might return "UTF-8" for the encoding and offset 3
            if there's a BOM.
          + ... In particular it addresses the case where the binary is
            embedded in another stream.
          + ... It simplifies bin:decode-string by saying it's UTF-8 if
            you don't specify it.
     * CG: I think the changes are technically clean, but I have some
       concerns that where we started from was that we just had simple
       functions. One of my ideas was to try to make it easy for users who
       have maybe never heard of BOMs to be able to process data.
          + ... It's nice, but I would have liked a simpler solution for
            those users.
     * MK: We have fn:json-doc and fn:csv-doc and those do try to work out
       the encoding for you.
          + ... But if you break it into steps, we now provide clean
            primitives.
     * CG: So you can write it to disk and read it back, but it's probably
       easier to have a single function.
          + ... Web applications and things like RESTXQ where you don't
            want to decode the data on the fly.
     * JWL: In bin:decode-string, my concern is that if I want to decode
       the string, if I don't know what it is, isn't there an argument
       saying that the default behavior should be to do an implicit infer?
     * MK: I think you end up with non-intuitive behavior that way.
       Changing the offset from 0 to 1 changes the behavior that has
       nothing to do with the offset.
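
   Illustration (a sketch, not the specified behavior): the two-step
   design MK describes, where one function reports an encoding plus the
   offset at which the real data starts and the decoder stays simple.
   The function names, the set of BOMs checked, and the 0-based offsets
   are assumptions made for this example.

```python
import codecs

# BOMs checked by this sketch; the PR may recognise a different set.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

def infer_encoding(data: bytes, default: str = "UTF-8"):
    """Return (encoding, offset of the first byte after any BOM)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return default, 0

def decode_string(data: bytes, encoding: str = "UTF-8", offset: int = 0) -> str:
    """Decode from offset; no BOM sniffing happens here."""
    return data[offset:].decode(encoding)

payload = codecs.BOM_UTF8 + "h\u00e9llo".encode("utf-8")
encoding, offset = infer_encoding(payload)
text = decode_string(payload, encoding, offset)
```

   Keeping inference out of the decoder means that changing the offset
   never changes how the encoding is chosen.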

   Some discussion of what the defaults should be and how they should
   interact.
     * JWL: If you only pass in a string, the bin:decode-string function
       should basically call the infer function. And if you provide an
       offset, that's an offset in the real data.
          + ... If I just want to decode a string and I don't know
            anything about BOM, do I have to check?
     * MK: We could make it so that if there's only one argument supplied
       it does that logic.
     * JWL: Does the offset I ask for in bin:decode-string include the
       chopped off BOM?
     * JLO: The binary module for me is always just a hex viewer. I want
       to be able to get each byte that I was given.
          + ... I can see the possibility that I don't want an inferred
            offset.
          + ... I just want to see the raw data; or I really want to skip
            them but I don't want to infer anything from them.
          + ... And I can still see CG's argument that we'd like it to be
            simple.
     * CG: In principle, I agree with JWL. I think it would be nice for
       the function to have a default behavior to infer the encoding.
          + ... One other option would be to remove the offset and size.
            The question is when does it make sense to read data when you
            don't know the encoding.
     * MK: Indeed, that problem is at the heart of this. Even if you have
       UTF-8 data, specifying a start and byte position doesn't make much
       sense given that the characters vary in length.
          + ... You might have a message format that says how many octets
            each segment is. But...

   Out of time. We'll return to this next week.
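
   Illustration of MK's closing point, using plain Python rather than the
   binary module: a byte range chosen without regard to character
   boundaries can cut a multi-byte UTF-8 character in half. The
   try_decode helper is hypothetical.

```python
data = "na\u00efve".encode("utf-8")   # the i-diaeresis takes two bytes

def try_decode(chunk: bytes):
    """Return the decoded text, or None if the chunk is not valid UTF-8."""
    try:
        return chunk.decode("utf-8")
    except UnicodeDecodeError:
        return None

aligned = try_decode(data[0:2])      # ends on a character boundary
misaligned = try_decode(data[0:3])   # ends inside the two-byte character
```

   Five characters occupy six bytes here, so a size given in octets only
   works when the caller already knows where the characters fall.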

3. Any other business

   JWL: We're going to try to put the tutorial material on the QT4CG
   website.

   General nods of approval.

References

   1. https://qt4cg.org/meeting/minutes/
   2. https://qt4cg.org/
   3. https://qt4cg.org/dashboard
   4. https://github.com/qt4cg/qtspecs/issues
   5. https://github.com/qt4cg/qtspecs/pulls
   6. https://qt4cg.org/meeting/minutes/2025/11-18.html#minutes
   7. https://qt4cg.org/meeting/minutes/2025/11-18.html#new-actions
   8. https://qt4cg.org/meeting/minutes/2025/11-18.html#administrivia
   9. https://qt4cg.org/meeting/minutes/2025/11-18.html#roll-call
  10. https://qt4cg.org/meeting/minutes/2025/11-18.html#agenda
  11. https://qt4cg.org/meeting/minutes/2025/11-18.html#approve-minutes
  12. https://qt4cg.org/meeting/minutes/2025/11-18.html#next-meeting
  13. https://qt4cg.org/meeting/minutes/2025/11-18.html#open-actions
  14. https://qt4cg.org/meeting/minutes/2025/11-18.html#open-pull-requests
  15. https://qt4cg.org/meeting/minutes/2025/11-18.html#blocked
  16. https://qt4cg.org/meeting/minutes/2025/11-18.html#merge-without-discussion
  17. https://qt4cg.org/meeting/minutes/2025/11-18.html#close-without-action
  18. https://qt4cg.org/meeting/minutes/2025/11-18.html#technical-agenda
  19. https://qt4cg.org/meeting/minutes/2025/11-18.html#pr-2246
  20. https://qt4cg.org/meeting/minutes/2025/11-18.html#pr-2295
  21. https://qt4cg.org/meeting/minutes/2025/11-18.html#pr-2289
  22. https://qt4cg.org/meeting/minutes/2025/11-18.html#pr-2286
  23. https://qt4cg.org/meeting/minutes/2025/11-18.html#pr-2285
  24. https://qt4cg.org/meeting/minutes/2025/11-18.html#pr-2282
  25. https://qt4cg.org/meeting/minutes/2025/11-18.html#any-other-business
  26. https://qt4cg.org/meeting/agenda/2025/11-18.html
  27. https://qt4cg.org/meeting/minutes/2025/11-11.html
  28. https://qt4cg.org/meeting/minutes/2025/11-18.html#technical-agenda
  29. https://qt4cg.org/dashboard/#pr-2256
  30. https://qt4cg.org/dashboard/#pr-2247
  31. https://qt4cg.org/dashboard/#pr-2160
  32. https://qt4cg.org/dashboard/#pr-2124
  33. https://qt4cg.org/dashboard/#pr-2071
  34. https://qt4cg.org/dashboard/#pr-2019
  35. https://qt4cg.org/dashboard/#pr-2293
  36. https://qt4cg.org/dashboard/#pr-2290
  37. https://github.com/qt4cg/qtspecs/issues/2252
  38. https://github.com/qt4cg/qtspecs/issues/1618
  39. https://qt4cg.org/dashboard/#pr-2246
  40. https://qt4cg.org/dashboard/#pr-2295
  41. https://qt4cg.org/dashboard/#pr-2289
  42. https://qt4cg.org/dashboard/#pr-2286
  43. https://qt4cg.org/dashboard/#pr-2285
  44. https://qt4cg.org/dashboard/#pr-2282

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica

Received on Tuesday, 18 November 2025 17:35:07 UTC