- From: Kay, Michael <Michael.Kay@softwareag.com>
- Date: Mon, 22 Sep 2003 12:22:16 +0200
- To: public-qt-comments@w3.org
- Message-ID: <DFF2AC9E3583D511A21F0008C7E62106073DD135@daemsg02.software-ag.de>
In comment SAG-FO-02 [1] Software AG proposed a radical simplification of the operators and functions provided for handling durations, dates, and times. [1] http://lists.w3.org/Archives/Public/public-qt-comments/2003Jun/0087.html The ideas behind the "durations" part of this proposal were discussed at the joint WG meeting on 15 September 2003. Although there were some reservations expressed both on the technical content and on the feasibility of making such a substantial change at this stage in the process, there was sufficient enthusiasm to ask for a detailed change proposal to be prepared for consideration. It was suggested that although the WG might not be able to make a decision on this change proposal in time for the next draft, it might solicit public feedback on the proposal in its own right. The proposal follows. I have used section numbers from an internal editor's draft dated 3 Sep 2003, but I believe there are no material differences from the last published WD in the affected areas. This note is a detailed proposal for implementing the replacement of functions and operators on the two data types xdt:dayTimeDuration and xdt:yearMonthDuration by functions and operators that represent these quantities by numbers. The rationale for this proposal is as follows: * It significantly reduces the number of functions, and the number of special cases that need to be handled by polymorphic functions such as sum() and avg() * It avoids the need to introduce new data types beyond the types already defined in XML Schema * It enables durations to be manipulated using the full power of all the numeric operators and functions * It thus brings durations into line with other units of measure such as distances, weights, temperatures and voltages * It makes it much easier to perform computations such as: calculating an average speed by dividing the distance travelled by the difference between the finish time and start time calculating a productivity measure by dividing the number of work items processed by the length of time taken to process them calculating a payment by multiplying a duration by an hourly rate. calculating the ratio between two durations. Two of these use cases are explored in more detail at the end of the proposal. * It is well aligned with future directions being discussed by the Schema WG, without being dependent on any changes to XML Schema. The general thrust of the proposal is that in nearly all computations involving durations, it is appropriate to use the S.I. unit of time, namely the second, and to express the number of seconds as a value of type xs:double. In some calculations involving civil time using the Gregorian calendar, it is more appropriate to express a duration as a number of calendar months, or as the combination of months and seconds. All other units of time encountered in the Gregorian calendar can readily be converted into months or seconds: for example a year is 12 months, and a day is 86400 seconds. Functions are provided to convert between values of the xs:duration type defined in XML Schema and a (months, seconds) pair. All computations involving durations, other than those that also involve dates and times, rely entirely on numeric arithmetic. Similarly, computations that add durations to a date or time express the duration as a numeric quantity. Timezones are represented as a displacement from UTC measured as an integer number of seconds. The changes proposed are as follows. DATA MODEL 3.6.1: timezones are now represented as an integer number of seconds, not as an xdt:dayTimeDuration FUNCTIONS AND OPERATORS This uses the section numbering of the 3 September 2003 internal draft. 1.6 xs:dateTime, xs:date and xs:time values Change this section as follows: xs:dateTime, xs:date and xs:time values are represented in the [XQuery 1.0 and XPath 2.0 Data Model] as tuples: a normalized value with timezone Z and a timezone represented as a number of seconds difference from UTC, expressed as a value of type xs:integer. Lexical representations of xs:dateTime, xs:date and xs:time that have a timezone are converted to timezone Z as defined by [XML Schema Part 2: Datatypes] and the timezone in the lexical representation is converted to an xs:integer value (for example, the timezone +01:00 is represented as +3600). Lexical representations that do not contain a timezone are given a normalized value with the timezone Z and the timezone part of the value set to the empty sequence "()". 1.6.1 Examples * A dateTime with lexical representation 1999-05-31T05:00:00 has a value represented by the tuple (1999-05-31T05:00:00Z, ()) * A dateTime with lexical representation 1999-05-31T13:20:00-05:00 has a value represented by the tuple (1999-05-31T18:20:00Z, -18000) 9.1 Duration Date and Time Types Delete the paragraphs: "In addition, they are defined on the [9.2 Two Totally Ordered Subtypes of Duration]: * xdt:yearMonthDuration * xdt:dayTimeDuration No operators are defined on the [XML Schema Part 2: Datatypes] datatype xs:duration. Appendix C.5 Working With xs:duration Values discusses how to work with xs:duration values. " Replace them with: The elapsed time between two dates, times, or dateTimes is generally handled as a number of seconds, expressed as an xs:double. Some functions are also provided that manipulate a duration as an integer number of months. All arithmetic, comparison, and sorting of durations is achieved by expressing the duration as an xs:integer number of months plus an xs:double number of seconds, and then manipulating these values using conventional numeric arithmetic. Note: there are two reasons that xs:double has been chosen to represent the number of seconds rather than xs:decimal. Firstly, floating point arithmetic is generally preferred when performing calculations involving units of measure on a continuous scale. Secondly, by using xs:double as the expected type of a duration value, we take advantage of the XPath numeric promotion rules, which allow the value to be supplied as an xs:integer, xs:float, or xs:decimal as well as an xs:double. 9.1.1 CONFORMANCE NOTE Delete the paragraph starting "The value spaces of the two totally ordered subtypes of xs:duration..." 9.2 Two totally ordered subtypes of Duration Delete the existing section. Replace it with a new section: (9.2) Functions on xs:duration values In [XML Schema] an xs:duration is defined as a 6-tuple with numeric components representing the number of years, months, days, hours, minutes, and seconds. Although the value representing two years is distinct in the value space from the value representing 24 months, the functions defined here treat these values as being equivalent. Thus, for all computational purposes, a duration is regarded as a 2-tuple consisting of an xs:integer number of months and an xs:double number of seconds. Either both these numbers will be positive, or both will be negative, or at least one will be zero. The number of months is obtained as (12*years + months), while the number of seconds is obtained as (((((24*days + hours)*60) + minutes)*60) + seconds). 9.2.1 get-months-from-duration fn:get-months-from-duration ($arg1 as xs:duration) as xs:integer Returns the number of months represented by the years and months components of the duration, that is, the value of 12*years + months. EXAMPLE fn:get-months-from-duration(xs:duration('P1Y8M5D')) returns 20 9.2.2 get-seconds-from-duration fn:get-seconds-from-duration ($arg1 as xs:duration) as xs:double Returns the number of seconds represented by the days, hours, minutes, and seconds components of the duration, that is, the value of (((((24*days + hours)*60) + minutes)*60) + seconds). EXAMPLE fn:get-seconds-from-duration(xs:duration('P1Y8M5DT12H30M')) returns 477000e0 9.2.3 make-duration fn:make-duration($months as xs:integer, $seconds as xs:double) returns xs:duration Returns a normalized xs:duration value containing the given number of months and the given number of seconds. An xs:duration value is normalized by ensuring that the number of months is less than 12, the number of hours is less than 24, the number of minutes is less than 60, and the number of seconds is less than 60. Note: there is no similar limit on the number of days or years. It is an error [make-duration: components have different sign] if the $months argument is greater than zero and the $seconds argument is less than zero, or if the $months argument is less than zero and the $seconds argument is greater than zero. EXAMPLES fn:make-duration(18, 477000) returns the xs:duration P1Y6M5DT12H30M fn:make-duration(240, 0) returns the xs:duration P20Y fn:make-duration(0, -90.25) returns the xs:duration -PT1M30.25S 9.3 Comparison of Duration, Date, and Time Values Delete the functions: op:yearMonthDuration-equal op:yearMonthDuration-less-than op:yearMonthDuration-greater-than op:dayTimeDuration-equal op:dayTimeDuration-less-than op:dayTimeDuration-greater-than from the table; and delete the corresponding sections that define these functions (9.3.1 to 9.3.6). In the paragraph after the table, delete the sentence: "A full complement of comparison and arithmetic functions are defined on the two subtypes of duration described in 9.2 Two Totally Ordered Subtypes of Duration.". 9.4 Component Extraction Functions on Duration, Date, and Time Values Delete the functions: fn:get-years-from-yearMonthDuration fn:get-months-from-yearMonthDuration fn:get-days-from-dayTimeDuration fn:get-hours-from-dayTimeDuration fn:get-minutes-from-dayTimeDuration fn:get-seconds-from-dayTimeDuration from the table, and delete the corresponding subsections that define these functions (9.4.1 to 9.4.6). 9.5 Arithmetic Functions on xdt:yearMonthDuration and xdt:dayTimeDuration Delete the entire section (8 functions in the op: namespace) 9.6 Adjusting timezones on dateTime, date, and time values Change these functions so that the timezone is expressed as an xs:integer number of seconds, representing the displacement from UTC. In all the function signatures, the timezone value changes type from xdt:dayTimeDuration to xs:integer. In 9.6.1, 9.6.2, and 9.6.3, change the sentence "A dynamic error is raised (invalid timezone value) if $timezone is less than -PT14H00M or greater than PT14H00M." to "A dynamic error is raised (invalid timezone value) if $timezone is less than -50400 or greater than +50400 (representing a range from 14 hours before UTC to 14 hours after UTC)". In 9.6.1.1, 9.6.1.2, and 9.6.1.3, change the examples accordingly. 9.7 Adding and Subtracting Durations From dateTime, date, and time Replace the entire section, as follows: These functions support adding or subtracting a duration value to or from an xs:dateTime, an xs:date or an xs:time value. Most of these functions handle a duration as a number of seconds, expressed as a number of type xs:double. Some functions also support durations expressed as an xs:integer number of months. A subtraction operator is provided to find the difference between two values of type xs:dateTime, xs:date or xs:time. The result is the number of seconds that separate the two instants in time, expressed as a value of type xs:double. If either of the arguments to this operator is an xs:dateTime, xs:date or xs:time that does not contain an explicit timezone then, for the purpose of the operation, an implicit timezone, provided by the evaluation context, is assumed to be present as part of the value. Operators are provided to add or subtract a duration expressed in seconds to or from an xs:dateTime, an xs:date or an xs:time value. The effect of these operations is to first convert the given number of seconds into an xs:duration value, and then apply the rules in Appendix E of [XML Schema Part 2: Datatypes]. A function is also provided to find the difference in months between two values of type xs:date. Again, if either of the arguments to this operator is an xs:date that does not contain an explicit timezone then, for the purpose of the operation, an implicit timezone, provided by the evaluation context, is assumed to be present as part of the value. An interval in months between two xs:dateTime values can be found by casting the xs:dateTime values to xs:date. A function is provided to add a duration expressed as an integer number of months to a value of type xs:date. The effect of this operation is to first convert the given number of months into an xs:duration value, and then apply the rules in Appendix E of [XML Schema Part 2: Datatypes]. 9.7.1 op:subtract-dateTimes op:subtract-dateTimes($arg1 as xs:dateTime, $arg2 as xs:dateTime) as xs:double Returns an xs:double value that corresponds to the number of seconds between the normalized value of $arg1 and the normalized value of $arg2. If either argument is the empty sequence, returns the empty sequence. If the normalized value of $arg1 precedes in time the normalized value of $arg2, the returned value is a negative number. 9.7.1.1 Examples * op:subtract-dateTimes(xs:dateTime("2000-10-30T11:12:00"), xs:dateTime("1999-11-28T09:00:00")) returns 29124720e0. 9.7.2 op:subtract-dates op:subtract-dates($arg1 as xs:date, $arg2 as xs:date) as xs:double Returns the number of seconds between the normalized value of $arg1 and the normalized value of $arg2. If either argument is the empty sequence, returns the empty sequence. If the normalized value of $arg1 precedes in time the normalized value of $arg2, the returned value is a negative number. Note: the difference in days between the two dates can be obtained by dividing the result by 86400 (that is, 24*60*60). The difference is not necessarily a whole number of days, because the dates may be in different timezones. 9.7.2.1 Examples * op:subtract-dates(xs:date("2000-10-30"), xs:date("1999-11-28")) returns 29116800e0. 9.7.3 op:subtract-times op:subtract-times($arg1 as xs:time, $arg2 as xs:time) as xs:double Returns the number of seconds between the normalized value of $arg2 and the normalized value of $arg1. If either argument is the empty sequence, returns the empty sequence. The result is greater than or equal to zero and less than 86400; that is, the subtraction is performed modulo the length of a day. 9.7.3.1 Examples Assume that the evaluation context provides an implicit timezone value of -5:00. * op:subtract-times(xs:time("11:12:00Z"), xs:time("04:00:00")) returns 7920e0 (corresponding to 2 hours and 12 minutes). 9.7.4 op:add-seconds-to-dateTime op:add-seconds-to-dateTime($arg1 as xs:dateTime, $arg2 as xs:double) as xs:dateTime Returns a value of type xs:dateTime that represents the instant in time that is $arg2 seconds later than $arg1; or if $arg2 is negative, the instant that is fn:abs($arg2) seconds earlier. The timezone of the result will be the same as the timezone of $arg1, or absent if the timezone of $arg1 is absent. 9.7.4.1 Examples op:add-seconds-to-dateTime(xs:dateTime('2003-01-31T23:00:00'), 7200) returns the xs:dateTime 2003-02-01T01:00:00 9.7.5 op:add-seconds-to-date op:add-seconds-to-date($arg1 as xs:date, $arg2 as xs:double) as xs:date Returns a value of type xs:date that contains the instant in time that is $arg2 seconds later than the start of the date represented by $arg1; or if $arg2 is negative, the instant that is fn:abs($arg2) seconds earlier. The timezone of the result will be the same as the timezone of $arg1, or absent if the timezone of $arg1 is absent. 9.7.5.1 Examples op:add-seconds-to-date(xs:date('2003-01-31'), 86400) returns the xs:date 2003-02-01 op:add-seconds-to-date(xs:date('2003-01-31'), 86399) returns the xs:date 2003-01-31 9.7.6 op:add-seconds-to-time op:add-seconds-to-time($arg1 as xs:time, $arg2 as xs:double) as xs:time Returns a value of type xs:time that represents the time that is $arg2 seconds later than the time represented by $arg1; or if $arg2 is negative, the instant that is fn:abs($arg2) seconds earlier. This effectively does arithmetic with a modulus of 24 hours. The timezone of the result will be the same as the timezone of $arg1, or absent if the timezone of $arg1 is absent. 9.7.6.1 Examples op:add-seconds-to-time(xs:time('T12:00:00'), 7200) returns the xs:time T14:00:00 op:add-seconds-to-time(xs:time('T23:00:00'), 7200) returns the xs:time T01:00:00 op:add-seconds-to-time(xs:time('T01:00:00'), -7200) returns the xs:time T23:00:00 9.7.7 op:subtract-seconds-from-dateTime op:subtract-seconds-from-dateTime($arg1 as xs:dateTime, $arg2 as xs:double) as xs:dateTime Returns the same result as op:add-seconds-to-dateTime($arg1, -$arg2). 9.7.7.1 Example op:subtract-seconds-from-dateTime(xs:dateTime('2003-01-31T23:00:00'), 7200) returns the xs:dateTime 2003-01-31T21:00:00 9.7.8 op:subtract-seconds-from-date op:subtract-seconds-from-date ($arg1 as xs:date, $arg2 as xs:double) as xs:date Returns the same result as op:add-seconds-to-date($arg1, -$arg2). 9.7.8.1 Example op:subtract-seconds-from-date(xs:date('2003-02-01'), 86400) returns the xs:date 2003-01-31 9.7.9 op:subtract-seconds-from-time op:subtract-seconds-from-time($arg1 as xs:time, $arg2 as xs:double) as xs:time Returns the same result as op:add-seconds-to-time($arg1, -$arg2). 9.7.9.1 Example op:subtract-seconds-from-time(xs:time('T01:00:00'), 3600) returns the time T00:00:00 op:subtract-seconds-from-time(xs:time('T01:00:00'), 3601) returns the time T23:59:59 9.7.10 fn:add-months-to-date fn:add-months-to-date($arg1 as xs:date, $arg2 as xs:integer) as xs:date Returns a value of type xs:date representing the date that is $arg2 months later than $arg1; or if $arg2 is negative, the instant that is fn:abs($arg2) months earlier. The algorithm used by this function is to convert the number of months to a value of type xs:duration by calling the function fn:make-duration with $arg2 as the first argument and zero as the second argument, and then to apply the algorithm given in Appendix E of [XML Schema Part 2: Datatypes]. 9.7.10.1 Examples fn:add-months-to-date(xs:date('2003-10-05'), 10) returns the date 2004-08-05 fn:add-months-to-date(xs:date('2003-10-05'), -3) returns the date 2003-07-05 fn:add-months-to-date(xs:date('2003-10-31'), 4) returns the date 2004-02-29 9.7.11 fn:subtract-dates-giving-months fn:subtract-dates-giving-months($arg1 as xs:date, $arg2 as xs:date) as xs:integer If $arg2 is later in time than $arg1, the function returns the largest integer N such that fn:add-months-to-date($arg2, N) is less than or equal to $arg1. Otherwise, returns the value of - fn:subtract-dates-giving-months($arg2, $arg1). 9.7.11.1 Examples fn:subtract-dates-giving-months(xs:date('2003-10-10'), xs:date(2003-09-09')) returns 1 fn:subtract-dates-giving-months(xs:date('2003-10-10'), xs:date(2004-09-09')) returns -10 15.3 Aggregate functions Delete all mentions of duration types in the specifications of these examples, and in the examples of their use. 16.8 implicit-timezone Change the signature to return xs:integer. Indicate in the description that the timezone is returned as an integer number of seconds displacement from UTC. 17.1 Casting Table Remove the rows and columns relating to xdt:dayTimeDuration and xdt:yearMonthDuration from the table. Remove the explanation of the labels dTD and yMD. 17.4 Casting within a branch of the type hierarchy Delete " and 17.9 Casting to duration types for rules regarding casting to xdt:yearMonthDuration and xdt:dayTimeDuration." at the end of the section. 17.9 Casting to duration types Delete this section. Appendix C.5 This section is replaced with the following: This document does not define equality on xs:duration values. Nor does it define other comparison functions on such values. Users wanting to work with xs:duration values should convert the duration into a number of seconds and/or months using the functions fn:get-months-from-duration and fn:get-seconds-from-duration; the reverse conversion can be achieved using the function fn:make-duration. One way of comparing two xs:duration values for equality is to compare their months and seconds components separately and return equal if both corresponding components are equal. This could be written as follows: XSLT implementation <xsl:function name="eg:duration-equal" as="xs:boolean"> <xsl:param name="arg1" as="xs:duration"/> <xsl:param name="arg2" as="xs:duration"/> <xsl:sequence select="get-months-from-duration($arg1) = get-months-from-duration($arg2) and get-seconds-from-duration($arg1) = get-seconds-from-duration($arg2)" /> </xsl:function> XQuery implementation declare function eg:duration-equal($arg1 as xs:duration, $arg2 as xs:duration) as xs:boolean { get-months-from-duration($arg1) = get-months-from-duration($arg2) and get-seconds-from-duration($arg1) = get-seconds-from-duration($arg2) } There is no reliable way of comparing whether one xs:duration value is greater than another, because there is no definitive answer to the question whether 30 days is greater than one month. One pragmatic approach to sorting durations is to use as a sort key the expression: get-months-from-duration(.) * 365.242199 div 12.0 + (get-seconds-from-duration(.) div 86400) Here 365.242199 is the average number of days in a year, and 86400 is the number of seconds in a normal day. XPATH/XQUERY LANGUAGE BOOK Using the 22 August 2003 published Working Draft as a baseline: In 2.1.1.1 Predefined types, delete list items 3 and 4 which refer to the xdt:dayTimeDuration and xdt:yearMonthDuration data types. In 2.1.2 Dynamic context, change the implicit timezone so it is expressed as an integer number of seconds (displacement from UTC). In 3.1.1 Literals, change the example that uses an xdt:dayTimeDuration (perhaps to use an xs:duration instead). In 3.4 Arithmetic Expressions, change the date subtraction example from: * Subtraction of two date values results in a value of type xdt:dayTimeDuration: $emp/hiredate - $emp/birthdate to: * Two date values may be subtracted to return a duration in seconds. Dividing by 86400 gives the difference in days: ($emp/hiredate - $emp/birthdate) div 86400 In 3.10.4 Constructor functions, remove the references to the types xdt:dayTimeDuration and xdt:yearMonthDuration, and the example that uses these. In Appendix B.2, the operator mapping table: (a) Delete the 37 rows that have xdt:dayTimeDuration or xdt:yearMonthDuration in the Type(A) or Type(B) column. (b) Change the three rows for subtracting date-date, time-time, and dateTime-dateTime so the return type is xs:double (c) Add the following 9 rows: A + B xs:date xs:double op:add-seconds-to-date(A,B) xs:date A + B xs:time xs:double op:add-seconds-to-time(A,B) xs:time A + B xs:dateTime xs:double op:add-seconds-to-dateTime(A,B) xs:dateTime A + B xs:double xs:date op:add-seconds-to-date(B,A) xs:date A + B xs:double xs:time op:add-seconds-to-time(B,A) xs:time A + B xs:double xs:dateTime op:add-seconds-to-dateTime(A,B) xs:dateTime A - B xs:date xs:date op-subtract-dates(A,B) xs:double A - B xs:time xs:time op-subtract-times(A,B) xs:double A - B xs:dateTime xs:dateTime op-subtract-dateTimes(A,B) xs:double FORMAL SEMANTICS 3.1.1.1: remove references to xdt:dayTimeDuration and xdt:yearMonthDuration types Appendix B.2: align the operator mapping table with the revised version in the language document USE CASES Use Case 1: Timesheets Scenario: Staff complete timesheets giving for each day, the time they start and finish work. The query calculates the total hours worked and multiplies this by an hourly rate of pay to calculate the payment due. Source document: <timesheets> <employee nr="12345" rate="10.50"> <attendance date="2003-01-15" start="T09:00:00" end="T17:00:00"/> <attendance date="2003-01-16" start="T09:00:00" end="T18:00:00"/> <attendance date="2003-01-17" start="T09:00:00" end="T17:00:00"/> <attendance date="2003-01-18" start="T09:00:00" end="T16:00:00"/> <attendance date="2003-01-19" start="T09:00:00" end="T16:00:00"/> </employee> </timesheets> Solution: sum( for $a in $emp/attendance return ($a/@end - $a/@start) div 3600) * $emp/@rate Note that this works even for shifts that cross midnight. Using the functions in the published WD, the subtraction of two times yields a dayTimeDuration. It is not possible to multiply this directly by an hourly rate to obtain the amount payable; instead it is necessary to extract the individual components of the dayTimeDuration, use these to compute the number of hours worked, and then do the multiplication. Use Case 2: Average Speed Scenario: The source document contains the start times and end times for each driver in each stage of a motor rally. The query calculates the average speed. Source document <timings> <driver name="Mansell"> <stage distance="520.5" start="2002-10-15T09:02:04.1" finish="2002-10-15T17:12:16.2"/> <stage distance="430.3" start="2002-11-15T08:32:05.7" finish="2002-11-15T15:55:13.6"/> </driver> </timings> Solution: avg(for $s in driver/stage return @distance div (@finish - @start)) div 3600 Using the functions in the published WD, the subtraction of two dateTimes yields a dayTimeDuration. It is not possible to divide a distance by a dayTimeDuration to obtain a speed; instead, it is necessary to convert the duration to a number of seconds by extracting all the components to compute the travel time in seconds. These use cases demonstrate that although under this proposal the specification has vastly fewer functions to manipulate durations, practical calculations involving durations actually become significantly easier. Michael Kay Software AG
Received on Monday, 22 September 2003 06:22:25 UTC