SAG-FO-02 follow-up: Durations

In comment SAG-FO-02 [1] Software AG proposed a radical simplification of
the operators and functions provided for handling durations, dates, and
times.

[1] http://lists.w3.org/Archives/Public/public-qt-comments/2003Jun/0087.html


The ideas behind the "durations" part of this proposal were discussed at the
joint WG meeting on 15 September 2003. Although there were some reservations
expressed both on the technical content and on the feasibility of making
such a substantial change at this stage in the process, there was sufficient
enthusiasm to ask for a detailed change proposal to be prepared for
consideration. It was suggested that although the WG might not be able to
make a decision on this change proposal in time for the next draft, it might
solicit public feedback on the proposal in its own right. 

The proposal follows. I have used section numbers from an internal editor's
draft dated 3 Sep 2003, but I believe there are no material differences from
the last published WD in the affected areas.


This note is a detailed proposal for implementing the replacement of
functions and operators on the two data types xdt:dayTimeDuration and
xdt:yearMonthDuration by functions and operators that represent these
quantities by numbers.


The rationale for this proposal is as follows:

* It significantly reduces the number of functions, and the number of
special cases that need to be handled by polymorphic functions such as sum()
and avg()

* It avoids the need to introduce new data types beyond the types already
defined in XML Schema

* It enables durations to be manipulated using the full power of all the
numeric operators and functions

* It thus brings durations into line with other units of measure such as
distances, weights, temperatures and voltages

* It makes it much easier to perform computations such as:

   calculating an average speed by dividing the distance travelled by the
difference between the finish time and start time

   calculating a productivity measure by dividing the number of work items
processed by the length of time taken to process them

   calculating a payment by multiplying a duration by an hourly rate.

   calculating the ratio between two durations.

  Two of these use cases are explored in more detail at the end of the
proposal.

* It is well aligned with future directions being discussed by the Schema
WG, without being dependent on any changes to XML Schema.

The general thrust of the proposal is that in nearly all computations
involving durations, it is appropriate to use the S.I. unit of time, namely
the second, and to express the number of seconds as a value of type
xs:double. In some calculations involving civil time using the Gregorian
calendar, it is more appropriate to express a duration as a number of
calendar months, or as the combination of months and seconds. All other
units of time encountered in the Gregorian calendar can readily be converted
into months or seconds: for example a year is 12 months, and a day is 86400
seconds.

Functions are provided to convert between values of the xs:duration type
defined in XML Schema and a (months, seconds) pair. All computations
involving durations, other than those that also involve dates and times,
rely entirely on numeric arithmetic. Similarly, computations that add
durations to a date or time express the duration as a numeric quantity.
Timezones are represented as a displacement from UTC measured as an integer
number of seconds.


The changes proposed are as follows.

DATA MODEL

3.6.1: timezones are now represented as an integer number of seconds, not as
an xdt:dayTimeDuration


FUNCTIONS AND OPERATORS

This uses the section numbering of the 3 September 2003 internal draft.

1.6 xs:dateTime, xs:date and xs:time values

Change this section as follows:

xs:dateTime, xs:date and xs:time values are represented in the [XQuery 1.0
and XPath 2.0 Data Model] as tuples: a normalized value with timezone Z and
a timezone represented as a number of seconds difference from UTC, expressed
as a value of type xs:integer. Lexical representations of xs:dateTime,
xs:date and xs:time that have a timezone are converted to timezone Z as
defined by [XML Schema Part 2: Datatypes] and the timezone in the lexical
representation is converted to an xs:integer value (for example, the
timezone +01:00 is represented as +3600). Lexical representations that do
not contain a timezone are given a normalized value with the timezone Z and
the timezone part of the value set to the empty sequence "()".

1.6.1 Examples

*	A dateTime with lexical representation 1999-05-31T05:00:00 has a
value represented by the tuple (1999-05-31T05:00:00Z, ())

*	A dateTime with lexical representation 1999-05-31T13:20:00-05:00 has
a value represented by the tuple (1999-05-31T18:20:00Z, -18000)


9.1 Duration Date and Time Types

Delete the paragraphs:

"In addition, they are defined on the [9.2 Two Totally Ordered Subtypes of
Duration]:

    * xdt:yearMonthDuration
    * xdt:dayTimeDuration

No operators are defined on the [XML Schema Part 2: Datatypes] datatype
xs:duration. Appendix C.5 Working With xs:duration Values discusses how to
work with xs:duration values. "

Replace them with:

The elapsed time between two dates, times, or dateTimes is generally handled
as a number of seconds, expressed as an xs:double. Some functions are also
provided that manipulate a duration as an integer number of months. All
arithmetic, comparison, and sorting of durations is achieved by expressing
the duration as an xs:integer number of months plus an xs:double number of
seconds, and then manipulating these values using conventional numeric
arithmetic.

Note: there are two reasons that xs:double has been chosen to represent the
number of seconds rather than xs:decimal. Firstly, floating point arithmetic
is generally preferred when performing calculations involving units of
measure on a continuous scale. Secondly, by using xs:double as the expected
type of a duration value, we take advantage of the XPath numeric promotion
rules, which allow the value to be supplied as an xs:integer, xs:float, or
xs:decimal as well as an xs:double. 

9.1.1 CONFORMANCE NOTE

Delete the paragraph starting "The value spaces of the two totally ordered
subtypes of xs:duration..."

9.2 Two totally ordered subtypes of Duration

Delete the existing section. Replace it with a new section:

(9.2) Functions on xs:duration values

In [XML Schema] an xs:duration is defined as a 6-tuple with numeric
components representing the number of years, months, days, hours, minutes,
and seconds. Although the value representing two years is distinct in the
value space from the value representing 24 months, the functions defined
here treat these values as being equivalent. Thus, for all computational
purposes, a duration is regarded as a 2-tuple consisting of an xs:integer
number of months and an xs:double number of seconds. Either both these
numbers will be positive, or both will be negative, or at least one will be
zero. The number of months is obtained as (12*years + months), while the
number of seconds is obtained as (((((24*days + hours)*60) + minutes)*60) +
seconds).

9.2.1 get-months-from-duration

fn:get-months-from-duration ($arg1 as xs:duration) as xs:integer

Returns the number of months represented by the years and months components
of the duration, that is, the value of 12*years + months.

EXAMPLE

fn:get-months-from-duration(xs:duration('P1Y8M5D')) returns 20

9.2.2 get-seconds-from-duration

fn:get-seconds-from-duration ($arg1 as xs:duration) as xs:double

Returns the number of seconds represented by the days, hours, minutes, and
seconds components of the duration, that is, the value of (((((24*days +
hours)*60) + minutes)*60) + seconds).

EXAMPLE

fn:get-seconds-from-duration(xs:duration('P1Y8M5DT12H30M')) returns 477000e0

9.2.3 make-duration

fn:make-duration($months as xs:integer, $seconds as xs:double) returns
xs:duration

Returns a normalized xs:duration value containing the given number of months
and the given number of seconds. An xs:duration value is normalized by
ensuring that the number of months is less than 12, the number of hours is
less than 24, the number of minutes is less than 60, and the number of
seconds is less than 60.

Note: there is no similar limit on the number of days or years.

It is an error [make-duration: components have different sign] if the
$months argument is greater than zero and the $seconds argument is less than
zero, or if the $months argument is less than zero and the $seconds argument
is greater than zero.

EXAMPLES

fn:make-duration(18, 477000) returns the xs:duration P1Y6M5DT12H30M

fn:make-duration(240, 0) returns the xs:duration P20Y

fn:make-duration(0, -90.25) returns the xs:duration -PT1M30.25S


9.3 Comparison of Duration, Date, and Time Values

Delete the functions:

op:yearMonthDuration-equal
op:yearMonthDuration-less-than
op:yearMonthDuration-greater-than
op:dayTimeDuration-equal
op:dayTimeDuration-less-than
op:dayTimeDuration-greater-than

from the table; and delete the corresponding sections that define these
functions (9.3.1 to 9.3.6).

In the paragraph after the table, delete the sentence: "A full complement of
comparison and arithmetic functions are defined on the two subtypes of
duration described in 9.2 Two Totally Ordered Subtypes of Duration.".

9.4 Component Extraction Functions on Duration, Date, and Time Values

Delete the functions:

fn:get-years-from-yearMonthDuration
fn:get-months-from-yearMonthDuration
fn:get-days-from-dayTimeDuration
fn:get-hours-from-dayTimeDuration
fn:get-minutes-from-dayTimeDuration
fn:get-seconds-from-dayTimeDuration

from the table, and delete the corresponding subsections that define these
functions (9.4.1 to 9.4.6).

9.5 Arithmetic Functions on xdt:yearMonthDuration and xdt:dayTimeDuration

Delete the entire section (8 functions in the op: namespace)

9.6 Adjusting timezones on dateTime, date, and time values

Change these functions so that the timezone is expressed as an xs:integer
number of seconds, representing the displacement from UTC.

In all the function signatures, the timezone value changes type from
xdt:dayTimeDuration to xs:integer.

In 9.6.1, 9.6.2, and 9.6.3, change the sentence "A dynamic error is raised
(invalid timezone value) if $timezone is less than -PT14H00M or greater than
PT14H00M." to "A dynamic error is raised (invalid timezone value) if
$timezone is less than -50400 or greater than +50400 (representing a range
from 14 hours before UTC to 14 hours after UTC)".

In 9.6.1.1, 9.6.1.2, and 9.6.1.3, change the examples accordingly.

9.7 Adding and Subtracting Durations From dateTime, date, and time

Replace the entire section, as follows:

These functions support adding or subtracting a duration value to or from an
xs:dateTime, an xs:date or an xs:time value. Most of these functions handle
a duration as a number of seconds, expressed as a number of type xs:double.
Some functions also support durations expressed as an xs:integer number of
months.

A subtraction operator is provided to find the difference between two values
of type xs:dateTime, xs:date or xs:time. The result is the number of seconds
that separate the two instants in time, expressed as a value of type
xs:double. If either of the arguments to this operator is an xs:dateTime,
xs:date or xs:time that does not contain an explicit timezone then, for the
purpose of the operation, an implicit timezone, provided by the evaluation
context, is assumed to be present as part of the value.

Operators are provided to add or subtract a duration expressed in seconds to
or from an xs:dateTime, an xs:date or an xs:time value. The effect of these
operations is to first convert the given number of seconds into an
xs:duration value, and then apply the rules in Appendix E of [XML Schema
Part 2: Datatypes].

A function is also provided to find the difference in months between two
values of type xs:date. Again, if either of the arguments to this operator
is an xs:date that does not contain an explicit timezone then, for the
purpose of the operation, an implicit timezone, provided by the evaluation
context, is assumed to be present as part of the value. An interval in
months between two xs:dateTime values can be found by casting the
xs:dateTime values to xs:date.

A function is provided to add a duration expressed as an integer number of
months to a value of type xs:date. The effect of this operation is to first
convert the given number of months into an xs:duration value, and then apply
the rules in Appendix E of [XML Schema Part 2: Datatypes].


9.7.1 op:subtract-dateTimes

op:subtract-dateTimes($arg1 as xs:dateTime, $arg2 as xs:dateTime) as
xs:double

Returns an xs:double value that corresponds to the number of seconds between
the normalized value of $arg1 and the normalized value of $arg2. If either
argument is the empty sequence, returns the empty sequence. If the
normalized value of $arg1 precedes in time the normalized value of $arg2,
the returned value is a negative number.


9.7.1.1 Examples

    * op:subtract-dateTimes(xs:dateTime("2000-10-30T11:12:00"),
xs:dateTime("1999-11-28T09:00:00")) returns 29124720e0. 


9.7.2 op:subtract-dates

op:subtract-dates($arg1 as xs:date, $arg2 as xs:date) as xs:double

Returns the number of seconds between the normalized value of $arg1 and the
normalized value of $arg2. If either argument is the empty sequence, returns
the empty sequence. If the normalized value of $arg1 precedes in time the
normalized value of $arg2, the returned value is a negative number.

Note: the difference in days between the two dates can be obtained by
dividing the result by 86400 (that is, 24*60*60). The difference is not
necessarily a whole number of days, because the dates may be in different
timezones.


9.7.2.1 Examples

  * op:subtract-dates(xs:date("2000-10-30"), xs:date("1999-11-28")) returns
29116800e0. 


9.7.3 op:subtract-times

op:subtract-times($arg1 as xs:time, $arg2 as xs:time) as xs:double

Returns the number of seconds between the normalized value of $arg2 and the
normalized value of $arg1. If either argument is the empty sequence, returns
the empty sequence. The result is greater than or equal to zero and less
than 86400; that is, the subtraction is performed modulo the length of a
day.


9.7.3.1 Examples

 Assume that the evaluation context provides an implicit timezone value of
-5:00.

    * op:subtract-times(xs:time("11:12:00Z"), xs:time("04:00:00")) returns
7920e0 (corresponding to 2 hours and 12 minutes). 

9.7.4 op:add-seconds-to-dateTime

op:add-seconds-to-dateTime($arg1 as xs:dateTime, $arg2 as xs:double) as
xs:dateTime

Returns a value of type xs:dateTime that represents the instant in time that
is $arg2 seconds later than $arg1; or if $arg2 is negative, the instant that
is fn:abs($arg2) seconds earlier. The timezone of the result will be the
same as the timezone of $arg1, or absent if the timezone of $arg1 is absent.

9.7.4.1 Examples

op:add-seconds-to-dateTime(xs:dateTime('2003-01-31T23:00:00'), 7200) returns
the xs:dateTime 2003-02-01T01:00:00

9.7.5 op:add-seconds-to-date

op:add-seconds-to-date($arg1 as xs:date, $arg2 as xs:double) as xs:date

Returns a value of type xs:date that contains the instant in time that is
$arg2 seconds later than the start of the date represented by $arg1; or if
$arg2 is negative, the instant that is fn:abs($arg2) seconds earlier. The
timezone of the result will be the same as the timezone of $arg1, or absent
if the timezone of $arg1 is absent.

9.7.5.1 Examples

op:add-seconds-to-date(xs:date('2003-01-31'), 86400) returns the xs:date
2003-02-01

op:add-seconds-to-date(xs:date('2003-01-31'), 86399) returns the xs:date
2003-01-31


9.7.6 op:add-seconds-to-time

op:add-seconds-to-time($arg1 as xs:time, $arg2 as xs:double) as xs:time

Returns a value of type xs:time that represents the time that is $arg2
seconds later than the time represented by $arg1; or if $arg2 is negative,
the instant that is fn:abs($arg2) seconds earlier. This effectively does
arithmetic with a modulus of 24 hours. The timezone of the result will be
the same as the timezone of $arg1, or absent if the timezone of $arg1 is
absent.

9.7.6.1 Examples

op:add-seconds-to-time(xs:time('T12:00:00'), 7200) returns the xs:time
T14:00:00

op:add-seconds-to-time(xs:time('T23:00:00'), 7200) returns the xs:time
T01:00:00

op:add-seconds-to-time(xs:time('T01:00:00'), -7200) returns the xs:time
T23:00:00


9.7.7 op:subtract-seconds-from-dateTime

op:subtract-seconds-from-dateTime($arg1 as xs:dateTime, $arg2 as xs:double)
as xs:dateTime

Returns the same result as op:add-seconds-to-dateTime($arg1, -$arg2).

9.7.7.1 Example

op:subtract-seconds-from-dateTime(xs:dateTime('2003-01-31T23:00:00'), 7200)
returns the xs:dateTime 2003-01-31T21:00:00


9.7.8 op:subtract-seconds-from-date

op:subtract-seconds-from-date ($arg1 as xs:date, $arg2 as xs:double) as
xs:date

Returns the same result as op:add-seconds-to-date($arg1, -$arg2).

9.7.8.1 Example

op:subtract-seconds-from-date(xs:date('2003-02-01'), 86400) returns the
xs:date 2003-01-31


9.7.9 op:subtract-seconds-from-time

op:subtract-seconds-from-time($arg1 as xs:time, $arg2 as xs:double) as
xs:time

Returns the same result as op:add-seconds-to-time($arg1, -$arg2).

9.7.9.1 Example

op:subtract-seconds-from-time(xs:time('T01:00:00'), 3600) returns the time
T00:00:00

op:subtract-seconds-from-time(xs:time('T01:00:00'), 3601) returns the time
T23:59:59
 


9.7.10 fn:add-months-to-date

fn:add-months-to-date($arg1 as xs:date, $arg2 as xs:integer) as xs:date

Returns a value of type xs:date representing the date that is $arg2 months
later than $arg1; or if $arg2 is negative, the instant that is fn:abs($arg2)
months earlier. 

The algorithm used by this function is to convert the number of months to a
value of type xs:duration by calling the function fn:make-duration with
$arg2 as the first argument and zero as the second argument, and then to
apply the algorithm given in Appendix E of [XML Schema Part 2: Datatypes].

9.7.10.1 Examples

fn:add-months-to-date(xs:date('2003-10-05'), 10) returns the date 2004-08-05

fn:add-months-to-date(xs:date('2003-10-05'), -3) returns the date 2003-07-05

fn:add-months-to-date(xs:date('2003-10-31'), 4) returns the date 2004-02-29


9.7.11 fn:subtract-dates-giving-months

fn:subtract-dates-giving-months($arg1 as xs:date, $arg2 as xs:date) as
xs:integer

If $arg2 is later in time than $arg1, the function returns the largest
integer N such that fn:add-months-to-date($arg2, N) is less than or equal to
$arg1.

Otherwise, returns the value of  - fn:subtract-dates-giving-months($arg2,
$arg1).

9.7.11.1 Examples

fn:subtract-dates-giving-months(xs:date('2003-10-10'), xs:date(2003-09-09'))
returns 1

fn:subtract-dates-giving-months(xs:date('2003-10-10'), xs:date(2004-09-09'))
returns -10


15.3 Aggregate functions

Delete all mentions of duration types in the specifications of these
examples, and in the examples of their use.


16.8 implicit-timezone

Change the signature to return xs:integer. Indicate in the description that
the timezone is returned as an integer number of seconds displacement from
UTC.

17.1 Casting Table

Remove the rows and columns relating to xdt:dayTimeDuration and
xdt:yearMonthDuration from the table. Remove the explanation of the labels
dTD and yMD.

17.4 Casting within a branch of the type hierarchy

Delete " and 17.9 Casting to duration types for rules regarding casting to
xdt:yearMonthDuration and xdt:dayTimeDuration." at the end of the section.

17.9 Casting to duration types

Delete this section.

Appendix C.5

This section is replaced with the following:

This document does not define equality on xs:duration values. Nor does it
define other comparison functions on such values. Users wanting to work with
xs:duration values should convert the duration into a number of seconds
and/or months using the functions fn:get-months-from-duration and
fn:get-seconds-from-duration; the reverse conversion can be achieved using
the function fn:make-duration.

One way of comparing two xs:duration values for equality is to compare their
months and seconds components separately and return equal if both
corresponding components are equal. This could be written as follows:

XSLT implementation

<xsl:function name="eg:duration-equal" as="xs:boolean">
  <xsl:param name="arg1" as="xs:duration"/>
  <xsl:param name="arg2" as="xs:duration"/>
  <xsl:sequence 
    select="get-months-from-duration($arg1) =
               get-months-from-duration($arg2) and
            get-seconds-from-duration($arg1) =
               get-seconds-from-duration($arg2)" />
</xsl:function>

XQuery implementation

declare function eg:duration-equal($arg1 as xs:duration, $arg2 as
xs:duration)
    as xs:boolean 
{
  get-months-from-duration($arg1) =
      get-months-from-duration($arg2) and
  get-seconds-from-duration($arg1) =
      get-seconds-from-duration($arg2)
}

There is no reliable way of comparing whether one xs:duration value is
greater than another, because there is no definitive answer to the question
whether 30 days is greater than one month. One pragmatic approach to sorting
durations is to use as a sort key the expression:

get-months-from-duration(.) * 365.242199 div 12.0
   + (get-seconds-from-duration(.) div 86400)

Here 365.242199 is the average number of days in a year, and 86400 is the
number of seconds in a normal day. 


XPATH/XQUERY LANGUAGE BOOK

Using the 22 August 2003 published Working Draft as a baseline:

In 2.1.1.1 Predefined types, delete list items 3 and 4 which refer to the
xdt:dayTimeDuration and xdt:yearMonthDuration data types.

In 2.1.2 Dynamic context, change the implicit timezone so it is expressed as
an integer number of seconds (displacement from UTC).

In 3.1.1 Literals, change the example that uses an xdt:dayTimeDuration
(perhaps to use an xs:duration instead).

In 3.4 Arithmetic Expressions, change the date subtraction example from:

*  Subtraction of two date values results in a value of type
xdt:dayTimeDuration:
     $emp/hiredate - $emp/birthdate

to:

* Two date values may be subtracted to return a duration in seconds.
Dividing by 86400 gives the difference in days:

     ($emp/hiredate - $emp/birthdate) div 86400

In 3.10.4 Constructor functions, remove the references to the types
xdt:dayTimeDuration and xdt:yearMonthDuration, and the example that uses
these.

In Appendix B.2, the operator mapping table:

(a) Delete the 37 rows that have xdt:dayTimeDuration or
xdt:yearMonthDuration in the Type(A) or Type(B) column.

(b) Change the three rows for subtracting date-date, time-time, and
dateTime-dateTime so the return type is xs:double

(c) Add the following 9 rows:

A + B  xs:date      xs:double  op:add-seconds-to-date(A,B)  xs:date
A + B  xs:time      xs:double  op:add-seconds-to-time(A,B)  xs:time
A + B  xs:dateTime  xs:double  op:add-seconds-to-dateTime(A,B)  xs:dateTime

A + B  xs:double  xs:date      op:add-seconds-to-date(B,A)  xs:date
A + B  xs:double  xs:time      op:add-seconds-to-time(B,A)  xs:time
A + B  xs:double  xs:dateTime  op:add-seconds-to-dateTime(A,B)  xs:dateTime

A - B  xs:date  xs:date  op-subtract-dates(A,B)  xs:double
A - B  xs:time  xs:time  op-subtract-times(A,B)  xs:double
A - B  xs:dateTime  xs:dateTime  op-subtract-dateTimes(A,B)  xs:double


FORMAL SEMANTICS

3.1.1.1: remove references to xdt:dayTimeDuration and xdt:yearMonthDuration
types

Appendix B.2: align the operator mapping table with the revised version in
the language document


USE CASES

Use Case 1: Timesheets

Scenario: Staff complete timesheets giving for each day, the time they start
and finish work. The query calculates the total hours worked and multiplies
this by an hourly rate of pay to calculate the payment due.

Source document:

<timesheets>
  <employee nr="12345" rate="10.50">
    <attendance date="2003-01-15" start="T09:00:00" end="T17:00:00"/>
    <attendance date="2003-01-16" start="T09:00:00" end="T18:00:00"/>
    <attendance date="2003-01-17" start="T09:00:00" end="T17:00:00"/>
    <attendance date="2003-01-18" start="T09:00:00" end="T16:00:00"/>
    <attendance date="2003-01-19" start="T09:00:00" end="T16:00:00"/>
  </employee>
</timesheets>

Solution:

sum( for $a in $emp/attendance
     return ($a/@end - $a/@start) div 3600) * $emp/@rate

Note that this works even for shifts that cross midnight.

Using the functions in the published WD, the subtraction of two times yields
a dayTimeDuration. It is not possible to multiply this directly by an hourly
rate to obtain the amount payable; instead it is necessary to extract the
individual components of the dayTimeDuration, use these to compute the
number of hours worked, and then do the multiplication.

Use Case 2: Average Speed

Scenario: The source document contains the start times and end times for
each driver in each stage of a motor rally. The query calculates the average
speed.

Source document

<timings>
  <driver name="Mansell">
    <stage distance="520.5" start="2002-10-15T09:02:04.1" 
         finish="2002-10-15T17:12:16.2"/>
    <stage distance="430.3" start="2002-11-15T08:32:05.7" 
         finish="2002-11-15T15:55:13.6"/>
  </driver>
</timings>

Solution:

avg(for $s in driver/stage return
      @distance div (@finish - @start)) div 3600

Using the functions in the published WD, the subtraction of two dateTimes
yields a dayTimeDuration. It is not possible to divide a distance by a
dayTimeDuration to obtain a speed; instead, it is necessary to convert the
duration to a number of seconds by extracting all the components to compute
the travel time in seconds.

These use cases demonstrate that although under this proposal the
specification has vastly fewer functions to manipulate durations, practical
calculations involving durations actually become significantly easier.
 


Michael Kay
Software AG

Received on Monday, 22 September 2003 06:22:25 UTC