RE: AW: Any interest in participating Canonical EXI interoperability test? canonical time representations

Hi Don and all,

In today's call we discussed the interoperability issue involving "60" as the
value of seconds in a date-time value that represents a leap second
instance, when canonicalizing a local time (with timezone) to UTC time.

Since timezone values always have a seconds part of 0 (zero), the arithmetic
never changes the seconds part. Participants in today's discussion therefore all agreed
that the best approach is not to touch the seconds part at all. This allows users to
distinguish a leap second instance from the second right after the leap second.

The seconds value "60" should only ever be used to represent a valid leap second
instance; ensuring this is the responsibility of the users who provide the original date-time values.
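
As a minimal sketch of the behavior described above (not taken from EXIficient, OpenEXI, or any other implementation; the function name and argument layout are hypothetical), the UTC arithmetic can be done on the year-to-minute fields only, with the seconds value passed through unchanged. It assumes the hour is already in the range 0-23 and that the timezone is given as a whole number of minutes.

```python
from datetime import datetime, timedelta

def to_utc_keep_seconds(year, month, day, hour, minute, second, tz_offset_minutes):
    """Normalize a local date-time to UTC without touching the seconds field.

    tz_offset_minutes is the timezone offset in minutes (e.g. +09:00 -> 540).
    second may be 60 (leap second) and is returned exactly as given.
    """
    # The offset has no seconds part, so only year..minute take part in the arithmetic.
    base = datetime(year, month, day, hour, minute)
    utc = base - timedelta(minutes=tz_offset_minutes)
    return (utc.year, utc.month, utc.day, utc.hour, utc.minute, second)

# 2017-01-01T08:59:60+09:00 is the leap second 2016-12-31T23:59:60Z
print(to_utc_keep_seconds(2017, 1, 1, 8, 59, 60, 540))
# (2016, 12, 31, 23, 59, 60)
```

Because the seconds field never enters a date-time library, the value 60 survives normalization and stays distinct from 00 of the following minute.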

I hope you can also agree to this outcome. Please let us know your opinion.

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America



-----Original Message-----
From: Don Brutzman [mailto:brutzman@nps.edu] 
Sent: Monday, December 11, 2017 8:18 AM
To: Peintner, Daniel <daniel.peintner.ext@siemens.com>; Takuki Kamiya <tkamiya@us.fujitsu.com>
Cc: 'Carine Bournez' <carine@w3.org>; public-exi@w3.org
Subject: Re: AW: Any interest in participating Canonical EXI interoperability test? canonical time representations

[resending, apologies for not following up with the group when originally sent 25 NOV 2017]

1. This is an important case that offers potential for Canonical EXI improvement.

First, Wikipedia says:
	https://en.wikipedia.org/wiki/Canonical_form
	"In computer science, data that has more than one possible representation can often be canonicalized into a completely unique representation called its canonical form."
and, from disambiguation page,
	https://en.wikipedia.org/wiki/Canonical
	"Canonical form, data that has been canonicalized into a completely unique representation, from a previous form that had more than one possible representation"

If we can agree that two representations of a given time have the same unique meaning, we should use the same form for both.

	example: (1 minute 0 seconds) == (0 minutes 60 seconds)

Numerous applications depend on unique comparable representation values for a given time, not least of which are legal documents or machine messages (such as those which might occur in IOT).  So Canonical EXI must address this ambiguity.


2. Next, XML Schema reference says

	3.2.7.2 Canonical representation
	https://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#dateTime-canonical-representation

===============================
	"Except for trailing fractional zero digits in the seconds representation, '24:00:00' time representations, and timezone (for timezoned values), the mapping from literals to values is one-to-one. Where there is more than one possible representation, the canonical representation is as follows:

	The 2-digit numeral representing the hour must not be '24';
	The fractional second string, if present, must not end in '0';
	for timezoned values, the timezone must be represented with 'Z' (All timezoned dateTime values are UTC.).
===============================
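
A small sketch of the second rule quoted above (the fractional second string must not end in '0'). The helper name is hypothetical and it works purely on the lexical seconds string, so a value such as "60" passes through untouched.

```python
def canonical_seconds(lexical):
    """Strip trailing fractional zeros from a seconds string such as '30.500'."""
    if "." not in lexical:
        return lexical
    whole, frac = lexical.split(".", 1)
    frac = frac.rstrip("0")
    # Drop the decimal point as well when no fractional digits remain.
    return whole if not frac else whole + "." + frac

print(canonical_seconds("30.500"))  # 30.5
print(canonical_seconds("30.000"))  # 30
print(canonical_seconds("60"))      # 60
```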


3. Next, EXI canonical says

	7.1.8 Date-Time
	https://www.w3.org/TR/exi-c14n/#dt-dateTime

Table 7-3. Date-Time components
===============================
MonthDay	Month * 32 + Day
	9-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) where day is a value in the range 1-31 and month is a value in the range 1-12.
Time	((Hour * 64) + Minutes) * 64 + seconds
	17-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) where Hour is a value in the range 0-24, Minutes is a value in the range 0-59 and seconds is a value in the range 0-60
===============================

So two of the five value ranges given above (Hour and seconds) allow duplicative (non-unique) entries.
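
To make the non-uniqueness concrete, here is a minimal sketch of the two formulas from Table 7-3 (function names are hypothetical; only the arithmetic comes from the table). Times that a reader might treat as equivalent produce different integers, and therefore different encoded bits.

```python
def month_day(month, day):
    # MonthDay component: 9-bit unsigned integer
    return month * 32 + day

def time_component(hour, minute, second):
    # Time component: 17-bit unsigned integer
    return ((hour * 64) + minute) * 64 + second

print(time_component(24, 0, 0))   # 98304
print(time_component(0, 0, 0))    # 0
print(time_component(0, 1, 0))    # 64
print(time_component(0, 0, 60))   # 60
print(month_day(12, 31))          # 415
```

Hour 24 versus hour 0 of the next day, and seconds 60 versus the next minute, each encode differently, so a byte-for-byte comparison of two such streams fails unless one form is chosen.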


4. Recommended changes:

	"Hour is a value in the range 0-24"
to
	"Hour is a value in the range 0-23"

and
	"seconds is a value in the range 0-60"
to
	"seconds is a value in the range 0-59"

This is slightly different than the reconciliation below.  Neither an hour value of 24 nor a second value of 60 would be allowed in a Canonical EXI document.  Such values would be handled by adjusting them to match canonical form.
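
One possible adjustment for the hour case can be sketched as follows. This is an illustration under stated assumptions, not specification text: the function name is hypothetical, it assumes 24 is only valid as 24:00:00 (as in XML Schema), and it deliberately leaves a seconds value of 60 alone, since how to adjust leap seconds is exactly what is under discussion in this thread.

```python
from datetime import date, timedelta

def normalize_hour_24(year, month, day, hour, minute, second):
    """Rewrite 24:00:00 as 00:00:00 of the following day; leave other values as is."""
    if hour == 24:
        if minute != 0 or second != 0:
            raise ValueError("hour 24 is only valid as 24:00:00")
        next_day = date(year, month, day) + timedelta(days=1)
        return (next_day.year, next_day.month, next_day.day, 0, 0, 0)
    return (year, month, day, hour, minute, second)

print(normalize_hour_24(2017, 12, 31, 24, 0, 0))
# (2018, 1, 1, 0, 0, 0)
```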

5. Leap seconds are an interesting special case.  Note that negative values are possible.  Occurrences are relatively rare, irregularly spaced, well defined, and not predictable more than six months in advance.

	https://en.wikipedia.org/wiki/Leap_second

	https://en.wikipedia.org/wiki/Leap_second#Workarounds_for_leap_second_problems

However, as shown above, relaxing the range restriction on the seconds or hours range would lead to all manner of non-unique, non-comparable representations.

Conceivably some scheme can cope with these canonically.  However, it does not appear that overloading the seconds representation with a value of 60 can be allowed without causing many more non-uniqueness problems than it solves.

The most prudent approach may be to include the following:

	WARNING
	This approach does not account for uniquely representing the specific time values corresponding to leap seconds.  Separate data representations may be necessary to account for such values.
	[reference Wikipedia page or ITU]

[Recommend pushing this email dialog to public list.  It is good to include all of the back-and-forth so that we have a mail-archive record of the examination of issues. I'm traveling so (for "timely" response!) please feel free to add/include information in this post also.]

On 11/24/2017 1:16 AM, Peintner, Daniel wrote:
> All,
> 
> Before sending my email to Rich and John I would like to confirm "our" understanding first.
> 
> Below my proposed response.
> 
> Please comment.
> 
> Thanks,
> 
> -- Daniel
> 
> ___________
> 
> All,
> 
>> Regarding issue #1, section 4.5.5 of the EXI Canonical specification states
>> that "The Hour value used to compute the Time component MUST NOT be 24".
>> Based on that the test case is accurate. We also believe the specification to be
>> accurate as the signature should not fail if a processor changes 24 to 0 since
>> these are equivalent values.
> 
> I also believe that the test-case is fine.
> 
> Moreover, I do think that the signature MUST fail if the values differ (24 vs. 0). The general idea behind Canonical EXI is that the resulting octet stream is byte-for-byte the same, which would not be the case if we allow both 24 and 0.
> 
> 
>  > Thanks for the explanation on issue #2. That gives us an understanding of the
>  > behavior of EXIficient and OpenEXI. However, the single line in the specification
>  > "If the Canonical EXI Option utcTime is equal to true, Date-Time values must be
>  > represented using Coordinated Universal Time (UTC, sometimes called "Greenwich
>  > Mean Time")." would not lead us to understand that the time zone offset should be
>  > added as a duration to the time to compute UTC. In addition, since time zones offsets
>  > are only expressed in hours and minutes, it follows that seconds never come into play
>  > when normalizing a date/time to UTC. In fact, normalizing the seconds actually results
>  > in a loss of information in that 60 in the seconds position is meant to represent a leap
>  > second.
> 
> In general we expect dateTime canonicalization based on https://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#dateTime-canonical-representation.
> 
> The link Taki provided is one consequence of it.
> 
> 

all the best, Don
-- 
Don Brutzman  Naval Postgraduate School, Code USW/Br       brutzman@nps.edu
Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA   +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman

Received on Monday, 18 December 2017 19:23:06 UTC