RE: AW: Any interest in participating Canonical EXI interoperability test? canonical time representations

Hi Rich, Daniel, Don and all,

I modified OpenEXI according to the clarification, and checked in the library jars and the latest
encoding results into GitHub [1].

[1] https://github.com/w3c/exi-testsuite

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America


-----Original Message-----
From: Rich Rollman [mailto:richroll@agiledelta.com] 
Sent: Monday, December 18, 2017 12:07 PM
To: Takuki Kamiya <tkamiya@us.fujitsu.com>; 'Don Brutzman' <brutzman@nps.edu>
Cc: 'Peintner, Daniel' <daniel.peintner.ext@siemens.com>; 'Carine Bournez' <carine@w3.org>; public-exi@w3.org
Subject: RE: AW: Any interest in participating Canonical EXI interoperability test? canonical time representations

Hi Taki,

AgileDelta agrees with this approach. It preserves the information in the
original message.

Best regards,

-Rich

-----Original Message-----
From: Takuki Kamiya [mailto:tkamiya@us.fujitsu.com] 
Sent: Monday, December 18, 2017 11:22 AM
To: Don Brutzman
Cc: Peintner, Daniel; 'Carine Bournez'; public-exi@w3.org
Subject: RE: AW: Any interest in participating Canonical EXI
interoperability test? canonical time representations

Hi Don and all,

In today's call we discussed the interoperability issue involving "60" as
the seconds value of a date-time that represents a leap second, when
canonicalizing a local time (with timezone) to UTC.

Since timezone offsets always have a seconds part of 0 (zero), the arithmetic
never changes the seconds part. Therefore, the participants in today's
discussion all agreed that the best way is not to touch the seconds part at
all. This allows users to distinguish a leap second from the second right
after it.

A seconds value of "60" should always represent a valid leap second, and
ensuring that is the responsibility of the users who provide the original
date-time values.
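
As a rough illustration of the rule above (a minimal sketch, not taken from
OpenEXI or any other implementation; the function name and the minutes-based
offset parameter are hypothetical), the UTC shift can be done on the date,
hour and minute fields only, with the seconds value passed through as-is:

```python
from datetime import datetime, timedelta


def to_utc_keep_seconds(year, month, day, hour, minute, second, tz_offset_minutes):
    """Shift a local date-time to UTC without touching the seconds field.

    tz_offset_minutes is the timezone offset from UTC in minutes
    (hypothetical parameter, e.g. -300 for -05:00). Assumes hour is 0-23.
    """
    # Do the arithmetic at minute resolution only; the offset has no
    # seconds part, so the seconds value (including 60) is never changed.
    local = datetime(year, month, day, hour, minute)
    utc = local - timedelta(minutes=tz_offset_minutes)
    return (utc.year, utc.month, utc.day, utc.hour, utc.minute, second)


# 2016-12-31T18:59:60-05:00 -> 2016-12-31T23:59:60Z
print(to_utc_keep_seconds(2016, 12, 31, 18, 59, 60, -300))
```

With this approach 2016-12-31T18:59:60-05:00 becomes 2016-12-31T23:59:60Z, so
the leap second stays distinguishable from 2017-01-01T00:00:00Z.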

I hope you can also agree to this outcome. Please let us know your
opinion.

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America



-----Original Message-----
From: Don Brutzman [mailto:brutzman@nps.edu]
Sent: Monday, December 11, 2017 8:18 AM
To: Peintner, Daniel <daniel.peintner.ext@siemens.com>; Takuki Kamiya
<tkamiya@us.fujitsu.com>
Cc: 'Carine Bournez' <carine@w3.org>; public-exi@w3.org
Subject: Re: AW: Any interest in participating Canonical EXI
interoperability test? canonical time representations

[resending, apologies for not following with the group when originally sent
25 NOV 2017]

1. This is an important case that offers potential for Canonical EXI
improvement.

First, Wikipedia says:
	https://en.wikipedia.org/wiki/Canonical_form
	"In computer science, data that has more than one possible
representation can often be canonicalized into a completely unique
representation called its canonical form."
and, from disambiguation page,
	https://en.wikipedia.org/wiki/Canonical
	"Canonical form, data that has been canonicalized into a completely
unique representation, from a previous form that had more than one possible
representation"

If we can agree that there is a unique meaning to two representations of a
given time, we should use the same form.

	example: (1 minute 0 seconds) == (0 minutes 60 seconds)

Numerous applications depend on unique comparable representation values for
a given time, not least of which are legal documents or machine messages
(such as those which might occur in IOT).  So Canonical EXI must address
this ambiguity.


2. Next, XML Schema reference says

	3.2.7.2 Canonical representation
	
	https://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#dateTime-canonical-representation

===============================
	"Except for trailing fractional zero digits in the seconds
representation, '24:00:00' time representations, and timezone (for timezoned
values), the mapping from literals to values is one-to-one. Where there is
more than one possible representation, the canonical representation is as
follows:

	The 2-digit numeral representing the hour must not be '24';
	The fractional second string, if present, must not end in '0';
	for timezoned values, the timezone must be represented with 'Z' (All
timezoned dateTime values are UTC.).
===============================


3. Next, EXI canonical says

	7.1.8 Date-Time
	https://www.w3.org/TR/exi-c14n/#dt-dateTime

Table 7-3. Date-Time components
===============================
MonthDay	Month * 32 + Day
	9-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) where day is a
	value in the range 1-31 and month is a value in the range 1-12.
Time	((Hour * 64) + Minutes) * 64 + seconds
	17-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) where Hour is
	a value in the range 0-24, Minutes is a value in the range 0-59 and
	seconds is a value in the range 0-60
===============================
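
To make the formulas in Table 7-3 concrete, here is a small sketch (the
function names are illustrative only, not from any EXI library) that computes
the two components and shows that different lexical forms map to different
integers:

```python
def month_day(month, day):
    """MonthDay component: Month * 32 + Day (fits in 9 bits)."""
    return month * 32 + day


def time_component(hour, minute, second):
    """Time component: ((Hour * 64) + Minutes) * 64 + seconds (fits in 17 bits)."""
    return ((hour * 64) + minute) * 64 + second


# "1 minute 0 seconds" and "0 minutes 60 seconds" encode differently.
print(time_component(0, 1, 0))    # 64
print(time_component(0, 0, 60))   # 60

# So do hour 24 and hour 0.
print(time_component(24, 0, 0))   # 98304
print(time_component(0, 0, 0))    # 0

print(month_day(12, 31))          # 415
```

Because the encoded integers differ, two forms that are treated as the same
instant still produce different octet streams unless one form is chosen as
canonical.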

So there are two duplicative (non-unique) entries in the five given above.


4. Recommended changes:

	"Hour is a value in the range 0-24"
to
	"Hour is a value in the range 0-23"

and
	"seconds is a value in the range 0-60"
to
	"seconds is a value in the range 0-59"

This is slightly different than the reconciliation below.  Neither an hour
value of 24 nor a seconds value of 60 would be allowed in a Canonical EXI
document.  Such values would be handled by adjusting them to match
canonical form.
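
A minimal sketch of what such an adjustment could look like, assuming integer
seconds and no timezone (the function name is hypothetical, and this only
illustrates the proposal in this message, not any implementation):

```python
from datetime import datetime, timedelta


def adjust_to_restricted_ranges(year, month, day, hour, minute, second):
    """Roll hour 24 and seconds 60 over so that hour is 0-23 and seconds 0-59."""
    # Start from midnight of the given date and add the time of day as a
    # duration; timedelta carries any overflow into the next minute/day.
    midnight = datetime(year, month, day)
    adjusted = midnight + timedelta(hours=hour, minutes=minute, seconds=second)
    return (adjusted.year, adjusted.month, adjusted.day,
            adjusted.hour, adjusted.minute, adjusted.second)


# Hour 24 rolls over to 00:00:00 of the next day.
print(adjust_to_restricted_ranges(2017, 12, 31, 24, 0, 0))   # (2018, 1, 1, 0, 0, 0)
# Seconds 60 rolls over to the next minute.
print(adjust_to_restricted_ranges(2016, 12, 31, 23, 59, 60))  # (2017, 1, 1, 0, 0, 0)
```

Note that with this adjustment 2016-12-31T23:59:60 and 2017-01-01T00:00:00
produce the same result.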

5. Leap seconds are an interesting special case.  Note that negative leap
seconds are possible.  Occurrences are relatively rare, irregularly spaced,
well-defined, and not predictable more than six months in advance.

	https://en.wikipedia.org/wiki/Leap_second

	
	https://en.wikipedia.org/wiki/Leap_second#Workarounds_for_leap_second_problems

However, as shown above, relaxing the range restriction on seconds or hours
would lead to all manner of non-unique, noncomparable representations.

Conceivably some scheme can cope with these canonically.  However, it does
not appear that overloading the seconds representation with a value of 60
can be allowed without causing many more non-uniqueness problems than it
solves.

The most prudent approach may be to include the following:

	WARNING
	This approach does not account for uniquely representing the
specific time values corresponding to leap seconds.  Separate data
representations may be necessary to account for such values.
	[reference Wikipedia page or ITU]

[Recommend pushing this email dialog to public list.  It is good to include
all of the back-and-forth so that we have a mail-archive record of the
examination of issues. I'm traveling so (for "timely" response!) please feel
free to add/include information in this post also.]

On 11/24/2017 1:16 AM, Peintner, Daniel wrote:
> All,
> 
> Before sending my email to Rich and John I would like to confirm "our"
> understanding first.
> 
> Below my proposed response.
> 
> Please comment.
> 
> Thanks,
> 
> -- Daniel
> 
> ___________
> 
> All,
> 
>> Regarding issue #1, section 4.5.5 of the EXI Canonical specification states
>> that "The Hour value used to compute the Time component MUST NOT be 24".
>> Based on that the test case is accurate. We also believe the specification to be
>> accurate as the signature should not fail if a processor changes 24 to 0 since
>> these are equivalent values.
> 
> I also believe that the test-case is fine.
> 
> Moreover I do think that the signature MUST fail if the values differ (24
> vs. 0). The general idea behind Canonical EXI is that the resulting octet
> stream is byte-for-byte the same, which would not be the case if we allow
> both 24 and 0.
> 
> 
>> Thanks for the explanation on issue #2. That gives us an understanding of the
>> behavior of EXIficient and OpenEXI. However, the single line in the specification
>> "If the Canonical EXI Option utcTime is equal to true, Date-Time values must be
>> represented using Coordinated Universal Time (UTC, sometimes called "Greenwich
>> Mean Time")." would not lead us to understand that the time zone offset should be
>> added as a duration to the time to compute UTC. In addition, since time zone offsets
>> are only expressed in hours and minutes, it follows that seconds never come into play
>> when normalizing a date/time to UTC. In fact, normalizing the seconds actually results
>> in a loss of information in that 60 in the seconds position is meant to represent a leap
>> second.
> 
> In general we expect dateTime canonicalization based on
> https://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#dateTime-canonical-representation.
> 
> The link Taki provided is one consequence of it.
> 
> 

all the best, Don
-- 
Don Brutzman  Naval Postgraduate School, Code USW/Br       brutzman@nps.edu
Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA   +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman

Received on Tuesday, 16 January 2018 00:56:15 UTC