Re: Communicating EXI-C14 options (was Re: Canonical EXI - CR Review)

Hi John,

I am very sorry for taking so long to reply.

I very much like your proposal and have started to look into it more closely. Having done so, I have some further comments/questions.

1. Using element names instead of attributes

You proposed to use attributes for indicating "omitOptionsDocument" or "utcTime".

I wonder whether there is any strong reason for doing so?

I would stick to elements because the EXI 1.0 options document uses elements, and it is more consistent to do the same. For example:

<exi-c14n:options xmlns:exi="http://www.w3.org/2009/exi"
    xmlns:exi-c14n="http://www.w3.org/2016/exi-c14n">
    <exi-c14n:omitOptionsDocument/>
    <exi-c14n:utcTime/>
    <exi:header>
        <exi:common>
            <exi:compression/>
            <exi:fragment/>
            <exi:schemaId>foo</exi:schemaId>
        </exi:common>
    </exi:header>
</exi-c14n:options>
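
To make the element-based form concrete, here is a minimal Python sketch (not a definitive implementation) of how a validator might read such a document. It assumes the namespace URIs and element names exactly as written in the example above; the function name read_c14n_options and the dictionary keys it returns are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used in this thread (the exi-c14n one is a proposal only).
EXI_NS = "http://www.w3.org/2009/exi"
C14N_NS = "http://www.w3.org/2016/exi-c14n"

DOC = """<exi-c14n:options xmlns:exi="http://www.w3.org/2009/exi"
    xmlns:exi-c14n="http://www.w3.org/2016/exi-c14n">
    <exi-c14n:omitOptionsDocument/>
    <exi-c14n:utcTime/>
    <exi:header>
        <exi:common>
            <exi:compression/>
            <exi:fragment/>
            <exi:schemaId>foo</exi:schemaId>
        </exi:common>
    </exi:header>
</exi-c14n:options>"""


def read_c14n_options(xml_text):
    """Return the EXI-C14N flags and the schemaId found in the document."""
    root = ET.fromstring(xml_text)

    def present(name):
        # An option counts as "on" when its empty element is a direct child.
        return root.find(f"{{{C14N_NS}}}{name}") is not None

    header = root.find(f"{{{EXI_NS}}}header")
    schema_id = None
    if header is not None:
        node = header.find(f".//{{{EXI_NS}}}schemaId")
        if node is not None:
            schema_id = node.text

    return {
        "omitOptionsDocument": present("omitOptionsDocument"),
        "utcTime": present("utcTime"),
        "hasExiHeader": header is not None,
        "schemaId": schema_id,
    }


options = read_c14n_options(DOC)
```

With elements, an absent child simply means the option is off, which mirrors how the EXI options document treats its own flags.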

2. Best Practices

In Canonical EXI we propose 3 best practices [1].

Option 1 - XML Signature element:
integrate the proposed XML format as is --> no issue

Option 2 - URI scheme fragment identifier
Reading your email, I get the impression that this is not an option for you. That said, I think adding the proposed "Canonical EXI Options document" as a fragment identifier still makes sense.

For example, we could define that the bytes (e.g., TWFuIGl) of the "Canonical EXI Options document" are appended to the URI, e.g.:

http://www.w3.org/TR/exi-c14n#TWFuIGl
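
A rough Python sketch of this idea follows. The thread does not say how the bytes would be turned into a fragment, so the base64url encoding without padding is purely an assumption here, and the function names options_to_uri and uri_to_options are hypothetical.

```python
import base64

C14N_URI = "http://www.w3.org/TR/exi-c14n"


def options_to_uri(options_bytes):
    """Append the encoded options document to the URI as a fragment.

    Assumption: base64url without '=' padding, so the fragment only
    contains characters that are safe inside a URI fragment.
    """
    token = base64.urlsafe_b64encode(options_bytes).decode("ascii").rstrip("=")
    return f"{C14N_URI}#{token}"


def uri_to_options(uri):
    """Recover the options document bytes from the fragment identifier."""
    base, _, token = uri.partition("#")
    if base != C14N_URI:
        raise ValueError("not a Canonical EXI identifier: " + uri)
    padding = "=" * (-len(token) % 4)  # restore the stripped padding
    return base64.urlsafe_b64decode(token + padding)


# Placeholder bytes standing in for an EXI-encoded options document.
sample = b"\x00\x01\xfe\xffoptions"
uri = options_to_uri(sample)
```

A validator would still need to know how to interpret the decoded bytes (for example, as an EXI-encoded options document), which this sketch leaves open.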

May I ask you to comment on that...

Option 3 - EXI options within EXI stream
The EXI options may be in the EXI stream. The Canonical EXI options need to be exchanged by some other means.

Thanks again for your proposal and continuous support!

-- Daniel

[1] https://www.w3.org/XML/EXI/docs/canonical/canonical-exi.html#exchangeEXIOptions





________________________________
From: John Schneider [john.schneider@agiledelta.com]
Sent: Thursday, 19 May 2016 23:56
To: Peintner, Daniel (ext) (CT RDA NEC EMB-DE)
Cc: public-exi@w3.org
Subject: Communicating EXI-C14 options (was Re: Canonical EXI - CR Review)

Daniel,

As indicated in our comments on the most recent draft of the Canonical EXI specification, we have some thoughts and a proposal we’d like to share regarding your question:

Do you or anyone else have a good proposal for how we can best indicate those four options? Text versions like http://www.w3.org/TR/exi-c14n#WithoutEXIOptionsAndWithDatetimeNormalization get pretty verbose. Using numbers or characters such as http://www.w3.org/TR/exi-c14n#A does not convey any meaning.

I think this is a good question and I agree with your observations above. This is an area where I think the Canonical EXI spec. could use a little more definition.

We believe EXI Canonicalization needs a well-defined, unambiguous method for communicating the EXI-C14N options and EXI options used when creating a signature, so a validator can reliably verify the signature. Ideally, this method would support all the various use cases for communicating options, including those discussed in Appendix D.2. Specifically, it should allow users to communicate EXI-C14N and EXI options inside a CanonicalizationMethod element, within an EXI header, via an out-of-band protocol or in an overarching specification.

Right now, the Canonical EXI specification defines a set of URIs for communicating the EXI-C14N options, then describes a variety of possible approaches for communicating the EXI options in appendix D.2. Both the EXI-C14N and EXI options are required for validating a signature, so we think it would be helpful if the spec. was just as clear about communicating EXI options as it is about communicating EXI-C14N options. In addition, we think it would reduce complexity if the spec. defined a single method for communicating both of them (compared to, for example, using some combination of a URI and an EXI options document).

While it is possible to come up with a single method for encoding all the EXI options and EXI-C14N options in a URI, this would likely be rather cryptic and/or verbose for users and would require implementors to add a non-trivial amount of extra encoding/decoding logic. In addition, it is not clear a URI format would provide the best way to address all use cases (e.g. communicating options out of band).

As such, we’d like to propose the spec. define a single URI to identify the Canonical EXI algorithm and define a single EXI-C14N options document for communicating the EXI-C14N and EXI options. The Canonical EXI URI would simply be:

"http://www.w3.org/TR/exi-c14n"

The EXI-C14N options would take the following form:

<exi-c14n:options omitOptionsDocument="true"? utcTime="true"?
xmlns:exi-c14n="http://www.w3.org/2016/exi-c14n"
xmlns:exi="http://www.w3.org/2009/exi">

<exi:header> … </exi:header>?
</exi-c14n:options>

This document defines a single attribute for each EXI-C14N option and reuses the EXI options document for communicating all the EXI options. It can be used inside the CanonicalizationMethod element, in an out-of-band protocol or within an overarching specification. The <exi:header> element is optional and would be omitted in use cases where the EXI options are provided inside the EXI header of the transmitted EXI stream or via some other mechanism. The entire <exi-c14n:options> document would be omitted from the CanonicalizationMethod element when Canonical EXI is used as the transmission format or when the options are provided some other way (e.g., an out-of-band protocol).

The <exi-c14n:options> document can be transmitted very efficiently using EXI. For efficiency, the two optional attributes could accept a single fixed value of “true”, specified using an enumeration. When absent, they would take on the default value of “false”. Since the <exi:header> element is already implemented by all conforming EXI implementations, implementation complexity should be very low compared to introducing a completely new way to communicate EXI options.
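
The default-to-false behaviour described above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in this e-mail (unprefixed attributes, a single permitted value of "true", absent meaning false); the function name read_attribute_options is hypothetical.

```python
import xml.etree.ElementTree as ET

EXI_NS = "http://www.w3.org/2009/exi"


def read_attribute_options(xml_text):
    """Return (omitOptionsDocument, utcTime, has_exi_header)."""
    root = ET.fromstring(xml_text)

    def flag(name):
        value = root.get(name)
        if value is None:
            return False  # absent attribute takes the default "false"
        if value != "true":
            # The proposal allows only the single fixed value "true".
            raise ValueError(f"{name} must be 'true' when present")
        return True

    has_header = root.find(f"{{{EXI_NS}}}header") is not None
    return flag("omitOptionsDocument"), flag("utcTime"), has_header


example = (
    '<exi-c14n:options utcTime="true" '
    'xmlns:exi-c14n="http://www.w3.org/2016/exi-c14n" '
    'xmlns:exi="http://www.w3.org/2009/exi"/>'
)
result = read_attribute_options(example)
```

Here the <exi:header> element is omitted, which corresponds to the case where the EXI options travel in the EXI stream itself or via some other mechanism.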

I believe this approach would satisfy all our requirements. It provides a single, simple mechanism for communicating all the EXI-C14N and EXI options. It is easy for users to read and understand and has low implementation complexity. It works for all known use cases and allows users to send EXI options inside or outside the EXI stream. EXI options are always expressed in the same format regardless of how they are sent. It is consistent with the way EXI options are communicated and can be sent efficiently using EXI. And lastly, it can be easily extended to support more options if needed in the future without adding a lot of complexity.

I hope this proposal is helpful. I look forward to hearing your questions, comments and feedback.

Best wishes!,

John

AgileDelta, Inc.
john.schneider@agiledelta.com
http://www.agiledelta.com


On May 18, 2016, at 1:44 PM, John Schneider <john.schneider@agiledelta.com> wrote:

Daniel,

Thank you very much for taking the time to review our comments. I’m glad they were helpful! I’ve answered the questions you posed and provided some additional comments in-line below.

All the best!,

John


3. Section 1.2, title: Change “Need of Canonical EXI” to “Need for Canonical EXI”

Following Don's comment, this has since been changed to "Motivation".

Great. That works. Thanks!

8. Section 3, constraint #4: This constraint should be removed so the
spec. does not force users to include the schemaId in every EXI Options
Document. This was discussed during the comment period for the Last Call
Working Draft and there seemed to be general consensus applications should
be able to choose whether to include the header options and schemaId in
the Canonical EXI form [1][2].

Regarding the question posed in [1], I don’t believe we need a separate
Canonical EXI Identifier to indicate the presence of the schemaId. The
EXI Options document already has this capability, so the presence of the
schemaId can be signaled in the same way other EXI options used for
Canonicalization are signaled.

[1] https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0008.html
[2] https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0006.html

We had discussions in the working group, which concluded as follows:

We will replace the statement "element schemaId MUST always be present" with "element schemaId SHOULD be present".
Further, we will add a warning note that speaks about why a schemaId should be present.


Thank you for thinking about this issue. It's an important one. I appreciate your willingness to relax the MUST to a SHOULD. I think this gets us closer to what is needed. However, it still implies that it is somehow incorrect or undesirable to omit the schema ID from the header.

As we discussed during the previous comment period, there are more secure, effective and efficient ways to ensure the same schema is used between signers/validators and encoders/decoders. These are needed and exist independently of Canonical EXI. If users have a more secure, effective and/or efficient way to ensure the same schema is used, I don’t think we should tell them they SHOULD use a less secure, less effective and/or less efficient way. Nor do I think we should imply their way is somehow incorrect or undesirable.

For those that are already using a more secure, effective and/or efficient method, including the schemaId in the header duplicates existing functionality, adds overhead and provides no benefit. I don’t believe Canonical EXI should try to dictate one, potentially duplicative, less efficient, less effective and/or less secure method for identifying schemas. We should let users design their architecture to be as reliable, efficient and secure as needed, especially if they really need a method that is more secure, reliable or efficient than that provided by the schemaId.

Note: As a related issue, we don’t believe we need an EXI Canonicalization Identifier to indicate whether the schemaId is present in the options document. The options document already has an explicit way to signal whether the schemaId is included — just like it has a way to signal all the other normal EXI options. We should treat the schemaId just like all the other options that can be expressed in the EXI options document.

15. Section 4.3.3.1, second sentence: The sentence says to “apply
lexical rule.” I assume this refers to implementing the semantics
associated with the EXI Preserve.lexicalValues option. However, the
text should be more specific so implementers have consistent interpretations.

I suggest removing the "apply lexical rule" part and changing it to
"When the grammar in effect is a schema-informed grammar use whiteSpace facet if any to normalize whitespaces."


Great. Thanks! That clarifies the intent.

18. Section 4.5.5: The absence of any canonicalization of Date-Time types
introduces a few places where implementations might produce different EXI
outputs for the same Infoset, breaking signatures. We recommend the following
canonicalization rules be established for Date-Time types:

• The Hour value used to compute the Time component MUST NOT be 24.
• The optional FractionalSecs component MUST be omitted if its value is zero.
• If the utcTime EXI Canonicalization option is set to true, Date-Time values
must be represented using Coordinated Universal Time (UTC, sometimes called "Greenwich Mean Time").

Note: in accordance with our discussion during Last Call [3], there are some
use cases that will need to preserve timezones (especially those using
Canonical EXI for transport) and some that will need to normalize timezones
to UTC. We recommend a utcTime option be added to the Canonical EXI
specification, so we can satisfy both sets of users.

The utcTime option may be expressed with a new Canonical EXI Identifier:
http://www.w3.org/TR/exi-c14n#utcTime.

[3] https://lists.w3.org/Archives/Public/public-exi-comments/2015Sep/0001.html
(discussion under comment #12)
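
The three Date-Time rules proposed in comment 18 can be sketched as follows. This is a minimal Python illustration, not the specification's algorithm: the function name canonical_datetime and its parameters are hypothetical, FractionalSecs is treated as an opaque optional integer, and the decision to leave values without a TimeZone untouched when utcTime is set is an assumption.

```python
from datetime import datetime, timedelta, timezone


def canonical_datetime(year, month, day, hour, minute, second,
                       fractional_secs=None, tz_minutes=None, utc_time=False):
    """Return (datetime, fractional_secs) after applying the three rules."""
    # Rule 1: Hour MUST NOT be 24; 24:00:00 becomes 00:00:00 of the next day.
    roll_days = 0
    if hour == 24:
        if minute or second or fractional_secs:
            raise ValueError("hour 24 is only valid as 24:00:00")
        hour, roll_days = 0, 1

    tz = None
    if tz_minutes is not None:
        tz = timezone(timedelta(minutes=tz_minutes))
    value = datetime(year, month, day, hour, minute, second, tzinfo=tz)
    value += timedelta(days=roll_days)

    # Rule 2: FractionalSecs MUST be omitted if its value is zero.
    if fractional_secs == 0:
        fractional_secs = None

    # Rule 3: with utcTime set, express the value in UTC. A value without
    # a TimeZone is left as-is (assumption), so its absence stays detectable.
    if utc_time and tz is not None:
        value = value.astimezone(timezone.utc)

    return value, fractional_secs


midnight, frac = canonical_datetime(2016, 5, 19, 24, 0, 0, fractional_secs=0)
in_utc, _ = canonical_datetime(2016, 5, 19, 23, 56, 0,
                               tz_minutes=120, utc_time=True)
```

In this sketch the utcTime flag changes only how an existing TimeZone is expressed; it never adds or removes one.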

I agree with the canonicalization rules for hour and FractionalSecs.


Great. Thanks!

W.r.t. the last rule I wonder whether we need an additional statement about whether the optional component TimeZone MUST be omitted or zero.


Good question. If we did this, the signature validator would not be able to detect whether something had added, removed or modified TimeZones in a given document. Adding, removing or modifying TimeZones is a significant change the validator should be able to detect during signature validation. So, I don’t think EXI Canonicalization should normalize the existence of the TimeZone component.

Moreover, at TPAC we decided to go with 6 possible choices (see https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0008.html) and found it too complicated. Even if we remove the option for schemaId we still have 4 possibilities.

1. default (with EXI options and no datetime normalization)
2. without EXI options
3. with utcTime normalization
4. without EXI options AND with utcTime normalization

Do you or anyone else have a good proposal for how we can best indicate those four options? Text versions like http://www.w3.org/TR/exi-c14n#WithoutEXIOptionsAndWithDatetimeNormalization get pretty verbose. Using numbers or characters such as http://www.w3.org/TR/exi-c14n#A does not convey any meaning.

This is another good question and an area where I think the spec. could use more definition. Right now, Appendix D.2 outlines a lot of options, but does not really provide a single, complete solution. We've spent some time thinking about this issue and have a proposal I'd like to share. Rather than clutter our feedback here, I will send a separate e-mail on this topic.

21. Section 4.5.6, first sentence: Change “Restricted Character Sets in EXI enable to restrict …” to “Restricted Character Sets in EXI enable one to restrict …"

Don proposed to change it to
"Restricted Character Sets are applied in EXI to restrict …”

Perfect. Thanks.

25. Section C.3: It logically follows that step 2 will follow step 1 and step 5 will follow step 4, so it's not clear why we are “jumping” from step 1 to step 2 and from step 4 to step 5. Perhaps, we’re just happy? :-)

Removed the "implicit" jumps, even though jumping makes one happy :-)

Yeah, me too. I think we can all agree on that! :-)


26. Section D.2: There are two other methods you may want to include for communicating EXI options.

First, a community of interest might decide on a set of Canonical EXI
options that are appropriate for their use case and codify them in their
specifications / standards. Implementations that comply with these
specifications / standards will all use the same options, without the need
for communicating them dynamically at runtime.

Second, a community of interest may devise a protocol for exchanging the
Canonical EXI options dynamically out-of-band as needed.

Also, based on input from Don, I added a final section, "D.2.4 Decision Criteria", which states that there might be other methods for communicating EXI options and includes your proposals.

Great. Thanks again!


Thanks!


________________________________
From: John Schneider [john.schneider@agiledelta.com]
Sent: Wednesday, 20 April 2016 00:34
To: Peintner, Daniel (ext)
Cc: public-exi@w3.org
Subject: Re: Canonical EXI - CR Review

Daniel,

Thank you for the opportunity to review an editor’s draft of the Canonical EXI specification before moving to Candidate Recommendation. This version of the specification is a fantastic improvement over the previous version. In addition to the specific improvements discussed during Last Call, it is now based on the Infoset, making it more general and more widely applicable. In addition, it is more declarative, reducing opportunities for over-specification. Thank you for all your work on this draft and thanks once again for incorporating our comments and user feedback on the Last Call Working Draft.

We’ve completed a thorough review of the subject editor’s draft and have some comments we’d like to share. I hope our comments help to improve the overall quality of the Canonical EXI specification and move it forward to CR.

Best wishes!,

John

AgileDelta, Inc.
john.schneider@agiledelta.com
http://www.agiledelta.com

——— Detailed Comments ———

1. Abstract, 2nd sentence: Change “… accounts for the permissible changes.” to “… accounts for the permissible differences.”

2. Section 1, sentence 2: Change “… streams which are equivalent …” to “… streams that are equivalent …”

3. Section 1.2, title: Change “Need of Canonical EXI” to “Need for Canonical EXI”

4. Section 1.2, sentence 1: Change “… have difficulties to handle …” to “… have difficulties handling …”

5. Section 1.3, paragraph 1, last sentence: Add a comma after “If there is equivalence”

6. Section 1.3, last sentence: Change “… this strategy is not well suited …” to “… this strategy is not the most efficient and is not well suited …”

7. Section 3, paragraph 3, sentence 2: Change to “If the Canonical EXI Identifier is equal to http://www.w3.org/TR/exi-c14n, the Presence Bit for the EXI Options MUST be 1 (true) to indicate the EXI Options are present.” Stating this condition in the positive avoids over specifying the behavior of the presence bit e.g. for environments that don’t specify a Canonical EXI Identifier at all.

8. Section 3, constraint #4: This constraint should be removed so the spec. does not force users to include the schemaId in every EXI Options Document. This was discussed during the comment period for the Last Call Working Draft and there seemed to be general consensus applications should be able to choose whether to include the header options and schemaId in the Canonical EXI form [1][2].

Regarding the question posed in [1], I don’t believe we need a separate Canonical EXI Identifier to indicate the presence of the schemaId. The EXI Options document already has this capability, so the presence of the schemaId can be signaled in the same way other EXI options used for Canonicalization are signaled.

[1] https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0008.html
[2] https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0006.html

9. Section 4.2, paragraph 2: The meaning of the nomenclature “event (-code)” is not clear. Recommend clarifying or removing this nomenclature.

10. Section 4.2.1, note, sentence 2: Change “The specifications mentions …” to “The specification mentions …”

11. Section 4.3.3, sentence 2: Change “One exception to this statement are …” to “One exception to this statement is …”

12. Section 4.3.3: Sentence 3 indicates Canonical EXI *SHOULD* follow the whitespace rules, then sentence 4 says xml:space=“preserve” *MUST* be respected, but provides some exceptions. To be more precise, recommend wording more like this: “Except as specified below, Canonical EXI MUST respect xml:space=‘preserve’”, then provide the exceptions. We must be careful to avoid *SHOULD* in this spec., as this can lead to differences in implementations that will break signatures.

13. Section 4.3.3, bullet 2: Change “Use-cases requiring whitespace might considering to use …” to “Use-cases requiring whitespace preservation might consider using…”

14. Section 4.3.3, last sentence: Change “When the current xml:space is not “preserve” we differ between  …” to “When the current value of xml:space is not “preserve”, different rules apply for …”

15. Section 4.3.3.1, second sentence: The sentence says to “apply lexical rule.” I assume this refers to implementing the semantics associated with the EXI Preserve.lexicalValues option. However, the text should be more specific so implementers have consistent interpretations.

16. Section 4.5.3, bullet 1: Change “A sign value of one (1) MUST be changed to zero (0) if both …” to “The sign value MUST be zero (0) if both …” Declarative vs. procedural.

17. Section 4.5.4, bullets 1 and 2: Change “… MUST be changed to 0.” to “is not permitted.” Declarative vs. procedural.

18. Section 4.5.5: The absence of any canonicalization of Date-Time types introduces a few places where implementations might produce different EXI outputs for the same Infoset, breaking signatures. We recommend the following canonicalization rules be established for Date-Time types:


*   The Hour value used to compute the Time component MUST NOT be 24.
*   The optional FractionalSecs component MUST be omitted if its value is zero.
*   If the utcTime EXI Canonicalization option is set to true, Date-Time values must be represented using Coordinated Universal Time (UTC, sometimes called "Greenwich Mean Time").

Note: in accordance with our discussion during Last Call [3], there are some use cases that will need to preserve timezones (especially those using Canonical EXI for transport) and some that will need to normalize timezones to UTC. We recommend a utcTime option be added to the Canonical EXI specification, so we can satisfy both sets of users.

The utcTime option may be expressed with a new Canonical EXI Identifier: http://www.w3.org/TR/exi-c14n#utcTime.

[3] https://lists.w3.org/Archives/Public/public-exi-comments/2015Sep/0001.html (discussion under comment #12)

19. Section 4.5.6, paragraph 2, sentence 3: Change “… a string value MUST be using a compact identifier …” to “… a string value MUST be represented using a compact identifier …”

20. Section 4.5.6, bullet 1, last sentence: Change “… one local partitions hit …” to “… one local partition entry …”

21. Section 4.5.6, first sentence: Change “Restricted Character Sets in EXI enable to restrict …” to “Restricted Character Sets in EXI enable one to restrict …"

22. Section B.1, paragraph 2: Change “… does not use plain text XML data and its associated overhead.” to “… does not require the overhead of plain text XML.”

22. Section B.1, Caution statement: Rather than providing a caution statement that warns of some unspecified danger, recommend providing a note that identifies what a user must do to successfully use Canonical EXI with XML intermediaries. For example,

"Note: In environments that use Canonical EXI for signing and have intermediary nodes that represent the associated Infoset using text XML, it is important to ensure the Canonical EXI signer and validator use the same set of options (see section D.2)."

23. Section B.2, paragraph 2: Change “… character model normalization has been moved out of scope ...” to “… character model normalization is out of scope …”

24. Section B.3: Recommend deleting this section. See comment 18.

25. Section C.3: It logically follows that step 2 will follow step 1 and step 5 will follow step 4, so it's not clear why we are “jumping” from step 1 to step 2 and from step 4 to step 5. Perhaps, we’re just happy? :-)

26. Section D.2: There are two other methods you may want to include for communicating EXI options.

First, a community of interest might decide on a set of Canonical EXI options that are appropriate for their use case and codify them in their specifications / standards. Implementations that comply with these specifications / standards will all use the same options, without the need for communicating them dynamically at runtime.

Second, a community of interest may devise a protocol for exchanging the Canonical EXI options dynamically out-of-band as needed.



On Mar 30, 2016, at 3:24 AM, Peintner, Daniel (ext) <daniel.peintner.ext@siemens.com> wrote:

All,

With the latest updates I believe we resolved all issues w.r.t. Canonical EXI.

Before moving to Candidate Recommendation (CR) I ask everyone to do a review of the document [1].

A diff compared to the last call document can be found here [2].

Thanks,

-- Daniel

[1] https://www.w3.org/XML/EXI/docs/canonical/canonical-exi.html
[2] http://services.w3.org/htmldiff?doc1=http://www.w3.org/TR/exi-c14n/&doc2=https://www.w3.org/XML/EXI/docs/canonical/canonical-exi.html

Received on Friday, 1 July 2016 13:48:31 UTC