Re: Canonical EXI - CR Review

Daniel,

Thank you very much for taking the time to review our comments. I’m glad they were helpful! I’ve answered the questions you posed and provided some additional comments in-line below. 

	All the best!

	John


>> 3. Section 1.2, title: Change “Need of Canonical EXI” to “Need for Canonical EXI”
> 
> Following Don's comment, this has since been changed to “Motivation”.

Great. That works. Thanks! 

>> 8. Section 3, constraint #4: This constraint should be removed so the
>> spec. does not force users to include the schemaId in every EXI Options
>> Document. This was discussed during the comment period for the Last Call
>> Working Draft and there seemed to be general consensus applications should
>> be able to choose whether to include the header options and schemaId in
>> the Canonical EXI form [1][2].
> 
>> Regarding the question posed in [1], I don’t believe we need a separate
>> Canonical EXI Identifier to indicate the presence of the schemaId. The
>> EXI Options document already has this capability, so the presence of the
>> schemaId can be signaled in the same way other EXI options used for
>> Canonicalization are signaled.
>> 
>> [1] https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0008.html
>> [2] https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0006.html
> 
> We had discussions in the working group which concluded as follows:
> 
> We will replace the statement "element schemaId MUST always be present" with "element schemaId SHOULD be present".
> Further, we will add a warning note that speaks about why a schemaId should be present.
> 

Thank you for thinking about this issue. It’s an important one. I appreciate your willingness to relax the MUST to a SHOULD. I think this gets us closer to what is needed. However, it still implies that it is somehow incorrect or undesirable to omit the schemaId from the header.

As we discussed during the previous comment period, there are more secure, effective and efficient ways to ensure the same schema is used between signers/validators and encoders/decoders. These are needed and exist independently of Canonical EXI. If users have a more secure, effective and/or efficient way to ensure the same schema is used, I don’t think we should tell them they SHOULD use a less secure, less effective and/or less efficient way. Nor do I think we should imply their way is somehow incorrect or undesirable. 

For those that are already using a more secure, effective and/or efficient method, including the schemaId in the header duplicates existing functionality, adds overhead and provides no benefit. I don’t believe Canonical EXI should try to dictate one, potentially duplicative, less efficient, less effective and/or less secure method for identifying schemas. We should let users design their architecture to be as reliable, efficient and secure as needed, especially if they really need a method that is more secure, reliable or efficient than that provided by the schemaId. 

Note: As a related issue, we don’t believe we need an EXI Canonicalization Identifier to indicate whether the schemaId is present in the options document. The options document already has an explicit way to signal whether the schemaId is included — just like it has a way to signal all the other normal EXI options. We should treat the schemaId just like all the other options that can be expressed in the EXI options document. 
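
To illustrate the point, here is a minimal Python sketch (an illustration only, not proposed spec text) of how a processor could tell whether an EXI Options document carries a schemaId simply by inspecting the document itself. The function name has_schema_id and the sample documents are hypothetical, and the sketch assumes the options document is available in its XML form using the EXI header namespace.

```python
import xml.etree.ElementTree as ET

# Namespace of the EXI header options document (assumed XML representation).
EXI_NS = "http://www.w3.org/2009/exi"


def has_schema_id(options_xml: str) -> bool:
    """Return True if the options document contains a <schemaId> element.

    Hypothetical helper: the presence of the element in the document is
    itself the signal, so no separate identifier is consulted.
    """
    root = ET.fromstring(options_xml)
    path = f"{{{EXI_NS}}}common/{{{EXI_NS}}}schemaId"
    return root.find(path) is not None


# Hypothetical sample options documents, one with and one without a schemaId.
WITH_ID = (
    '<header xmlns="http://www.w3.org/2009/exi">'
    "<common><schemaId>urn:example:schema</schemaId></common>"
    "</header>"
)
WITHOUT_ID = '<header xmlns="http://www.w3.org/2009/exi"><strict/></header>'

if __name__ == "__main__":
    print(has_schema_id(WITH_ID))
    print(has_schema_id(WITHOUT_ID))
```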

>> 15. Section 4.3.3.1, second sentence: The sentence says to “apply
>> lexical rule.” I assume this refers to implementing the semantics
>> associated with the EXI Preserve.lexicalValues option. However, the
>> text should be more specific so implementers have consistent interpretations.
> 
> I suggest removing the "apply lexical rule" part and changing it to
> "When the grammar in effect is a schema-informed grammar, use the whiteSpace facet, if any, to normalize whitespace."
> 

Great. Thanks! That clarifies the intent.
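
For readers less familiar with the whiteSpace facet, the following minimal Python sketch shows my understanding of its three values ("preserve", "replace" and "collapse") as defined by XML Schema. It is only an illustration of the normalization the revised sentence refers to. The function name normalize_whitespace is hypothetical and nothing here is proposed spec text.

```python
import re


def normalize_whitespace(value: str, facet: str = "preserve") -> str:
    """Apply an XML Schema whiteSpace facet value to a string (illustrative)."""
    if facet == "preserve":
        # "preserve": the value is left untouched.
        return value
    # "replace": every tab, line feed and carriage return becomes a space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if facet == "replace":
        return replaced
    if facet == "collapse":
        # "collapse": after "replace", runs of spaces shrink to one space
        # and leading/trailing spaces are removed.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whiteSpace facet: {facet!r}")


if __name__ == "__main__":
    sample = "  a\t\tb \n c  "
    for facet in ("preserve", "replace", "collapse"):
        print(facet, repr(normalize_whitespace(sample, facet)))
```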

>> 18. Section 4.5.5: The absence of any canonicalization of Date-Time types
>> introduces a few places where implementations might produce different EXI
>> outputs for the same Infoset, breaking signatures. We recommend the following
>> canonicalization rules be established for Date-Time types:
>> 
>> • The Hour value used to compute the Time component MUST NOT be 24.
>> • The optional FractionalSecs component MUST be omitted if its value is zero.
>> • If the utcTime EXI Canonicalization option is set to true, Date-Time values
>> must be represented using Coordinated Universal Time (UTC, sometimes called "Greenwich Mean Time").
> 
>> Note: in accordance with our discussion during Last Call [3], there are some
>> use cases that will need to preserve timezones (especially those using
>> Canonical EXI for transport) and some that will need to normalize timezones
>> to UTC. We recommend a utcTime option be added to the Canonical EXI
>> specification, so we can satisfy both sets of users.
>> 
>> The utcTime option may be expressed with a new Canonical EXI Identifier:
>> http://www.w3.org/TR/exi-c14n#utcTime.
>> 
>> [3] https://lists.w3.org/Archives/Public/public-exi-comments/2015Sep/0001.html
>> (discussion under comment #12)
> 
> I agree with the canonicalization rules for hour and FractionalSecs.
> 

Great. Thanks!

> W.r.t. the last rule I wonder whether we need an additional statement about whether the optional component TimeZone MUST be omitted or zero.
> 

Good question. If we did this, the signature validator would not be able to detect whether something had added, removed or modified TimeZones in a given document. Adding, removing or modifying TimeZones is a significant change the validator should be able to detect during signature validation. So, I don’t think EXI Canonicalization should normalize the existence of the TimeZone component. 
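
To make the combined behavior concrete, here is a minimal Python sketch of the three rules together with the point above about leaving the presence of the TimeZone component alone. It is a sketch under stated assumptions, not an implementation of the spec: the DateTimeValue type and canonicalize function are hypothetical, the components are held as plain integers rather than in their EXI encodings, an hour of 24 is assumed to mean 00:00:00 of the following day (as in XML Schema), and years are limited to the range Python's datetime supports.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class DateTimeValue:
    """Hypothetical container for the components of a Date-Time value."""
    year: int
    month: int
    day: int
    hour: int                               # 24 is accepted on input only
    minute: int
    second: int
    fractional_secs: Optional[int] = None   # None: component is absent
    tz_minutes: Optional[int] = None        # None: no TimeZone component


def canonicalize(v: DateTimeValue, utc_time: bool = False) -> DateTimeValue:
    # Rule 1: never emit hour 24; assume 24:00:00 is 00:00:00 the next day.
    if v.hour == 24 and (v.minute or v.second or v.fractional_secs):
        raise ValueError("hour 24 is only valid as 24:00:00")
    moment = datetime(v.year, v.month, v.day, v.hour % 24, v.minute, v.second)
    if v.hour == 24:
        moment += timedelta(days=1)

    # Rule 3: shift to UTC only when requested AND a TimeZone is present.
    # A value without a TimeZone stays without one, so a validator can
    # still detect added or removed TimeZones.
    tz = v.tz_minutes
    if utc_time and tz is not None:
        moment -= timedelta(minutes=tz)
        tz = 0

    # Rule 2: a zero FractionalSecs component is omitted.
    frac = v.fractional_secs if v.fractional_secs else None

    return DateTimeValue(moment.year, moment.month, moment.day,
                         moment.hour, moment.minute, moment.second, frac, tz)


if __name__ == "__main__":
    print(canonicalize(DateTimeValue(2016, 4, 19, 24, 0, 0, 0, 120), utc_time=True))
```

With utc_time left at its default of false, the TimeZone value passes through unchanged, which corresponds to the transport use cases mentioned in comment 18.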

> Moreover, at TPAC we decided to go with 6 possible choices (see https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0008.html) and found it too complicated. Even if we remove the option for schemaId we still have 4 possibilities.
> 
> 1. default (with EXI options and no datetime normalization)
> 2. without EXI options
> 3. with utcTime normalization
> 4. without EXI options AND with utcTime normalization
> 
> Do you or anyone else have a good proposal for how we can best indicate those four options? Text versions like http://www.w3.org/TR/exi-c14n#WithoutEXIOptionsAndWithDatetimeNormalization get pretty verbose. Using numbers or characters such as http://www.w3.org/TR/exi-c14n#A does not convey any meaning.

This is another good question and an area where I think the spec. could use more definition. Right now, Appendix D.2 outlines a lot of options, but does not really provide a single, complete solution. We've spent some time thinking about this issue and have a proposal I'd like to share. Rather than clutter our feedback here, I will send a separate e-mail on this topic.

> 21. Section 4.5.6, first sentence: Change “Restricted Character Sets in EXI enable to restrict …” to “Restricted Character Sets in EXI enable one to restrict …”
> 
> Don proposed to change it to
> “Restricted Character Sets are applied in EXI to restrict …”

Perfect. Thanks.

> 25. Section C.3: It logically follows that step 2 will follow step 1 and step 5 will follow step 4, so it’s not clear why we are “jumping” from step 1 to step 2 and from step 4 to step 5. Perhaps, we’re just happy? :-)
> 
> Removed "implicit" jumps even though jumping makes happy :-)

Yeah, me too. I think we can all agree on that! :-)

> 
>> 26. Section D.2: There are two other methods you may want to include for communicating EXI options.
>> 
>> First, a community of interest might decide on a set of Canonical EXI
>> options that are appropriate for their use case and codify them in their
>> specifications / standards. Implementations that comply with these
>> specifications / standards will all use the same options, without the need
>> for communicating them dynamically at runtime.
>> 
>> Second, a community of interest may devise a protocol for exchanging the
>> Canonical EXI options dynamically out-of-band as needed.
> 
> Also based on the input from Don I added a last section "D.2.4 Decision Criteria" which states that there might be other methods for communicating EXI options by also adding your proposals.

Great. Thanks again!

> 
> Thanks!
> 
> ________________________________
> From: John Schneider [john.schneider@agiledelta.com]
> Sent: Wednesday, 20 April 2016 00:34
> To: Peintner, Daniel (ext)
> Cc: public-exi@w3.org
> Subject: Re: Canonical EXI - CR Review
> 
> Daniel,
> 
> Thank you for the opportunity to review an editor’s draft of the Canonical EXI specification before moving to Candidate Recommendation. This version of the specification is a fantastic improvement over the previous version. In addition to the specific improvements discussed during Last Call, it is now based on the Infoset, making it more general and more widely applicable. In addition, it is more declarative, reducing opportunities for over-specification. Thank you for all your work on this draft and thanks once again for incorporating our comments and user feedback on the Last Call Working Draft.
> 
> We’ve completed a thorough review of the subject editor’s draft and have some comments we’d like to share. I hope our comments help to improve the overall quality of the Canonical EXI specification and move it forward to CR.
> 
> Best wishes!
> 
> John
> 
> AgileDelta, Inc.
> john.schneider@agiledelta.com
> http://www.agiledelta.com
> 
> ——— Detailed Comments ———
> 
> 1. Abstract, 2nd sentence: Change “… accounts for the permissible changes.” to “… accounts for the permissible differences.”
> 
> 2. Section 1, sentence 2: Change “… streams which are equivalent …” to “… streams that are equivalent …”
> 
> 3. Section 1.2, title: Change “Need of Canonical EXI” to “Need for Canonical EXI”
> 
> 4. Section 1.2, sentence 1: Change “… have difficulties to handle …” to “… have difficulties handling …”
> 
> 5. Section 1.3, paragraph 1, last sentence: Add a comma after “If there is equivalence”
> 
> 6. Section 1.3, last sentence: Change “… this strategy is not well suited …” to “… this strategy is not the most efficient and is not well suited …”
> 
> 7. Section 3, paragraph 3, sentence 2: Change to “If the Canonical EXI Identifier is equal to http://www.w3.org/TR/exi-c14n, the Presence Bit for the EXI Options MUST be 1 (true) to indicate the EXI Options are present.” Stating this condition in the positive avoids over specifying the behavior of the presence bit e.g. for environments that don’t specify a Canonical EXI Identifier at all.
> 
> 8. Section 3, constraint #4: This constraint should be removed so the spec. does not force users to include the schemaId in every EXI Options Document. This was discussed during the comment period for the Last Call Working Draft and there seemed to be general consensus applications should be able to choose whether to include the header options and schemaId in the Canonical EXI form [1][2].
> 
> Regarding the question posed in [1], I don’t believe we need a separate Canonical EXI Identifier to indicate the presence of the schemaId. The EXI Options document already has this capability, so the presence of the schemaId can be signaled in the same way other EXI options used for Canonicalization are signaled.
> 
> [1] https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0008.html
> [2] https://lists.w3.org/Archives/Public/public-exi-comments/2015Oct/0006.html
> 
> 9. Section 4.2, paragraph 2: The meaning of the nomenclature “event (-code)” is not clear. Recommend clarifying or removing this nomenclature.
> 
> 10. Section 4.2.1, note, sentence 2: Change “The specifications mentions …” to “The specification mentions …”
> 
> 11. Section 4.3.3, sentence 2: Change “One exception to this statement are …” to “One exception to this statement is …”
> 
> 12. Section 4.3.3: Sentence 3 indicates Canonical EXI *SHOULD* follow the whitespace rules, then sentence 4 says xml:space=“preserve” *MUST* be respected, but provides some exceptions. To be more precise, recommend wording more like this: “Except as specified below, Canonical EXI MUST respect xml:space=‘preserve’”, then provide the exceptions. We must be careful to avoid *SHOULD* in this spec., as this can lead to differences in implementations that will break signatures.
> 
> 13. Section 4.3.3, bullet 2: Change “Use-cases requiring whitespace might considering to use …” to “Use-cases requiring whitespace preservation might consider using…”
> 
> 14. Section 4.3.3, last sentence: Change “When the current xml:space is not “preserve” we differ between  …” to “When the current value of xml:space is not “preserve”, different rules apply for …”
> 
> 15. Section 4.3.3.1, second sentence: The sentence says to “apply lexical rule.” I assume this refers to implementing the semantics associated with the EXI Preserve.lexicalValues option. However, the text should be more specific so implementers have consistent interpretations.
> 
> 16. Section 4.5.3, bullet 1: Change “A sign value of one (1) MUST be changed to zero (0) if both …” to “The sign value MUST be zero (0) if both …” Declarative vs. procedural.
> 
> 17. Section 4.5.4, bullets 1 and 2: Change “… MUST be changed to 0.” to “is not permitted.” Declarative vs. procedural.
> 
> 18. Section 4.5.5: The absence of any canonicalization of Date-Time types introduces a few places where implementations might produce different EXI outputs for the same Infoset, breaking signatures. We recommend the following canonicalization rules be established for Date-Time types:
> 
> 
>  *   The Hour value used to compute the Time component MUST NOT be 24.
>  *   The optional FractionalSecs component MUST be omitted if its value is zero.
>  *   If the utcTime EXI Canonicalization option is set to true, Date-Time values must be represented using Coordinated Universal Time (UTC, sometimes called "Greenwich Mean Time").
> 
> Note: in accordance with our discussion during Last Call [3], there are some use cases that will need to preserve timezones (especially those using Canonical EXI for transport) and some that will need to normalize timezones to UTC. We recommend a utcTime option be added to the Canonical EXI specification, so we can satisfy both sets of users.
> 
> The utcTime option may be expressed with a new Canonical EXI Identifier: http://www.w3.org/TR/exi-c14n#utcTime.
> 
> [3] https://lists.w3.org/Archives/Public/public-exi-comments/2015Sep/0001.html (discussion under comment #12)
> 
> 19. Section 4.5.6, paragraph 2, sentence 3: Change “… a string value MUST be using a compact identifier …” to “… a string value MUST be represented using a compact identifier …”
> 
> 20. Section 4.5.6, bullet 1, last sentence: Change “… one local partitions hit …” to “… one local partition entry …”
> 
> 21. Section 4.5.6, first sentence: Change “Restricted Character Sets in EXI enable to restrict …” to “Restricted Character Sets in EXI enable one to restrict …”
> 
> 22. Section B.1, paragraph 2: Change “… does not use plain text XML data and its associated overhead.” to “… does not require the overhead of plain text XML.”
> 
> 22. Section B.1, Caution statement: Rather than providing a caution statement that warns of some unspecified danger, recommend providing a note that identifies what a user must do to successfully use Canonical EXI with XML intermediaries. For example,
> 
> "Note: In environments that use Canonical EXI for signing and have intermediary nodes that represent the associated Infoset using text XML, it is important to ensure the Canonical EXI signer and validator use the same set of options (see section D.2)."
> 
> 23. Section B.2, paragraph 2: Change “… character model normalization has been moved out of scope ...” to “… character model normalization is out of scope …”
> 
> 24. Section B.3: Recommend deleting this section. See comment 18.
> 
> 25. Section C.3: It logically follows that step 2 will follow step 1 and step 5 will follow step 4, so it’s not clear why we are “jumping” from step 1 to step 2 and from step 4 to step 5. Perhaps, we’re just happy? :-)
> 
> 26. Section D.2: There are two other methods you may want to include for communicating EXI options.
> 
> First, a community of interest might decide on a set of Canonical EXI options that are appropriate for their use case and codify them in their specifications / standards. Implementations that comply with these specifications / standards will all use the same options, without the need for communicating them dynamically at runtime.
> 
> Second, a community of interest may devise a protocol for exchanging the Canonical EXI options dynamically out-of-band as needed.
> 
> 
> 
> On Mar 30, 2016, at 3:24 AM, Peintner, Daniel (ext) <daniel.peintner.ext@siemens.com> wrote:
> 
> All,
> 
> With the latest updates I believe we resolved all issues w.r.t. Canonical EXI.
> 
> Before moving to Candidate Recommendation (CR) I ask everyone to do a review of the document [1].
> 
> A diff compared to the last call document can be found here [2].
> 
> Thanks,
> 
> -- Daniel
> 
> [1] https://www.w3.org/XML/EXI/docs/canonical/canonical-exi.html
> [2] http://services.w3.org/htmldiff?doc1=http://www.w3.org/TR/exi-c14n/&doc2=https://www.w3.org/XML/EXI/docs/canonical/canonical-exi.html
> 

AgileDelta, Inc.
john.schneider@agiledelta.com
http://www.agiledelta.com
w: 425-644-7122
m: 425-503-3403
f: 425-644-7126

Received on Wednesday, 18 May 2016 20:44:40 UTC