- From: John Schneider <john.schneider@agiledelta.com>
- Date: Wed, 23 Dec 2015 09:25:25 -0800
- To: Alessandro Triglia <alessandrot60@live.com>
- Cc: public-exi@w3.org
- Message-Id: <772B8BD8-7ECA-4E9B-BD50-7A3EABA138E2@agiledelta.com>
Alessandro,

It’s nice to hear from you! Yes, I’d be happy to clarify.

The purpose of the statement is to precisely and unambiguously identify the conditions under which EE is permitted and thus identify when there are two different ways to encode an empty element. When the first condition is true (strict=false), there is always a production of the form LeftHandSide : EE available in the current element grammar with an event code of length 2 (in accordance with [1]). When strict is true, there is no production of this form with event code of length 2 in the current element grammar, and the only time EE is available is when there is a production of the form LeftHandSide : EE in the current element grammar with event code of length 1. So, the reason the EE event has to have an event code of length 1 is that this is the only possibility allowed by the EXI grammars when strict is true. And yes, the phrase “current element grammar” is used throughout the EXI spec to identify the current non-terminal.

That said, as I was thinking about your question it struck me that there is a better way to express this condition. In particular, we could just say:

“When the current element grammar contains a production of the form LeftHandSide : EE, EXI can represent the content of an empty element explicitly as an empty CH event or implicitly as a SE event immediately followed by an EE event. In these circumstances, Canonical EXI MUST represent an empty element by a SE event followed by an EE event.”

I think this would be an even simpler, more general and perhaps clearer way to specify this rule in the spec.

Thanks for asking your question and helping to simplify things further!

Happy Holidays!,

John

[1] http://www.w3.org/TR/exi/#addingProductions

> On Dec 22, 2015, at 5:54 PM, Alessandro Triglia <alessandrot60@live.com> wrote:
>
> John,
>
> Could you please clarify the part of the sentence, “...
or the current element grammar contains a production of the form LeftHandSide : EE with event code of length 1”?
>
> The case of strict=false is clear to me. I am looking at the case of strict=true and I don’t understand how the above condition addresses this case. Does the sentence contain an implicit condition that LeftHandSide be the current nonterminal (as opposed to the grammar just containing that production)? Also, could you explain why the EE event has to be of length 1 in order to produce those effects?
>
> I have read all the recent public emails on this subject and I agree with everything you wrote on this topic, so I am not questioning your proposal. My questions are about the wording.
>
> Thanks,
> Alessandro
>
>
> From: John Schneider [mailto:john.schneider@agiledelta.com]
> Sent: Wednesday, December 16, 2015 14:34
> To: Peintner, Daniel (ext) <daniel.peintner.ext@siemens.com>
> Cc: Takuki Kamiya <tkamiya@us.fujitsu.com>; public-exi@w3.org
> Subject: Re: Support for Canonical EXI interoperability test in TTFMS
>
> All,
>
> I’ve thought about this more and talked to others and I’m still a bit perplexed as to why we would take this approach. If I understand correctly, every time we get an empty value, we are going to spend some extra time to determine whether the associated DTR can represent the value less efficiently as CH(“”) EE rather than just EE. And every time we determine the DTR can represent the value less efficiently, we’re going to do it. To me, this does not seem like the best use of resources.
>
> The argument has been made that this approach is somehow more consistent than always encoding the empty value the most efficient way possible. However, this approach encodes empty values differently depending on their data types and encodes the same document differently with schemas vs. without schemas.
So, as Taki has pointed out, it is inconsistent in different, and to me more perplexing, ways.
>
> The alternate proposal says always encode an empty value in the most efficient way possible. Only encode it differently if the most efficient way is not available. The original proposal says always encode the empty value in the least efficient way possible. Only encode it differently if the least efficient way is not available. And it requires implementations to spend extra time to check if the DTR has the ability to encode the value less efficiently.
>
> At the end of the day, it’s not clear to me why we would increase code complexity and use more resources in order to detect and use a less efficient encoding method. Once again, here is the alternate proposal:
>
> “When strict is false or the current element grammar contains a production of the form LeftHandSide : EE with event code of length 1, EXI can represent the content of an empty element explicitly as an empty CH event or implicitly as a SE event immediately followed by an EE event. In these circumstances, Canonical EXI MUST represent an empty element by a SE event followed by an EE event.”
>
> Again, I hope this alternate proposal is helpful and serves to improve the quality of the Canonical EXI specification.
>
> All the best!,
>
> John
>
>
>> On Dec 16, 2015, at 1:48 AM, Peintner, Daniel (ext) <daniel.peintner.ext@siemens.com> wrote:
>>
>> Hi Taki, all,
>>
>>
>> I think you raise a good point and I believe we can describe the 2nd condition more generally so that it also accounts for user-defined datatype representations.
>>
>> --------------
>> A canonical EXI processor MUST add a CH event with a String of length 0 (zero)
>>
>> * if processing the EE event of an XML Information item fails by means of existing
>> event codes of length 1 (i.e., no EE exists), and
>>
>> * when processing a schema-informed grammar where a CH [schema-typed value]
>> event code of length 1 exists that allows for a String of length 0 (zero)
>> --------------
>> What do you think?
>>
>> Thanks,
>>
>> -- Daniel
>>
>>
>> ________________________________
>> From: Takuki Kamiya [tkamiya@us.fujitsu.com]
>> Sent: Monday, 7 December 2015 21:02
>> To: Peintner, Daniel (ext); public-exi@w3.org
>> Subject: RE: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Daniel,
>>
>> The issue was resolved in this version.
>>
>> Additionally, I wonder how we suggest implementations behave when the
>> [schema-typed value] is associated with a user-defined datatype
>> representation and it permits "". I think it makes sense for canonical EXI
>> to be additionally prepared for it.
>>
>> The second condition will be common whether we decide to take one approach
>> or another. Only the first condition will differ.
>>
>> In one approach, the condition is shown below as you described.
>>
>> In the other approach, the condition would be described as:
>>
>> * if processing the EE event of an XML Information item fails by means of any
>> available productions (i.e., no EE exists), and
>>
>> Besides, two separate consistencies are at play here.
>>
>> One is consistency of token sequence across settings whether deviation
>> is allowed or not. This is what the first approach is trying to observe.
>>
>> The other is consistency between Infoset item sequence and EXI token
>> sequence. This is what the second approach adheres to, as long as it is possible.
>>
>> I think both consistencies are important.
>>
>> Thank you,
>>
>> Takuki Kamiya
>> Fujitsu Laboratories of America
>>
>>
>> -----Original Message-----
>> From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
>> Sent: Monday, December 07, 2015 2:37 AM
>> To: Takuki Kamiya; public-exi@w3.org
>> Subject: AW: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Taki,
>>
>> Now I see your point and I think the issue you raised is what was the original plan.
>>
>> To make it precise I would suggest rephrasing condition #2. For readability I added the entire rule again.
>>
>> --------------
>> A canonical EXI processor MUST add a CH event with a String of length 0 (zero)
>>
>> * if processing the EE event of an XML Information item fails by means of existing
>> event codes of length 1 (i.e., no EE exists), and
>>
>> * when processing a schema-informed grammar where a CH [schema-typed value]
>> event code of length 1 exists with Built-in EXI Datatype Representation
>> "Binary" (exi:base64Binary and exi:hexBinary),
>> "String", "List" or an Enumeration with an empty item.
>>
>> --------------
>>
>> Does this resolve the issue for you?
>>
>> Thanks,
>>
>> -- Daniel
>>
>>
>> ________________________________
>> From: Takuki Kamiya [tkamiya@us.fujitsu.com]
>> Sent: Saturday, 5 December 2015 02:11
>> To: Peintner, Daniel (ext); public-exi@w3.org
>> Subject: RE: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Daniel,
>>
>> Say you have the following instance and schema.
>>
>> Instance:
>> <A></A>
>>
>> Schema:
>> <xsd:complexType name="A" mixed="true">
>>   <xsd:sequence>
>>     <xsd:element name="B">
>>       <xsd:complexType/>
>>     </xsd:element>
>>   </xsd:sequence>
>> </xsd:complexType>
>>
>> After <A> comes </A>, and at that point, available event types
>> in the grammar are CH_[untyped value] and SE(B).
This situation
>> satisfies the two conditions you specified. Therefore, encoders
>> at this point are required to insert an empty CH event. However,
>> since CH_[untyped value] is from mixed content type, encoders
>> should not be required to insert an empty CH event there.
>>
>> Thank you,
>>
>> Takuki Kamiya
>> Fujitsu Laboratories of America
>>
>>
>> -----Original Message-----
>> From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
>> Sent: Friday, December 04, 2015 4:48 AM
>> To: Takuki Kamiya; public-exi@w3.org
>> Subject: AW: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Taki,
>>
>> I am not sure if I understand the issue.
>>
>> Can you provide an example?
>>
>> Thanks,
>>
>>
>> -- Daniel
>>
>>
>> ________________________________
>> From: Takuki Kamiya [tkamiya@us.fujitsu.com]
>> Sent: Thursday, 3 December 2015 20:51
>> To: Peintner, Daniel (ext); public-exi@w3.org
>> Subject: RE: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Daniel,
>>
>> Since characters matching CH_[untyped value] are represented using String
>> codec, the second bullet description still appears to include cases of
>> mixed content characters.
>>
>> Right?
>>
>>
>> Takuki Kamiya
>> Fujitsu Laboratories of America
>>
>>
>> -----Original Message-----
>> From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
>> Sent: Thursday, December 03, 2015 7:54 AM
>> To: Takuki Kamiya; public-exi@w3.org
>> Subject: AW: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Taki,
>>
>>
>>>> * if processing the current XML Information item fails by means of existing
>>>> event codes of length 1 (i.e., no EE or SE event exists), and
>>>
>>> Does this include a situation in which you are trying to encode an infoset SE
>>> event, but the current grammar does not contain production of SE event type?
>>
>> The "no SE events exists" in parentheses is wrong and should be excluded.
>>
>> I just modified the first part of the rule (but added both parts again for readability)
>>
>> --------------
>> A canonical EXI processor MUST add a CH event with a String of length 0 (zero)
>>
>> * if processing the EE event of an XML Information item fails by means of existing
>> event codes of length 1 (i.e., no EE exists), and
>>
>> * when processing a schema-informed grammar where a CH event code of length 1 exists with
>> Built-in EXI Datatype Representation "Binary" (exi:base64Binary and exi:hexBinary),
>> "String", "List" or an Enumeration with an empty item.
>>
>> --------------
>>
>> Hope this clarifies your point!
>> Thank you for your question.
>>
>> Thanks,
>>
>> -- Daniel
>>
>>
>> ________________________________
>> From: Takuki Kamiya [tkamiya@us.fujitsu.com]
>> Sent: Friday, 20
November 2015 00:58
>> To: Peintner, Daniel (ext); public-exi@w3.org
>> Subject: RE: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Daniel,
>>
>>
>>> * if processing the current XML Information item fails by means of existing
>>> event codes of length 1 (i.e., no EE or SE event exists), and
>>
>> Does this include a situation in which you are trying to encode an infoset SE
>> event, but the current grammar does not contain production of SE event type?
>>
>> I am not sure if you want to insert an empty CH at that point. Does doing so
>> help the process of encoding?
>>
>> Thank you,
>>
>> Takuki Kamiya
>> Fujitsu Laboratories of America
>>
>>
>> -----Original Message-----
>> From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
>> Sent: Wednesday, November 18, 2015 8:39 AM
>> To: Takuki Kamiya; public-exi@w3.org
>> Subject: AW: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Taki, all,
>>
>> Thank you for your reply and your valuable comments.
>>
>> I updated the proposal to incorporate your feedback. Also, the description now states the intent and lists the rules again.
>>
>>
>> --->
>>
>> In general, Canonical EXI MUST NOT change the sequence of XML information items. However, the XML Infoset in some rare cases (e.g., due to API characteristics) may miss "Character Information items" such as strings with the number of characters equal to 0 (zero). EXI encoding may also fail without such an "empty" character information item (e.g., strict schema-informed streams that state the requirement of an expected character string - even if empty).
>>
>> Hence, Canonical EXI aims for adding an "empty" character information item if the intent requires it (e.g., expected character string) and not for any other use case (e.g., mixed content).
>>
>> That said, a canonical EXI processor MUST add a CH event with a String of length 0 (zero)
>> * if processing the current XML Information item fails by means of existing event codes
>> of length 1 (i.e., no EE or SE event exists), and
>> * when processing a schema-informed grammar where a CH event code of length 1 exists with
>> Built-in EXI Datatype Representation "Binary" (exi:base64Binary and exi:hexBinary),
>> "String", "List" or an Enumeration with an empty item.
>>
>> In all other cases no further events MUST be added.
>> <---
>>
>> What do you think?
>> Do you have any updates/proposals?
>>
>> Thanks,
>>
>> -- Daniel
>>
>>
>> ________________________________
>> From: Takuki Kamiya [tkamiya@us.fujitsu.com]
>> Sent: Wednesday, 11 November 2015 22:40
>> To: Peintner, Daniel (ext); public-exi@w3.org
>> Subject: RE: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Daniel,
>>
>> In schema-informed context, CH event-type with event-code length 1 comes from
>> two different schema constructs. One is from simple type content, the other is
>> from mixed-content.
>>
>> For CH event types that came from mixed-content, there is no need for inserting
>> an empty CH event. Therefore, I would suggest excluding mixed-content CH event
>> types from the rule you described below.
>>
>> You listed three EXI datatype representations (i.e. Binary, String and List) as
>> applicable to the described empty CH event insertion rule. I would like to point
>> out that enumerated values where one of the values is an empty string (i.e. "")
>> should also apply. In other words, in all contexts where the EXI datatype
>> representation associated with the current CH event type allows for an empty CH,
>> an empty CH event should be inserted.
>>
>> Thanks,
>>
>> taki
>>
>>
>> -----Original Message-----
>> From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
>> Sent: Wednesday, November 11, 2015 5:08 AM
>> To: Takuki Kamiya; public-exi@w3.org
>> Subject: AW: Support for Canonical EXI interoperability test in TTFMS
>>
>> All,
>>
>> According to yesterday's telecon I explored the empty CH("") event a bit further.
>>
>> There are various situations when an empty CH could be added. One rather obvious case is a schema-informed stream that states the requirements of an expected character string (even if the string is empty). However, also in schema-less mode one could assume that a previously "learned" CH event could mean that a CH is expected even if it is not there...
>>
>> Summarizing I would like to propose the following requirement/addition to the Canonical EXI document.
>>
>> --->
>> The XML Infoset in some rare cases (e.g., due to API characteristics) may miss "Character Information items" such as strings with the number of characters equal to 0 (zero). That said, EXI encoding may also fail without such an "empty" character information item. Hence, a canonical EXI processor MUST add a CH event with a String of length 0 (zero), if not already there, when in a schema-informed grammar where a CH event code of length 1 exists with Built-in EXI Datatype Representation "Binary" (exi:base64Binary and exi:hexBinary), "String" or "List". The availability of such a CH event in the grammar clearly states the intent, in this case the requirement of empty characters. In all other cases no further events MUST be added.
>> <---
>>
>> What do people think?
>>
>> Thanks,
>>
>> -- Daniel
>>
>>
>> ________________________________
>> From: Peintner, Daniel (ext) [daniel.peintner.ext@siemens.com]
>> Sent: Monday, 9
November 2015 17:06
>> To: Takuki Kamiya; public-exi@w3.org
>> Subject: AW: Support for Canonical EXI interoperability test in TTFMS
>>
>> Taki, all,
>>
>> we looked into the issue more closely and found the following issues.
>>
>> 1. How to deal with conflicting framework options
>>
>> The framework (or the associated test cases) may define conflicting parameters (e.g., preserve processing instructions and strict). In such a situation an EXI processor may decide whether to use non-strict encoding to support processing instructions or to eliminate PI support.
>>
>> As it turns out the EXI processors (OpenEXI and EXIficient) tend to use different strategies. That said, both strategies are OK. Hence, I think we need to make the framework aware of such a situation so that the framework decides what is the desired result.
>>
>>
>> 2. Empty CH("") events
>>
>> An XML schema may define an element as follows
>> <xs:element name="foo" type="xs:string"/>
>>
>> A valid instance may look as follows.
>>
>> <foo></foo>
>>
>> Depending on the EXI options and the mode (strict vs. non-strict) the following two EXI streams are possible
>>
>> SE(foo) EE(foo) --> applicable in non-strict only
>> SE(foo) CH("") EE(foo) --> applicable in strict and non-strict
>>
>> Again, we need to ensure all Canonical EXI processors behave the same.
>> Hence, I would argue for the latter case given that it is usable in both (strict and non-strict) scenarios but I am open to other ideas/thoughts.
>>
>> 3. Whitespace handling
>>
>> I wonder whether we need to define whitespace preservation rules in Canonical EXI similar to the TTFMS framework rules.
>>
>>
>> Thanks,
>>
>> -- Daniel
>>
>>
>> ________________________________
>> From: Takuki Kamiya [tkamiya@us.fujitsu.com]
>> Sent: Wednesday, 21
October 2015 00:05
>> To: Peintner, Daniel (ext); public-exi@w3.org
>> Subject: RE: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Daniel,
>>
>> I fixed a bug in the TTFMS framework.
>>
>> Next time you compile the framework and run the test,
>> you will be able to see schema-informed EXI files generated
>> when the test case provides one and schema use is enabled.
>>
>> Thank you,
>>
>> Takuki Kamiya
>> Fujitsu Laboratories of America
>>
>>
>> -----Original Message-----
>> From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
>> Sent: Wednesday, October 14, 2015 6:50 AM
>> To: Takuki Kamiya; public-exi@w3.org
>> Subject: AW: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Taki,
>>
>> I uploaded a revised EXIficient library but I agree, I do still see some issues.
>> (in my test run 20 files out of 115 are still different)
>>
>> Maybe this has to do with whitespace handling (will send separate email...)
>>
>> Moreover, I am currently able to run schema-less test runs only by calling
>> ant run-iot-c14n-classes -DtestCases=config/testCases-restricted/all-v1.xml
>>
>> Maybe someone can point me to the configuration how to call schema-informed test runs or byteAligned test runs to facilitate debugging.
>>
>> Thanks,
>>
>> -- Daniel
>>
>>
>> ________________________________
>> From: Takuki Kamiya [tkamiya@us.fujitsu.com]
>> Sent: Tuesday, 13 October 2015 03:23
>> To: Peintner, Daniel (ext); public-exi@w3.org
>> Subject: RE: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Daniel,
>>
>> I also modified the openexi driver so that it always outputs header options.
>>
>> However, I still see many differences between exificient and openexi
>> outputs. We will need to further investigate this.
>>
>> Thank you,
>>
>> Takuki Kamiya
>> Fujitsu Laboratories of America
>>
>>
>> -----Original Message-----
>> From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
>> Sent: Thursday, October 01, 2015 5:50 AM
>> To: Takuki Kamiya; public-exi@w3.org
>> Subject: AW: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Taki,
>>
>> Thank you for pointing me to the parameter "measure" which indicates the type of the test run.
>>
>> I also uploaded a first snapshot of the EXIficient library supporting Canonical EXI. Additional updates may be necessary.
>> When comparing the encoded files with OpenEXI I do see mostly diffs. I think it is because OpenEXI at the moment does not always include the EXI Options.
>>
>> Please let me know if you encounter other issues.
>>
>> Thanks,
>>
>> -- Daniel
>>
>> ________________________________
>> From: Takuki Kamiya [tkamiya@us.fujitsu.com]
>> Sent: Thursday, 1 October 2015 01:45
>> To: Peintner, Daniel (ext); public-exi@w3.org
>> Subject: RE: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Daniel,
>>
>> You should be able to get the test mode by accessing:
>> measure field (of class MeasureParam) that is in _driverParams (of class DriverParameters)
>>
>> When it is iot_c14n_encode, you should change the behavior of the
>> processor to comply with c14n rules.
>>
>> Do you plan to check in a new EXIficient jar to TTFMS soon?
>>
>> Thank you,
>>
>> Takuki Kamiya
>> Fujitsu Laboratories of America
>>
>>
>> -----Original Message-----
>> From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
>> Sent: Wednesday, September 30, 2015 8:28 AM
>> To: Takuki Kamiya; public-exi@w3.org
>> Subject: AW: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi Taki,
>>
>> I did check out the new code and it worked as expected.
>> Thank you for your work!
>>
>> The only thing I miss is a testCase option that informs about whether the EXI processor is required to produce canonical EXI.
>>
>> Did I miss anything in that regard?
>>
>> Thanks,
>>
>> -- Daniel
>>
>>
>>
>> P.S. EXIficient does not sort attributes in schema-less mode
>>
>>
>> ________________________________
>> From: Takuki Kamiya [tkamiya@us.fujitsu.com]
>> Sent: Tuesday, 29 September 2015 02:20
>> To: public-exi@w3.org
>> Subject: Support for Canonical EXI interoperability test in TTFMS
>>
>> Hi,
>>
>> I added support for Canonical EXI interoperability test in TTFMS.
>>
>> You need to invoke target "run-iot-c14n-classes" in order to run the
>> encoding process.
>>
>> After that, diff tools such as WinMerge (on Windows) can be used to
>> compare the encoded files output by various implementations.
>>
>> Initial experimental run showed quite a lot of differences in encodings
>> between EXIficient and OpenEXI.
>>
>> I found at least some of the diffs are due to the attribute orders in
>> schema-less setting. Is it true that EXIficient sorts attributes whether
>> it is schema-less or schema-informed?
>>
>> Thank you,
>>
>> Takuki Kamiya
>> Fujitsu Laboratories of America
>>
>>
>
>
> AgileDelta, Inc.
> john.schneider@agiledelta.com
> http://www.agiledelta.com
> w: 425-644-7122
> m: 425-503-3403
> f: 425-644-7126

AgileDelta, Inc.
john.schneider@agiledelta.com
http://www.agiledelta.com
w: 425-644-7122
m: 425-503-3403
f: 425-644-7126
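
For readers following the thread, the simplified rule proposed in the reply at the top of this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from any EXI processor: the `Production` class, the `canonical_empty_element` function, and the three sample grammars are hypothetical names invented here, and the fallback to an explicit empty CH event when no EE production exists is taken from the earlier discussion in the thread rather than from the proposed wording itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    """One production of the current element grammar (hypothetical model)."""
    event: str        # event type, e.g. "EE", "CH" or "SE(B)"
    code_length: int  # number of parts in the event code (1 or 2 here)


def has_ee(productions):
    """True if the grammar contains a production of the form LeftHandSide : EE."""
    return any(p.event == "EE" for p in productions)


def canonical_empty_element(productions):
    """Events to emit right after SE for an element with no content.

    Simplified rule: if any EE production is available, use it.
    Assumption: otherwise an explicit empty CH event is needed to reach EE.
    """
    if has_ee(productions):
        return ["EE"]
    return ['CH("")', "EE"]


# strict=true, xs:string content: only CH with a code of length 1, no EE yet.
strict_string = [Production("CH", 1)]

# strict=false: same grammar plus the added EE production with a code of length 2.
non_strict_string = [Production("CH", 1), Production("EE", 2)]

# strict=true, grammar that already offers EE with a code of length 1.
strict_empty = [Production("EE", 1)]
```

With these sample grammars, the strict xs:string case is the only one that needs the explicit empty CH event; the other two go straight to EE, which is the behaviour the simplified wording asks for.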
Received on Wednesday, 23 December 2015 17:26:01 UTC