- From: Takuki Kamiya <tkamiya@us.fujitsu.com>
- Date: Mon, 4 Feb 2013 14:06:40 -0800
- To: "public-exi@w3.org" <public-exi@w3.org>
Hi,

The EXI WG recently found that certain complex types, once fully augmented using the process described in section 8.5.4.4.1, result in grammars that are slightly different from those generated by actual EXI 1.0 implementations. The discrepancy was first discovered for the ur-type grammar; however, the condition under which the issue becomes evident was identified to be potentially all complex types that have an attribute wildcard.

The WG analyzed the finding and agreed that the implementations are actually generating the grammars that the WG had originally expected the spec to describe. The WG therefore regards this as an erratum in the specification and would like to fix the error.

Technical details regarding the issue, and the way the EXI Working Group intends to fix it, are described below. Please let us know as soon as possible if you have any comments or questions.

Sincerely,

Daniel Peintner, Takuki Kamiya for the EXI Working Group

-------------------------------------------------------------------------------

Section 8.5.4.1.3.3 "Complex Ur-Type Grammar" prescribes a grammar for ur-type as follows.

Type_ur-type,0 :
    AT(*) Type_ur-type,0
    SE(*) Type_ur-type,1
    EE
    CH Type_ur-type,1

Type_ur-type,1 :
    SE(*) Type_ur-type,1
    EE
    CH Type_ur-type,1

The last paragraph of the section also says that the content index used for the above grammar is 1. Shown below is the grammar obtained by applying the rule described in 8.5.4.4.1 "Adding Productions when Strict is False" to the above grammar, assuming the default preservation options (i.e. no PI, no CM, etc.).

Type_ur-type,0 :
    AT(*) Type_ur-type,0
    SE(*) Type_ur-type,1
    EE
    CH Type_ur-type,1
    AT(xsi:type) Type_ur-type,0
    AT(xsi:nil) Type_ur-type,0
    AT(*) Type_ur-type,0
    AT(*) [untyped value] Type_ur-type,0
    SE(*) Type_ur-type,1copy
    CH [untyped value] Type_ur-type,1copy

Type_ur-type,1 :
    SE(*) Type_ur-type,1
    EE
    CH Type_ur-type,1
    AT(*) Type_ur-type,1
    AT(*) [untyped value] Type_ur-type,1
    SE(*) Type_ur-type,1copy
    CH [untyped value] Type_ur-type,1copy

Type_ur-type,1copy :
    SE(*) Type_ur-type,1copy
    EE
    CH Type_ur-type,1copy
    SE(*) Type_ur-type,1copy
    CH [untyped value] Type_ur-type,1copy

Say you have an XML fragment of the form:

    <A>abc</A>

where the element A is typed xsd:anyType in the schema. According to the above grammar, upon the characters "abc", the state moves from Type_ur-type,0 to Type_ur-type,1 if the 4th production in Type_ur-type,0 is used. Because "abc" is character data in an element, no attributes can appear for the element after that point. However, the grammar Type_ur-type,1, which is now the current state, can still accept attributes.

This is not the effect that section 8.5.4.1.3.3 was intended to have. The specification needs to be repaired to address this discrepancy between the current description and its intended effect.

Given that the spec is already clear in Section "8.5.4.1.3 Type Grammars" about how grammars are built, the WG is inclined to clarify it as follows:

* Remove Section "8.5.4.1.3.3 Complex Ur-Type Grammar" and all references to it
* Change the fourth paragraph in Section "8.5.4.1.3 Type Grammars" from

"Sections 8.5.4.1.3.1 Simple Type Grammars and 8.5.4.1.3.2 Complex Type Grammars describe the processes for creating Type_i and TypeEmpty_i from XML Schema simple type definitions [XS1] and complex type definitions [XS1] defined in schemas as well as built-in primitive types [XS2], built-in derived types [XS2] and simple ur-type [XS2] defined by XML Schema specification [XML Schema Datatypes]. Section 8.5.4.1.3.3 Complex Ur-Type Grammar defines the grammar used for processing instances of element contents of type xsd:anyType [XS1]."

to

"Sections 8.5.4.1.3.1 Simple Type Grammars and 8.5.4.1.3.2 Complex Type Grammars describe the processes for creating Type_i and TypeEmpty_i from XML Schema simple type definitions [XS1] and complex type definitions [XS1] defined in schemas as well as built-in primitive types [XS2], built-in derived types [XS2], simple ur-type [XS2] and complex ur-type [XS2] defined by XML Schema specification [XML Schema Datatypes]."

In addition, Section "8.5.4.1.3.2 Complex Type Grammars" needs the following change to correctly handle cases such as the example in the problem statement above.

Currently, if an {attribute wildcard} is specified, an extra attribute use grammar G_n−1 of the following form is added.

G_n−1,0 :
    EE

Change the form of the above grammar to:

G_n−1,0 :
    EE
G_n−1,1 :
    EE

Just before the first note in the section, add the following rule.

If there is neither an attribute use nor an {attribute wildcard}, G_0 of the following form is used as an attribute use grammar.

G_0,0 :
    EE

-------------------------------------------------------------------------------
END
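
The problem statement above can also be checked mechanically. What follows is only a minimal sketch in Python, not part of the specification or of any EXI implementation: it transcribes the augmented ur-type grammar from this message into a plain dictionary and confirms that the state reached after the CH production still offers attribute productions. The names GRAMMAR, next_state and accepts_attributes are hypothetical helpers introduced purely for illustration.

```python
# Augmented ur-type grammar as listed in this message (strict=false,
# default preservation options). Each production is (event, next_state);
# a next_state of None means the element ends.
# Illustrative encoding only; the names here are assumptions, not spec terms.
GRAMMAR = {
    "Type_ur-type,0": [
        ("AT(*)", "Type_ur-type,0"),
        ("SE(*)", "Type_ur-type,1"),
        ("EE", None),
        ("CH", "Type_ur-type,1"),
        ("AT(xsi:type)", "Type_ur-type,0"),
        ("AT(xsi:nil)", "Type_ur-type,0"),
        ("AT(*)", "Type_ur-type,0"),
        ("AT(*) [untyped value]", "Type_ur-type,0"),
        ("SE(*)", "Type_ur-type,1copy"),
        ("CH [untyped value]", "Type_ur-type,1copy"),
    ],
    "Type_ur-type,1": [
        ("SE(*)", "Type_ur-type,1"),
        ("EE", None),
        ("CH", "Type_ur-type,1"),
        ("AT(*)", "Type_ur-type,1"),
        ("AT(*) [untyped value]", "Type_ur-type,1"),
        ("SE(*)", "Type_ur-type,1copy"),
        ("CH [untyped value]", "Type_ur-type,1copy"),
    ],
    "Type_ur-type,1copy": [
        ("SE(*)", "Type_ur-type,1copy"),
        ("EE", None),
        ("CH", "Type_ur-type,1copy"),
        ("SE(*)", "Type_ur-type,1copy"),
        ("CH [untyped value]", "Type_ur-type,1copy"),
    ],
}


def next_state(state, production_number):
    """Return the state reached via the given 1-based production."""
    return GRAMMAR[state][production_number - 1][1]


def accepts_attributes(state):
    """True if any production in the state starts with an AT event."""
    return any(event.startswith("AT(") for event, _ in GRAMMAR[state])


# <A>abc</A>: the characters "abc" use the 4th production of Type_ur-type,0.
after_chars = next_state("Type_ur-type,0", 4)
print(after_chars, accepts_attributes(after_chars))
```

Under this encoding, the state reached after the character data is Type_ur-type,1, which still lists AT(*) productions, whereas Type_ur-type,1copy does not. That is the discrepancy the proposed change is meant to remove.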
Received on Monday, 4 February 2013 22:07:19 UTC