- From: Takuki Kamiya <tkamiya@us.fujitsu.com>
- Date: Mon, 4 Feb 2013 14:06:40 -0800
- To: "public-exi@w3.org" <public-exi@w3.org>
Hi,
The EXI WG recently found that, for certain complex types, the grammars
obtained by augmenting them with the process described in section 8.5.4.4.1
are slightly different from those generated by actual EXI 1.0
implementations.
The discrepancy was first discovered for the ur-type grammar; however, the
issue can potentially become evident for any complex type that has an
attribute wildcard.
The WG analyzed the finding and agreed that the implementations are actually
generating the grammars that the WG had originally expected the spec to
produce. Therefore, the WG sees this as an erratum of the specification,
and would like to fix the error.
Technical details regarding the issue, and the way the EXI Working Group intends
to fix it, are described below.
Please let us know if there are any comments or questions as soon as
possible.
Sincerely,
Daniel Peintner, Takuki Kamiya
for the EXI Working Group
-------------------------------------------------------------------------------
Section 8.5.4.1.3.3 "Complex Ur-Type Grammar" prescribes a grammar for
ur-type as follows.
Type_ur-type,0 :
AT(*) Type_ur-type,0
SE(*) Type_ur-type,1
EE
CH Type_ur-type,1
Type_ur-type,1 :
SE(*) Type_ur-type,1
EE
CH Type_ur-type,1
The last paragraph of the section also says that the content index used for
the above grammar is 1.
Shown below is the grammar you can get by applying the rule described in
8.5.4.4.1 "Adding Productions when Strict is False" to the above grammar,
assuming the default preservation option (i.e. no PI, no CM, etc.).
Type_ur-type,0 :
AT(*) Type_ur-type,0
SE(*) Type_ur-type,1
EE
CH Type_ur-type,1
AT(xsi:type) Type_ur-type,0
AT(xsi:nil) Type_ur-type,0
AT(*) Type_ur-type,0
AT(*) [untyped value] Type_ur-type,0
SE(*) Type_ur-type,1copy
CH [untyped value] Type_ur-type,1copy
Type_ur-type,1 :
SE(*) Type_ur-type,1
EE
CH Type_ur-type,1
AT(*) Type_ur-type,1
AT(*) [untyped value] Type_ur-type,1
SE(*) Type_ur-type,1copy
CH [untyped value] Type_ur-type,1copy
Type_ur-type,1copy :
SE(*) Type_ur-type,1copy
EE
CH Type_ur-type,1copy
SE(*) Type_ur-type,1copy
CH [untyped value] Type_ur-type,1copy
Suppose you have an XML fragment of the form:
<A>abc</A>
where the element A is typed xsd:anyType in the schema.
According to the above grammar, on the characters "abc" the state moves
from Type_ur-type,0 to Type_ur-type,1 when the 4th production in Type_ur-type,0
is used.
Because "abc" is character data inside an element, no attribute of that element
can occur after that point. However, Type_ur-type,1, the grammar the state has
now moved to, still accepts attributes. This is not the effect that section
8.5.4.1.3.3 was intended to have. The specification needs to be repaired to
remove this discrepancy between the current description and its intended
effect.
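The walk described above can be reproduced with a small Python sketch. It
transcribes the augmented grammar listed earlier in this message into a
dictionary and follows the CH production from Type_ur-type,0. The dictionary
encoding and the helper names step and accepts_attributes are illustrative
assumptions only; they are not part of the spec, and event codes are ignored.

```python
# Augmented ur-type grammar, transcribed from the listing in this message.
# Each production is (event, next_state); next_state is None for EE.
GRAMMAR = {
    "Type_ur-type,0": [
        ("AT(*)", "Type_ur-type,0"),
        ("SE(*)", "Type_ur-type,1"),
        ("EE", None),
        ("CH", "Type_ur-type,1"),
        ("AT(xsi:type)", "Type_ur-type,0"),
        ("AT(xsi:nil)", "Type_ur-type,0"),
        ("AT(*)", "Type_ur-type,0"),
        ("AT(*) [untyped value]", "Type_ur-type,0"),
        ("SE(*)", "Type_ur-type,1copy"),
        ("CH [untyped value]", "Type_ur-type,1copy"),
    ],
    "Type_ur-type,1": [
        ("SE(*)", "Type_ur-type,1"),
        ("EE", None),
        ("CH", "Type_ur-type,1"),
        ("AT(*)", "Type_ur-type,1"),
        ("AT(*) [untyped value]", "Type_ur-type,1"),
        ("SE(*)", "Type_ur-type,1copy"),
        ("CH [untyped value]", "Type_ur-type,1copy"),
    ],
    "Type_ur-type,1copy": [
        ("SE(*)", "Type_ur-type,1copy"),
        ("EE", None),
        ("CH", "Type_ur-type,1copy"),
        ("SE(*)", "Type_ur-type,1copy"),
        ("CH [untyped value]", "Type_ur-type,1copy"),
    ],
}


def step(state, event):
    """Return the target of the first production in `state` matching `event`."""
    for candidate, next_state in GRAMMAR[state]:
        if candidate == event:
            return next_state
    raise ValueError(f"{state} has no production for {event}")


def accepts_attributes(state):
    """True if any production in `state` is an attribute (AT) production."""
    return any(event.startswith("AT") for event, _ in GRAMMAR[state])


# <A>abc</A>: the characters "abc" match the 4th production of Type_ur-type,0.
state_after_ch = step("Type_ur-type,0", "CH")
print(state_after_ch)                             # Type_ur-type,1
print(accepts_attributes(state_after_ch))         # True: attributes still accepted
print(accepts_attributes("Type_ur-type,1copy"))   # False
```

Only Type_ur-type,1copy is free of AT productions, while the state reached
through the schema-informed CH production is not.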
Given that the spec is already clear in Section "8.5.4.1.3 Type Grammars" about
how grammars are built, the WG is inclined to clarify it as follows:
* Remove Section "8.5.4.1.3.3 Complex Ur-Type Grammar" and references entirely
* Change the fourth paragraph in Section "8.5.4.1.3 Type Grammars" from
"Sections 8.5.4.1.3.1 Simple Type Grammars and 8.5.4.1.3.2 Complex Type Grammars
describe the processes for creating Type_i and TypeEmpty_i from XML Schema simple
type definitionsXS1 and complex type definitionsXS1 defined in schemas as well as
built-in primitive typesXS2, built-in derived typesXS2 and simple ur-typeXS2 defined by
XML Schema specification [XML Schema Datatypes]. Section 8.5.4.1.3.3 Complex Ur-Type
Grammar defines the grammar used for processing instances of element contents of
type xsd:anyTypeXS1."
to
"Sections 8.5.4.1.3.1 Simple Type Grammars and 8.5.4.1.3.2 Complex Type Grammars
describe the processes for creating Type_i and TypeEmpty_i from XML Schema simple
type definitionsXS1 and complex type definitionsXS1 defined in schemas as well as
built-in primitive typesXS2, built-in derived typesXS2, simple ur-typeXS2 and complex
ur-typeXS2 defined by XML Schema specification [XML Schema Datatypes]."
In addition, Section "8.5.4.1.3.2 Complex Type Grammars" needs the following change
to correctly handle cases such as the example given in the problem statement
above.
Currently, if an {attribute wildcard} is specified, an extra attribute use grammar
G_n-1 of the following form is added.
G_n−1,0 :
EE
Change the form of the above grammar to:
G_n−1,0 :
EE
G_n−1,1 :
EE
Just before the first note in the section, add the following rule.
If there is neither an attribute use nor an {attribute wildcard}, G_0 of the following form
is used as an attribute use grammar.
G_0,0 :
EE
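As a rough illustration, the two rules proposed above can be written as a
small selection function in Python. This is only a sketch of the wording in
this message: the function name, its parameters and the dictionary encoding
are hypothetical, and the case of attribute uses without a wildcard is left
out because this message does not change it.

```python
def proposed_attribute_use_grammar(has_attribute_uses, has_wildcard):
    """Return the attribute use grammar named by the proposed rules.

    The result maps a state label to its list of productions. Only the
    two cases stated in this message are covered; None means the case is
    outside the proposed change.
    """
    if has_wildcard:
        # Extra attribute use grammar G_n-1, now with two states.
        return {"G_n-1,0": ["EE"], "G_n-1,1": ["EE"]}
    if not has_attribute_uses:
        # Neither an attribute use nor an {attribute wildcard}: use G_0.
        return {"G_0,0": ["EE"]}
    return None  # attribute uses only: not addressed in this message


print(proposed_attribute_use_grammar(False, True))
print(proposed_attribute_use_grammar(False, False))
```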
-------------------------------------------------------------------------------
END
Received on Monday, 4 February 2013 22:07:19 UTC