Re: Question on "Adding Productions when Strict is False"

Hi Taki,

Thank you for the explanation!
As simple as it is, I really missed the point in the description for
adding the AT(qname-n) [untyped value] and AT(*) [untyped value]
productions. I was wrongly assuming that an event code with three
parts always has the highest value in its second part and first part. This is
valid for the second part and for all other grammars and even in the
introductory examples in 6.2. Well, basically everywhere in the
specification except for this particular case.

I was using this assumption to take some shortcuts in my implementation
and was so convinced that it holds for all codes that I did not pay
any attention to the indexes, which clearly indicate that AT(*)
[untyped value] has code 1.3.0 in this case.
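
To spell out where my assumption broke, here is a minimal Python sketch of the numbering. It is not taken from EXIficient or OpenEXI; the function name assign_event_codes and the list-based grammar model are hypothetical and only illustrate the test-0 case from this thread, where no attributes are declared (x = 0), so the three-part group contains only AT(*) [untyped value].

```python
def assign_event_codes(second_level, first_part=1):
    """Assign event codes to the productions that share one first-level part.

    Each entry is either a label (takes its own second-level slot) or a
    list of labels (a group that shares one second-level slot and is
    distinguished by a third part).
    """
    codes = {}
    for slot, entry in enumerate(second_level):
        if isinstance(entry, list):
            for third, label in enumerate(entry):
                codes[label] = f"{first_part}.{slot}.{third}"
        else:
            codes[entry] = f"{first_part}.{slot}"
    return codes


# test-0 as corrected by Taki: the untyped-attribute group sits between
# AT(*) and NS and consumes a second-level slot of its own.
codes = assign_event_codes([
    "AT(xsi:type)",
    "AT(xsi:nil)",
    "AT(*)",
    ["AT(*) [untyped value]"],
    "NS",
    "SE(*)",
    "CH [untyped value]",
])

# My mistaken reading: the group left out of the second-level numbering.
mistaken = assign_event_codes([
    "AT(xsi:type)",
    "AT(xsi:nil)",
    "AT(*)",
    "NS",
    "SE(*)",
    "CH [untyped value]",
])

print(codes["AT(*) [untyped value]"], codes["NS"], mistaken["NS"])
```

With the group counted, NS moves from 1.3 to 1.4, which is the code that both implementations produce.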

Thank you again!

Best regards,
Rumen

On Thu, Nov 15, 2012 at 7:58 PM, Takuki Kamiya <tkamiya@us.fujitsu.com> wrote:
> Hi Rumen,
>
> Looking at the grammar test-0 that you presented, there is one production
> missing just before the one for NS.
>
> The correct grammar should look like the following.
>
> test-0:
>            EE                                0
>            AT(xsi:type)  test-0              1.0
>            AT(xsi:nil)  test-0               1.1
>            AT(*)  test-0                     1.2
>            AT(*) [untyped value]  test-0     1.3.0
>            NS  test-0                        1.4
>            SE(*)  content2                   1.5
>            CH [untyped value]  content2      1.6
>
> Section "8.5.4.4.1 Adding Productions when Strict is False" requires that
> you always add AT(*) [untyped value] in tandem with AT(*).
>
> Given the above grammar, the event code for the event type NS ought
> to be 1.4.
>
> So I think it means EXIficient and OpenEXI interoperate in the right way.
>
> Regards,
>
> taki
>
>
>
> -----Original Message-----
> From: Rumen Kyusakov [mailto:kjussakov@gmail.com]
> Sent: Thursday, November 15, 2012 7:15 AM
> To: public-exi@w3.org
> Subject: Re: Question on "Adding Productions when Strict is False"
>
> Hi Daniel,
>
> Many thanks for your response. I'm not sure if this mailing list is
> the right place to discuss this, and I wouldn't if I weren't convinced
> there is some issue here.
> I understand that each implementation has its own strategy to handle
> the grammar representation - my issue is that it seems to me that
> EXIficient picks the wrong event code in a certain case because of
> that "AT(invalid)" production. I know I must be wrong because
> OpenEXI is able to decode the EXI stream produced by EXIficient. I
> just don't understand why, although I went through the spec many times.
> Please consider the following very simple use case:
> You have xml:
> <?xml version="1.0" encoding="UTF-8"?>
> <test xmlns="http://ns-test"/>
>
> and schema:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
> targetNamespace="http://ns-test" elementFormDefault="qualified">
> <xs:element name="test">
>   <xs:complexType/>
> </xs:element>
> </xs:schema>
>
> During schema-enabled encoding with Preserve.prefixes (all the rest of
> the options have default values):
> EXIficient encodes the NS event with event code 1.4
>
> ....
> encoder.encodeAttributeList(exiAttributes); // SAXEncoder, line 127
> currentRule.get2ndLevelEventCode(EventType.NAMESPACE_DECLARATION,
> fidelityOptions); // AbstractEXIBodyEncoder, line 360
> encode2ndLevelEventCode(ec2); // AbstractEXIBodyEncoder, line 363
> ....
>
> Looking at the spec the grammar and event codes should look like this:
>
> test-0:
>            EE                                0
>            AT(xsi:type)  test-0              1.0
>            AT(xsi:nil)  test-0               1.1
>            AT(*)  test-0                     1.2
>            NS  test-0                        1.3
>            SE(*)  content2                   1.4
>            CH [untyped value]  content2      1.5
>
> The NS event should be encoded with 1.3 according to the spec.
>
> Where am I going wrong?
>
> Thanks in advance!
>
> Best regards,
> Rumen
>
>
> On Thu, Nov 15, 2012 at 9:14 AM, Peintner, Daniel (ext)
> <daniel.peintner.ext@siemens.com> wrote:
>> Hi Rumen,
>>
>>> Element-i,0 : AT (*) [untyped] Element-i,0 n.m+2)
>>>
>>> It is inserted before "NS Element-i,0   next n.m" production and after
>>> the "AT (*) Element-i,0    next n.m"
>>>
>>> According to my understanding of the specification this production
>>> has three parts [...]
>>
>> Your understanding is correct.
>>
>>> However, when looking at the EXIficient implementation:
>>> in the SchemaInformedFirstStartTag class, methods getNumberOf2ndLevelEvents()
>>> and get2ndLevelEventCode(), the code includes one more production with
>>> event code with 2 parts:
>>>
>>> In the source code this extra production is referred in the comments
>>> as "AT(invalid)."
>>
>> Like in most specifications there are different strategies to actually implement a certain behaviour. This is also the case for the source code you cited.
>>
>> EXIficient creates an event on the second level that links to the available events on the third level. This second level event code part is never meant to be encoded without a subsequently following third level event code part.
>>
>> However, everyone is free to choose another strategy as long as the result matches the specification.
>>
>> Hope this helps,
>>
>> -- Daniel
>>
>> -----Original Message-----
>> From: Rumen Kyusakov [mailto:kjussakov@gmail.com]
>> Sent: Wednesday, October 24, 2012 11:27 AM
>> To: public-exi@w3.org
>> Subject: Question on "Adding Productions when Strict is False"
>>
>> Dear all,
>>
>> I have a question regarding encoding of event codes in schema mode when strict is FALSE.
>> According to my understanding of
>> http://www.w3.org/TR/2011/REC-exi-20110310/#addingProductions the second level productions (the productions with event codes with 2 parts)
>> for the first grammar rule of schema-derived grammars are:
>>
>> Element-i,0 : EE                            n.m       // Only if not available already with shorter event code
>>               AT(xsi:type) Element-i,0      next n.m
>>               AT(xsi:nil) Element-i,0       next n.m
>>               AT (*) Element-i,0            next n.m
>>               NS Element-i,0                next n.m  // If NS are preserved
>>               SC Fragment                   next n.m  // If SC are preserved
>>               SE (*) Element-i,c2           next n.m
>>               CH [untyped] Element-i,c2     next n.m
>>               ER Element-i,c2               next n.m  // If ER are preserved
>>
>> However, when looking at the EXIficient implementation:
>> in the SchemaInformedFirstStartTag class, methods getNumberOf2ndLevelEvents() and get2ndLevelEventCode(), the code includes one more production with event code with 2 parts:
>>
>> Element-i,0 : AT (*) [untyped] Element-i,0 n.m+2)
>>
>> It is inserted before "NS Element-i,0   next n.m" production and after
>> the "AT (*) Element-i,0    next n.m"
>> In the source code this extra production is referred in the comments as "AT(invalid)."
>>
>> According to my understanding of the specification this production has a three-part event code, which is defined by the following fragment from the spec:
>>
>> For each non-terminal Element i, j , such that 0 ≤ j ≤ content , with zero or more productions of the following form:
>>
>> Element i, j :
>>         AT (qname 0 ) [schema-typed value] NonTerminal 0
>>         AT (qname 1 ) [schema-typed value] NonTerminal 1
>>             ⋮
>>         AT (qname x-1 ) [schema-typed value] NonTerminal x-1
>>
>> where x represents the number of attributes declared in the schema for this context, add the following productions:
>>
>> Element i, j :
>>         AT (*) Element i, j n.m
>>         AT (qname 0 ) [untyped value] NonTerminal 0 n.(m+1).0
>>         AT (qname 1 ) [untyped value] NonTerminal 1 n.(m+1).1
>>             ⋮     ⋮
>>         AT (qname x-1 ) [untyped value] NonTerminal x-1 n.(m+1).(x-1)
>>         AT (*) [untyped value] Element i, j n.(m+1).(x)
>>
>> where n.m represents the next available event code with length 2.
>>
>> The last production "AT (*) [untyped value] Element i, j n.(m+1).(x)" has three parts and not 2.
>>
>> Some tests with OpenEXI showed that the same extra production "AT (*) [untyped]" with event code with two parts is used as well.
>>
>> Can someone give me pointers on why we have this extra production?
>>
>> Best Regards,
>> Rumen
>>
>>
>

Received on Friday, 16 November 2012 10:44:53 UTC