RE: Universality of strict mode

Hi Antoine,

Thanks for your comment. The WG has reviewed it and noted that it touches
on some rather esoteric aspects of the format on which the WG had made
deliberate decisions. The outcomes of those decisions are reflected in the
version of the specification that you reviewed. In each case the WG worked
hard to find the best path among approaches that all had pros and cons.

The strict schema-informed grammar was conceived to help EXI serve data
exchanges involving extremely resource-constrained devices, including
sensors and commodity mobile handsets. In a few cases during the design of
the strict grammar, the WG decided to drop certain features that are
uncommon in use cases involving those devices, in return for maximizing
EXI's reach into the growing number of such devices, given that there are
alternative (sometimes even better) ways to achieve the same effect.

The WG compared the benefit of more rigorous support for QName values when
the strict option is on with the burden it would place on implementations
that run on those devices. The primary concern was what it would imply in
terms of implementation complexity. Even if a particular implementation
finds the amount of extra code tolerable (most likely one that runs on a
well-resourced device such as a PC), that assessment appeared too risky and
too sweeping to generalize across the broad range of implementations and
devices.

The WG also appreciates your suggestion to modify the QName value
representation when strict is on. We reviewed the suggested approach and
found the change difficult to make, because it would add implementation
complexity to all implementations: it would be a special behaviour for the
case when the strict option is on, in addition to the currently stipulated
behaviour. A concern about efficiency was also noted. When decoding EXI
streams that do not preserve prefixes into a proper sequence of events for
commonly used APIs such as SAX or StAX, a decoder would need to buffer the
events that precede the attribute (AT) and text (CH) events carrying
QName-typed values, so that it can insert the appropriate namespace
declaration events (i.e. the startPrefixMapping event) up front when
processing elements. This expected burden was considered to outweigh the
benefits, especially since there are often much more efficient alternatives
to QName values for identifying things, including numeric identifiers,
which are more amenable to those types of devices.
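
As a rough illustration of the buffering concern, here is a minimal sketch
in Python. The event tuples, the QNameValue class, the to_sax_like function
and the generated "nsN" prefixes are all made up for this example; they do
not come from the EXI specification or from any SAX implementation. The
sketch only shows that the start of an element has to be held back until
its QName-typed attribute and text values are known.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class QNameValue:
    """Hypothetical QName value decoded without prefixes: URI and local name only."""
    uri: str
    local: str


def to_sax_like(events: Iterable[tuple]) -> List[tuple]:
    """Map hypothetical decoder events to SAX-like callbacks.

    Input events: ("SE", name), ("AT", name, value), ("CH", value), ("EE",).
    """
    out: List[tuple] = []
    pending: List[tuple] = []  # the SE event and its AT events, held back
    open_elems: List[Tuple[str, List[str]]] = []  # (name, prefixes declared on it)
    fresh = itertools.count()  # source of made-up prefix numbers

    def flush(ch=None) -> None:
        """Emit the held-back element start, prefix mappings first."""
        if not pending:
            if isinstance(ch, QNameValue):
                # The start tag is already out; a real decoder would have
                # had to buffer even more to handle this case.
                raise ValueError("QName text seen after the element start was emitted")
            if ch is not None:
                out.append(("characters", ch))
            return
        prefixes: Dict[str, str] = {}  # uri -> prefix invented for this element

        def lexical(value):
            if isinstance(value, QNameValue):
                if value.uri not in prefixes:
                    prefixes[value.uri] = f"ns{next(fresh)}"
                return f"{prefixes[value.uri]}:{value.local}"
            return value

        name = pending[0][1]
        attrs = {ev[1]: lexical(ev[2]) for ev in pending[1:]}
        text = lexical(ch) if ch is not None else None
        for uri, prefix in prefixes.items():
            out.append(("startPrefixMapping", prefix, uri))
        out.append(("startElement", name, attrs))
        if text is not None:
            out.append(("characters", text))
        open_elems.append((name, list(prefixes.values())))
        pending.clear()

    for ev in events:
        kind = ev[0]
        if kind == "SE":
            flush()             # a parent element may still be held back
            pending.append(ev)
        elif kind == "AT":
            pending.append(ev)  # cannot be emitted before the mappings are known
        elif kind == "CH":
            flush(ev[1])
        elif kind == "EE":
            flush()
            name, prefixes = open_elems.pop()
            out.append(("endElement", name))
            for prefix in reversed(prefixes):
                out.append(("endPrefixMapping", prefix))
    return out
```

Even this simplified version has to hold every element start until the
first text event or child element arrives, and it gives up when QName text
follows a child element, which is where deeper buffering would be needed.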

The WG had discussed the implications of having xsi:nil ahead of xsi:type.
This was also found to have pros and cons. It was decided that, while
there is an elegance to that approach, it might add some complexity because
the two attributes have to be interpreted together (i.e. the final
destination grammar cannot be determined before taking the two together),
and that complexity might not be trivial for some classes of
implementations, such as ones that are compiled into executables from given
schemas. This decision was also informed by the observation that not
supporting xsi:nil and xsi:type together would not be critical for strict
mode, a point on which, as you indicated, we already seem to agree.
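
To make the "taking the two together" point concrete, here is a small
sketch in Python. The destination_grammar function and the attribute-list
representation are assumptions made for this example only; Type and
TypeEmpty are the grammar names used in your message. It shows that the
xsi:nil value has to be carried as extra state until xsi:type has been
seen.

```python
from typing import List, Tuple


def destination_grammar(attrs: List[Tuple[str, str]], declared_type: str) -> Tuple[str, str]:
    """Pick (type name, grammar kind) from attributes given in stream order.

    Hypothetical model: xsi:nil, if present, comes before xsi:type.
    """
    nil = False
    type_name = declared_type
    for name, value in attrs:
        if name == "xsi:nil":
            # Must be remembered; the grammar is not yet known at this point.
            nil = value in ("true", "1")
        elif name == "xsi:type":
            type_name = value
    # Only now, with both attributes considered, is the destination known.
    return (type_name, "TypeEmpty" if nil else "Type")
```

For a processor generated from a schema, this remembered flag means every
xsi:type transition has two possible targets instead of one.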

We hope the above explanation conveys the background of the decisions, and
that you understand the situation the WG found itself in when making them,
where the theoretical ideal and the reality were not as compatible as they
initially appeared.

Best regards,

-taki


-----Original Message-----
From: public-exi-comments-request@w3.org [mailto:public-exi-comments-request@w3.org] On Behalf Of Antoine Mensch
Sent: Friday, July 02, 2010 3:05 AM
To: public-exi-comments@w3.org
Subject: Universality of strict mode

Dear colleagues,

in the current version of the spec, the strict mode supports the
encoding and decoding of almost all schema-valid documents, except the
following ones (note that there might be other cases not identified here):
* Documents containing QNames as values, as the strict mode disallows
the use of preserve.prefixes and the current representation of QNames
without prefixes is meaningless.
* Documents containing elements featuring both xsi:type and xsi:nil at
the same time, when the referenced type does not contain an attribute
wildcard.

The first case unfortunately corresponds to a large number of use cases,
that we have to support in small embedded devices:
* Exchange of static metadata information: for instance XML Schema or
WSDL documents.
* Exchange of more dynamic metadata information through messages: for
instance WS-Discovery and WS-MetadataExchange messages used in the OASIS
DPWS (Devices Profile for Web Services) specification.
* SOAP 1.2 fault codes and subcodes.

It seems that this could be fixed in the spec without creating backward
incompatibility (as the feature is currently not supported), in one of
two ways:
1. Support preserve.prefixes in strict mode: This would require the
addition of a production containing the NS event in the undeclared
productions for strict=true. It has the drawback of increasing the event
code size of the first production in all element grammars, of adding
unnecessary prefix declarations in the stream (besides the necessary
ones), and also of requiring the support of dynamic addition of URIs in
the URI table.
2. Use an alternative representation of QName values in strict mode: the
proposal is to encode both the URI and the local name as String values
(and not URI for the first part). This will prevent the required update
of the URI table, and allow control of caching using the maxCapacity and
maxLength options of the String value table. I understand that this
solution makes life a little bit more complex for implementers (as
appropriate prefix definitions may need to be inserted upstream in a
streaming API), but does not actually create additional complexity when
using a typed API or direct encoding/decoding from a data structure.
This second alternative is our preferred one.

The second case is obviously less critical, as it is not usual. However,
it seems that it could be fixed by simply requiring the xsi:nil
attribute to occur before the xsi:type one. In such a case, the
processor would know, when encountering the xsi:type attribute, whether
to select the corresponding Type or TypeEmpty grammar. It would also
have the side effect of simplifying the spec, by removing the case where
AT(*) can be matched by xsi:nil. This would however create a (small)
backward incompatibility with the current version of the spec.

Best regards

Antoine Mensch

Received on Wednesday, 15 September 2010 23:30:18 UTC