Re: Review of EXI from noah_mendelsohn@us.ibm.com on 2007-10-26 (www-tag@w3.org from October 2007)

From: <noah_mendelsohn@us.ibm.com>
Date: Fri, 26 Oct 2007 18:28:40 -0400
To: "David Orchard" <dorchard@bea.com>
Cc: "W3C-TAG Group WG" <www-tag@w3.org>, shh@us.ibm.com, haggar@us.ibm.com, klawrenc@us.ibm.com
Message-ID: <OFA346B2D6.8BF3077E-ON85257380.007984B5-85257380.007B4728@lotus.com>
Dave:

Speaking as a TAG member:
Thank you for doing this.  I think there's a lot of good stuff here.  My 
overall impression, FWIW, is that this would have more impact if we went 
into just a bit less detail, and focussed a slightly shorter note on some 
key points.  Don't take this as a significant criticism.  I think the core 
of this is very good, just that a bit of tightening would make it more 
effective.  If that means a few details fall out of it, I'm not sure 
that's all bad.

Speaking as an IBMer:

> - IBM's withdrawal [4] from the Working Group shows their detailed and 
> specific disagreement with the benefits of EXI and remains unaddressed. 

Thank you for referencing this note; we're very glad that our concerns are 
viewed as significant for this discussion.  I do note that, like several 
other references from your note, the link is to member-confidential email. 
 I doubt that anyone in IBM would object to having a copy posted in a 
public archive, and if you'd like I could in principle check with my IBM 
colleagues.  I say in principle because our note also refers to some 
correspondence among a smaller group of individuals in the W3C.  It was 
originally written as input to the chairs and W3C staff, and I think we 
would need the permission of these individuals as well.  Shall we try and 
ask all these folks whether they're OK with at least IBM's note being 
posted in a public place?  If not, I think it will be frustrating for your 
readers to find the link and not be able to follow it.

> Just recently a working group member said "The same goes for 
> IBM, if they really thought the result tells something 
> important, they would have spent some more time establishing a 
> case out the result against EXI. "[5]

Again, this is a member only link.  I think you've taken a small enough 
bit that it's sort of out of context.  The quote appears to imply that 
surely IBM did not take its concerns about EXI seriously, as we would 
otherwise have "spent more time establishing the case...against EXI".  In 
fact, our work on XML performance was done over several years, and just 
the comparison work we did on EXI involved several weeks of work in 
particular.  Those comparisons were presented in a quite detailed 
presentation to the EXI workgroup, and I happened to be there as an 
observer that day.   I think it's only fair if you are going to have a 
quote like this that you ensure that the entire email at [5] is publicly 
accessible so that people can draw their own conclusions about it.

I think it's fair to say that IBM believes that EXI offers interesting 
compression on XML, and some speed gains in many use cases;  we also think 
that the speed gains over well optimized text implementations are not 
nearly as great as might be inferred from the measurements presented by 
the EXI group.  The issues include many that you've put into your note, 
Dave (choice and weighting of test cases, use of Java, etc.) 

More to the point, I think you're trying to make the case that the EXI 
workgroup didn't take our concerns seriously (I offer no opinion on that 
-- while our concerns have not been satisfied, I wouldn't want to accuse 
anyone in the EXI group of not taking them seriously).  In fact, the quote 
more directly seems to imply that we in IBM did not take our own concerns 
seriously, and that I must object to (though the author is surely 
entitled to express that opinion).

Maybe it would be better to just point out that we were among those who 
raised concerns, possibly linking our note if we get it made public, and 
to indicate that we in IBM do not believe the performance analysis in the 
EXI drafts addresses our concerns.  If you want to additionally say that 
the authors of [5] think we did not take our own concerns seriously enough 
to give them force, of course you may do so, but I think we'd want a 
chance to briefly and politely rebut that.

Sorry to make a fuss about this, but there's already a lot of heat and 
confusion around this whole issue, and I'm afraid that the current 
fragmented quotes from member-only emails will make it worse and not 
better.

Noah

[4] http://lists.w3.org/Archives/Member/member-exi-wg/2007Mar/0014.html
[5] http://lists.w3.org/Archives/Member/member-exi-wg/2007Sep/0010.html



--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------








"David Orchard" <dorchard@bea.com>
Sent by: www-tag-request@w3.org
10/25/2007 01:12 AM
 
        To:     "W3C-TAG Group WG" <www-tag@w3.org>
        cc:     (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        Review of EXI


Draft TAG response to EXI.  I believe there could be significant negative 
response so I encourage the TAG members to suggest modifications as they 
see fit.
 
The TAG does not consider its concerns raised in May 2005[1] to be 
addressed.  We are unable to ascertain whether our concerns have been 
addressed or not from the measurements document[2] produced by the working 
group.  It appears that the Working Group believes that non-Working Group 
members must prove the WG wrong in its decisions on EXI technologies, 
rather than convincing the world that there is indeed a conclusive case.
 
The measurements document runs to 127 pages of data and analysis but we 
did not see any material in a summary or abstract that addressed our 
concerns.   In the almost 3 months since publication of the document on 
July 25th 2007, the EXI working group has had 28 messages on their mailing 
list, none of which are comments on the measurements document.  During the 
Oct 4th TAG telcon, the TAG spent approximately 30 minutes looking through 
the document but were unable to glean answers to our questions.  We 
believe that the measurements document is too detailed and academic to 
prove the need for any flavours of binary XML or the selection of a 
particular technology, and that it is likely that only a handful of people 
will be able to grasp even the data embodied in the document. 
 
The document states "Based on the measurements described here, the working 
group has selected Efficient XML ([EffXML]) to be the basis for the 
proposed encoding specification to be prepared as a candidate W3C 
Recommendation. " There is no high level supporting evidence for such a 
conclusion. 
 
It appears that the working group has presupposed a conclusion that binary 
xml is necessary and that no concerns by the TAG or member companies on the 
technology and methodologies themselves or the integration with the rest 
of XML (including DSig) need to be addressed directly.  Some examples:
-  A main author of Efficient XML did press outreach [3] that resulted in 
an article called "W3C is working on a solution to bandwidth-hogging, 
clunky XML".
- IBM's withdrawal [4] from the Working Group shows their detailed and 
specific disagreement with the benefits of EXI and remains unaddressed. 
Just recently a working group member said "The same goes for IBM, if they 
really thought the result tells something important, they would have spent 
some more time establishing a case out the result against EXI. "[5]
- the evolution of "the XML stack" including DSig and Encryption is mostly 
ignored [6].  For example, of all the use cases that EffXML solves, what 
cases require a revision of DSig and/or Encryption before deployment?  It 
may very well be the case that the failure of the incompatible evolution 
of XML 1.1 will be duplicated in an incompatible evolution of EXI without 
supporting evolution of technologies. 
- the TAG concerns were never directly addressed.
 
We will go through our documented longstanding concerns in order.
 
"The drawbacks are likely to include reduced interoperability with XML
1.0 and XML 1.1 software, and an inability to leverage the benefits of
text-based formats.  "
 
The Working Group has rejected the use of an encoding in the XML 
Declaration because of the 18 or so characters.  We fail to understand why 
this is an issue.  XML 1.0 allows different encodings and it allows 
default encodings to be defined out of band.  Constrained environments 
with out of band knowledge of binary encoding could assume binary encoding 
on incoming messages, failing over to text xml.  Less tightly coupled 
environments, such as the Web, could use the XML declaration encoding. 
This at least would preserve part of the XML stack and technologies.
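
A minimal sketch of the dispatch described above, under stated assumptions: 
the label "x-binary-xml" is a made-up encoding name (not a registered one), 
the function names are hypothetical, and the XML declaration is assumed to 
be ASCII-compatible.  It only picks a decoder; it does not decode anything.

```python
import re

# Hypothetical label for a binary encoding; not a registered encoding name.
BINARY_ENCODING = "x-binary-xml"

# Matches an XML declaration and captures the optional encoding value.
_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
)

def declared_encoding(data: bytes):
    """Return the lowercased encoding named in the XML declaration, or None."""
    m = _DECL.match(data)
    if m and m.group(1):
        return m.group(1).decode("ascii").lower()
    return None

def choose_decoder(data: bytes, assume_binary: bool = False) -> str:
    """Pick "binary" or "text" for an incoming message.

    assume_binary models out-of-band knowledge in a constrained environment.
    """
    enc = declared_encoding(data)
    if enc == BINARY_ENCODING:
        return "binary"   # the declaration says so (loosely coupled case)
    if enc is not None or data.lstrip().startswith(b"<"):
        return "text"     # looks like ordinary text XML: fail over to text
    return "binary" if assume_binary else "text"
```

With assume_binary=True, a message that still looks like text XML is routed 
to the text parser, which is the fail-over behaviour suggested above.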
 
Continuing further, "In particular, we suggest that a quantitative 
analysis is necessary.
For at least a few key use cases, concrete targets should be set for the
size and/or speed gains that would be needed to justify the disruption
introduced by a new format.  For example, a target might be that "in
typical web services scenarios, median speed gains on the order of 3x in
combined parsing and deserialization are deemed sufficient to justify a
new format."  We further suggest that representative binary technologies
be benchmarked and analyzed to a sufficient degree that such speed or
size improvements can be reasonably reliably predicted before we commit
to a Recommendation.  No doubt, any given set of goals or benchmarks
will suffer from some degree of imprecision, but if the gains are
sufficiently compelling to justify a new format, then they should be
relatively easy to demonstrate.  In short, actual measurements should be
a prerequisite to preparing a Recommendation."
 
The Working Group did not establish concrete targets for size and/or speed 
gains that would justify disruption prior to running measurements.  Actual 
measurements have been done but the absence of clear targets for size 
and/or speed gains even after the measurements means that the 
selection of any technology appears unjustified. 
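
As an illustration of what a concrete target could look like, here is a 
small sketch that checks measured timings against a target fixed in 
advance.  The 3x figure comes from the example in the TAG message quoted 
above; the function name and the sample timings are invented for 
illustration and are not taken from the measurements document.

```python
from statistics import median

def meets_target(text_times, binary_times, target=3.0):
    """Compare per-test-case speed gains against a target set beforehand.

    text_times and binary_times are processing times for the same test
    cases, in the same order and the same unit.
    Returns (median_gain, met).
    """
    ratios = [t / b for t, b in zip(text_times, binary_times)]
    gain = median(ratios)
    return gain, gain >= target

# Made-up timings in milliseconds, purely illustrative.
gain, met = meets_target([9.0, 6.0, 12.0], [3.0, 3.0, 3.0])
print(f"median gain {gain:.1f}x, target met: {met}")
```

The point is the order of events: the target is written down first, and 
the measurements are then judged against it.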
 
Continuing in the TAG message, "In doing such measurements, we believe it 
is essential that comparisons
be done to the best possible text-based XML 1.x implementations, which
are not necessarily those that are most widely deployed.  Stated
differently: 
if XML 1.x is inherently capable of meeting the needs of users, then our
efforts should go into tuning our XML implementations, not designing new
formats.  Benchmark environments should be as representative as possible
of fully optimized implementations, not just of the XML parser, but of
the surrounding application or middleware stack.  We note that different
application-level optimizations may be necessary to maximize the
performance of the Binary or text cases respectively.  Care should
especially be taken to ensure that the performance of particular APIs
such as DOM or SAX does not obscure the performance possible with either
option (e.g. both SAX and DOM can easily result in high overhead string
conversions when UTF-8 is used.)"
 
The Working Group's call for implementations [7] specifically called for 
only XML parsers, not the surrounding application or middleware stacks, 
JDKs or Java Virtual Machines.  The benchmarks have not been run against 
the best possible text-based XML 1.x implementations. 
 
The measurements document acknowledges the issue of stack integration in 
"Stack integration considers the full XML processing system, not just the 
parser. By selectively combining the components of the processing stack 
through abstract APIs, the system can directly produce application data 
from the bytes that were read. Two prominent examples of this technique 
are [Screamer] and [EngelenGSOAP]. Both of these can also be called 
schema-derived as they compile a schema into code. However, neither simply 
generates a generic parser, but rather a full stack for converting between 
application data and serialized XML. This gives a significant improvement 
compared to just applying the pre-compilation to the parsing layer. "  But 
neither of these prominent examples appears in the test data. 
 
Further, there were no "real-world end to end" use cases tested, such as a 
Web service application, a mobile application, etc.  Thus we do not know 
the overall effect of any particular technology on the overall application 
performance. 
 
The measurements document states "To begin with, the XBC Characterization 
Measurement Methodologies Note defines thresholds for whether a candidate 
format achieves sufficient compactness " [8].  The XBC Characterization 
Measurement Methodologies Note[8] states "Because XML documents exist with 
a wide variety of sizes, structures, schemas, and regularity, it is not 
possible to define a single size threshold or percentage compactness that 
an XML format must achieve to be considered sufficiently compact for a 
general purpose W3C standard. "
 
We attempted to determine the differences between Efficient XML and Gzip 
but found the methodology confusing.  The measurements document specifies 
that "In the Document and Both classes, candidates are compared against 
gzipped XML, while in the Neither and Schema cases, the comparison was to 
plain XML".   Examining Document and Both compactness graphs, Gzip appears 
to offer improvements over XML that track the other implementations, with 
the noteworthy point that Efficient XML's improvements over Gzip are 
significant in a large part of the Both chart but similar in the 
Document.   Examining Processing Efficiency graphs, it appears as though 
XML is clearly superior in Java Encoding in Document and Both.  GZip 
appears further inferior but yet all solutions vary wildly around XML in 
Decoding Document and Both.  A worrying statement is "An interesting point 
to note in the decoding results is the shapes of the graphs for each 
individual candidate. Namely, these appear similar to each other, 
containing similar peaks and troughs. Even more interestingly, this is 
also the case with Xals, indicating that there is some feature of the JAXP 
parser that is implemented suboptimally and triggered by a subset of the 
test documents."   The measurements document states "For instance, 
preliminary measurements in the EXI framework indicate that the default 
parser shipped with Java improved noticeably from version 5.0 to version 
6, showing 2-3-fold improvement for some cases. ", and the measurements 
used JDK 1.5.0_05-b05 for Java based parsing and  JDK 1.6.0_02-ea-b01 for 
native.   Perhaps an improved JDK, Java Virtual Machine, or virtualized 
JVM would further improve results. 
This leads us to wonder whether a combination of GZip with improved 
technologies such as Parsers, JDKs, VMs, or even Stack Integration 
technology (that is Schema aware and hence covered under Both and Schema) 
would suffice for the community. 
 
Examining the data sets used, there are a number of military applications 
(ASMTF, AVCL, JTLM)  and yet comparatively few generic "Web service" 
applications.  The Google test suite lists Web services for small devices 
and Web services routing; the Invoice test suite lists Intra/InterBusiness 
Communication which immediately limits its scope to "A large business 
communicates via XML with a number of remote businesses, some of which can 
be small business partners. These remote or small businesses often have 
access only to slow transmission lines and have limited hardware and 
technical expertise. "; and there is a WSDL test suite.    This seems to 
avoid the "common" Web service case of the many Web APIs provided by 
hosted solutions like Google, Yahoo, Amazon, eBay, Salesforce, Facebook, 
MSN, etc.    Examining the test data shows that the Google test cases used 
5 different test cases (0,7,15, 24,30) which includes 1 soap fault (case 
#24).  There are 2 AVCL, 5 Invoice, 8 Location Sightings, 6 JTLM, 5 ASMTF, 
2 WSDL test cases as well.  There appears to be broad based coverage of 
each, though the rationale for the various weightings isn't documented. 
For example, why 4 Google "success cases" and 2 WSDL cases?  Surely there 
are more than 2 times as many SOAP messages as WSDL messages being sent 
around the internet. 
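
To make the weighting question concrete, the sketch below computes each 
test group's share of the total, using only the counts listed in the 
paragraph above.  The variable and function names are illustrative, and 
nothing here reflects how the Working Group itself weighted results.

```python
# Test-case counts as listed in the paragraph above.
counts = {
    "Google": 5,
    "AVCL": 2,
    "Invoice": 5,
    "Location Sightings": 8,
    "JTLM": 6,
    "ASMTF": 5,
    "WSDL": 2,
}

def shares(counts):
    """Return each group's fraction of the total number of test cases."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

# Print groups from largest to smallest share.
for name, share in sorted(shares(counts).items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {counts[name]:2d}  {share:6.1%}")
```

A table like this, alongside a stated rationale, would let readers judge 
whether the mix matches the traffic they care about.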
 
In conclusion, W3C TAG's concerns have not been addressed and the W3C TAG 
does not support an incompatible change to the XML Stack. 
 
Cheers,
Dave
 
[1] http://lists.w3.org/Archives/Public/www-tag/2005May/0044
[2] http://www.w3.org/TR/2007/WD-exi-measurements-20070725/
[3] http://www.sdtimes.com/article/latestnews-20071015-10.html
[4] http://lists.w3.org/Archives/Member/member-exi-wg/2007Mar/0014.html
[5] http://lists.w3.org/Archives/Member/member-exi-wg/2007Sep/0010.html
[6] http://lists.w3.org/Archives/Member/member-exi-wg/2007Sep/att-0005/00-part
[7] http://lists.w3.org/Archives/Public/public-exi/2006Mar/0004.html
[8] http://www.w3.org/TR/xbc-measurement/
Received on Saturday, 27 October 2007 14:44:21 UTC