- From: Vogelheim, Daniel <daniel.vogelheim@siemens.com>
- Date: Tue, 7 Aug 2007 16:47:16 +0200
- To: "Costello, Roger L." <costello@mitre.org>, <public-exi@w3.org>
Hello Roger,

Thanks for your interest in EXI!

Before I proceed to your actual questions, let me point you to the EXI Measurements Note, which we re-published just after the EXI format draft:

http://www.w3.org/TR/2007/WD-exi-measurements-20070725/

In order to have an objective basis for future decisions, the EXI WG spent significant time measuring candidate implementations over as wide a selection of test data as we could assemble before deciding on a base format. The candidates measured include nearly everything you asked about: plain XML, XML + gzip, FastInfoset, and ASN.1 with the PER and BER encoding rules. The results are summarized in the Measurements Note and, if desired, raw test reports are available directly through W3C's CVS repository.

As far as actual performance and compactness are concerned, this data set probably provides a much better (reproducible, documented, and unbiased) basis for your own decisions than any advice I could give you. But, of course, I do know that in the real world many additional factors (e.g. installed base) tend to influence or dominate the decision making. It's just that I can't really help you with that.

In addition to the Measurements Note, the EXI WG intends to publish a 'Best Practices' note, which may provide additional guidance. The exact content of this document is not fixed yet, as it is still in the making.

Now, on to your questions:

> Below are various choices for sending XML across the wire. Would this
> working group provide guidance on when to use each choice? and the
> advantages and disadvantages of each choice?

As explained above, the EXI WG has provided performance and compactness measurements, and probably will provide additional information. However, I suspect we won't ever publish something that will exactly answer your questions in the way you pose them, as a complete evaluation would involve many factors outside of WG access or control.

Some (personal!) comments on your choices:

> 1. Send the XML document as is, [...] without any compression [...]

If XML does what you want, at costs that you (and your communication partners) find acceptable, then use it. In this case, look no further.

> 2. Compress the XML document using a compression tool such as
> WinZip or Bzip, [...]

Note that general-purpose compression is generally a trade-off, which buys you one property (compactness) at the expense of another (processing). While that is acceptable for some cases, it is not so much for others. EXI allows you to improve both simultaneously.

Furthermore, the EXI measurements show EXI to quite consistently outperform XML with gzip (using deflate, the same format/algorithm WinZip usually uses), in both compaction and speed. So as far as XML transmission is concerned, I would consider EXI to be the more general solution: In every use case where compression does well, EXI seems to do better. In the many use cases where compression is not acceptable, EXI still provides many benefits.

Of course, the installed base of deflate/gzip/zip may turn out to be a strong argument.

> 3. Use the compression capabilities inherent in HTTP (gzip content
> encoding, i.e. http+gzip)

I think the HTTP issue is orthogonal to your other questions. My hope is that EXI could be registered as an HTTP content encoding, too, so it would work seamlessly with HTTP's built-in content negotiation just as http+gzip does. Except better, as explained above. :-)

This will hopefully be covered in the 'Best Practices' note, once available.

> 4. Encode the XML document as an ASN.1 BER, DER, or PER file [...]

I am very certain that using ASN.1 for XML transmission will be popular in all applications that already employ ASN.1 for any of its other virtues, e.g. in any non-XML context.
My understanding is that ASN.1 hasn't been too well received in the XML community at large, presumably because it usually requires exact adherence to the schema and also drops non-declared content (such as comments, PIs, namespace declarations, ignorable whitespace, etc.). While this is largely a non-issue in the ASN.1 world, it tends to be a tad unpopular with the XML crowd. EXI, being a true XML technology, can encode the full XML InfoSet and supports arbitrary schema deviations.

Due to some similarities in the content models of EXI and ASN.1, I would expect EXI to play nicely with ASN.1. My hope is that the ASN.1 tool vendors will embrace EXI, so ASN.1 and EXI can cooperate, rather than making this an either-or issue.

Additionally...

> 4. Encode the XML document as an ASN.1 BER, DER, or PER file [...]
> 5. Encode the XML document as a Fast Infoset file, [...]

Both FastInfoset and ASN.1 (PER (aligned) and BER) were measured as part of the EXI WG effort. The chosen EXI baseline has consistently won on compactness against all of them, and usually won or came close to the leading candidate in processing time. (The measurements showed a leading group of several candidates, whose relative performance was usually contained within a relatively small band, with various members of said group taking 1st place once in a while. I suspect implementation aspects, such as the time spent optimizing the respective implementations, were greater factors in determining the exact placement than actual format properties.) Please refer to the EXI Measurements Note for details.

> 6. Encode the XML as an Efficient XML Interchange file, and then send
> the EXI file.

Well now, I personally think that is an *excellent* plan! :-)

Of course, having said all that, I have to point out that the caveat I voiced initially still applies: Any real-world deployment decision will have to involve a number of factors beyond the actual format capability.
As far as capability goes, we hope people will carefully examine and repeat our measurements and will come to the same conclusions as we did.

Roger, if you think there is anything the EXI WG can do to help people make such a decision, please let us know!

Sincerely,
Daniel Vogelheim
Received on Tuesday, 7 August 2007 14:48:17 UTC