- From: Robin Berjon <robin.berjon@expway.fr>
- Date: Wed, 11 Dec 2002 20:46:47 +0100
- To: www-tag <www-tag@w3.org>
Hi,

as DanC aptly pointed out, discussions on binary infosets tend to quickly revolve around similar structures, leading to sterile stalemates. To help avoid that, I have summarized a little information, in part stemming from the recent xml-dev thread on the topic[1]. I won't go into specifics about existing formats, preferring instead to limit the scope of this post to what binary infosets are and what, in my experience, people expect from them.

It's Not XML
============

Despite the language that is colloquially used to describe them ("binary XML"), binary infosets are *not* XML, and no one is pretending they are. They don't intend to compete with XML. XML is more than just a way to serialize an infoset (in fact that's a backwards way of seeing it); binary infosets, on the other hand, are just that. It just so happens that some applications only need to pass around an infoset, and there are cases in which a highly efficient serialization of that infoset is a highly desirable way, if not the only way, of doing it (most of the time due to resource constraints).

Typical Features
================

Being on the receiving end of feature requests for binary infosets, I've seen a relatively wide set of needs expressed. The requirements tend to vary according to whether one is dealing with mobile, embedded, broadcast, web services, etc. people, but they often overlap across communities (if only because those sectors do). Here is a quick list off the top of my head:

- Size. Binary infosets ought to be as compact as possible.
- Speed. They should be faster to read than XML is to parse, and thus faster than generically compressed XML, which still has to be parsed after decompression.
- Genericity. They should be applicable to any infoset. Requiring a schema is often OK, but should not be needed in all cases (and even given a schema, it is generally required that arbitrary extensions be includable without prior definition).
- Memory Efficiency.
  It should be possible to use the binary infoset as an in-memory representation of a DOM (or similar) with lazy decoding of the content.
- Streamability. It should be possible to produce a binary infoset stream that can be picked up at an arbitrary position and still be made sense of. This functionality may require that the binary infoset be split up into subtree fragments that can be independently understood (this is easier than it usually sounds to people unused to stream applications).
- Skippability. It is often desirable to skip entire subtrees, either because you don't need them or because you know you won't understand them, with minimal cost (i.e. without parsing the subtree).
- Change Resilience. In the case of schema-constrained infosets, new versions of the schema should not require applications using the old schema information to be upgraded; even in case of radical change, it should be possible to send the information to both new and old applications. This relies on the previous feature (skippability).
- Fault Tolerance. If a fragment is lost during transmission, it should impact the result as little as possible. This is linked to streamability.

For different communities the above requests will have different rationales, but those are generally the things that I see.

It's Already Happening
======================

People are already creating binary formats for the infoset, and either ratifying them as part of larger standards (MPEG, TV Anytime, 3GPP...) or using them within their own projects. The latter is not much of a concern to me, but I see a problem with the former because:

- in a number of cases, the binary format is not used within a closed and controlled context, but rather in an open, Web-related situation;
- the format is most of the time ad hoc, and limited to that vertical industry consortium's standard(s);
- the format is most of the time encumbered;
- ad hoc formats are of varying quality, to say the least.

This is progressively leading us to a balkanized situation in which I can very well imagine that if I wanted to send XHTML+SVG to a device, I'd have to use different binary encodings for XHTML and for SVG, specified separately by two different consortia (possibly creating my own kludge to wrap both). I'd also have to pay royalties to do that, and it might be a poor format with serious issues. Chances are that making that balkanized mess interoperable will be difficult, if not impossible.

[1] http://lists.xml.org/archives/xml-dev/200212/threads.html#00159
    http://lists.xml.org/archives/xml-dev/200212/threads.html#00175

-- 
Robin Berjon <robin.berjon@expway.fr>
Research Engineer, Expway
7FC0 6F5F D864 EFB8 08CE 8E74 58E6 D5DB 4889 2488

Received on Wednesday, 11 December 2002 14:47:20 UTC