Re: [binaryXML-30] Binary XML problem statement.

This is a useful starting point.  My major suggestion is that "Binary XML" 
should be treated merely as a label for this cluster of issues, and not as 
the outline of a solution.  I think it would be more effective to think of 
the problem here as *optimized* representations of the information in an 
XML document (or serializations of the Infoset, if you will).  Beginning 
with a discussion of optimization makes one ask the question "what property 
is being optimized."  Chris Lilley has a nice list of properties that 
alternative XML serializations could address, including Network Efficiency, 
Storage Efficiency, Data typing, Random Access, and Interoperability. [I 
must confess that the Trust Boundaries issue makes no sense to me in this 
context, but that's another discussion]. XML 1.0 appears in retrospect to 
be a nice compromise among all these criteria, obviously weighting 
interoperability the heaviest but not doing too much violence to the others 
(or allowing them to be layered on, e.g. datatype information inserted by a 
schema validator). Chris seems to be arguing that no "binary" format could 
optimize all these simultaneously, and that's certainly true.

A more interesting question, which I think is in the TAG's domain, is 
whether the XML 1.0 compromise, one-serialization-fits-all approach is the 
*only* one that the W3C should standardize.  Would there be benefit in 
blessing others that optimize some other property or set of compatible 
properties?  For example, let's say that a "space-optimized" version of XML 
was standardized (it might well be a gzip of XML 1.x), and a 
"speed-optimized" version was also available.  Via HTTP content negotiation or 
out-of-band agreement, producers and consumers of XML could decide which 
format to use in a particular application or network/processing context. 
The downside of that would be that interoperability would suffer -- an 
application that didn't understand the format negotiation mechanism might 
get an "XML" format that it didn't recognize.  (Making the default format 
unicode-with-angle-brackets would probably handle most of these issues in 
practice, but the guaranteed, universally interoperable *principle* would 
clearly be violated.)  The upside of standardization is the network 
effect -- a small number of standardized formats, rather than a large number 
of proprietary and non-interoperable formats, increases the overall "value" of 
the system by maximizing the number of nodes that can *efficiently* 
interoperate.

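To make the "space-optimized (it might well be a gzip of XML 1.x)" idea concrete, here is a minimal sketch in Python.  The document and the sizes are purely illustrative; the point is only that a data-oriented XML payload with repetitive markup compresses well with a generic tool like gzip, while the text serialization round-trips losslessly.

```python
import gzip

# A repetitive, data-oriented XML 1.x document (illustrative only).
xml_doc = (
    "<?xml version='1.0' encoding='UTF-8'?><items>"
    + "".join(f"<item id='{i}'>value-{i}</item>" for i in range(1000))
    + "</items>"
)
raw = xml_doc.encode("utf-8")

# The "space-optimized" representation: just gzip the text serialization.
packed = gzip.compress(raw)

print(f"plain: {len(raw)} bytes, gzipped: {len(packed)} bytes")
print(f"ratio: {len(packed) / len(raw):.2f}")
```

Note that this optimizes only the storage/network-efficiency property: the consumer still pays the full cost of decompressing and then parsing angle brackets, which is exactly why a separately "speed-optimized" representation might also be wanted.
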
So, the discussion points for the TAG would seem to be:

- Is the conception of alternative standardized representations of the XML 
Infoset that optimize specific properties (such as parsing speed *or* 
network bandwidth consumption) consistent with the overall Web 
architecture? [I personally think so ... it seems little different in 
concept than offering SVG, GIF, JPEG, and PDF representations of an image; 
you could also ask for SVG (XML 1.x), SVG-fast (optimized for fast parsing), 
SVG-compressed, etc.]

- Would the benefits of alternative "standard" serializations outweigh the 
costs in complexity / interoperability?  [Good question, but I would tend 
to say "yes" for the usual reasons: this stuff is happening out there in 
the wild, and better to try to coordinate and standardize within a common 
architectural framework than to have one or more of these things create a 
"fork" that destroys interop anyway.]

- Should some specific WG or CG be asked to look into alternative 
standardizations of the Infoset serialization format? [I personally doubt 
it; there isn't much interest or expertise in this inside the W3C that I'm 
aware of, but if some outside group does something like this within the 
constraints of the Webarch, you could say it would be a Good Thing.]

Mike Champion

Received on Wednesday, 19 February 2003 13:43:32 UTC