Re: question: Increasing factor for XML vs Binary from Stephen D. Williams on 2004-11-18 (public-xml-binary@w3.org from November 2004)

From: Stephen D. Williams <sdw@lig.net>
Date: Thu, 18 Nov 2004 12:36:14 -0500
To: Mike Champion <mc@xegesis.org>
Cc: public-xml-binary@w3.org
Message-ID: <419CDD8E.7060902@lig.net>

I've been talking about this approach for a long time, at least since 
1998 and especially during the last two years relative to esXML.  I 
guess it was buried with a lot of other ideas so it wasn't noticed by 
most.  I can't remember seeing anyone else proposing it for XML or for a 
generalized binary format, but the idea of course has a long history in 
uses such as SLIP/PPP header compression, application binary patches by 
differencing, and copy-on-write (COW) volumes and filesystems.

The "Delta" property explains this relative to our W3C work; it may not 
be published quite yet.
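
To make the general idea concrete, here is a minimal sketch in Python.
It is not esXML and not the W3C "Delta" property; it only assumes, for
illustration, that each message is a flat set of named fields and that
both sides already hold the previous message.  The names make_delta and
apply_delta and the sample fields are hypothetical.

```python
# Hypothetical sketch of a field-level delta between two flat records.
# Assumption: sender and receiver both already hold `base`.

_MISSING = object()  # sentinel so a stored None is not confused with "absent"


def make_delta(base, new):
    """Describe how to turn `base` into `new`: changed/added fields and removed keys."""
    changed = {k: v for k, v in new.items() if base.get(k, _MISSING) != v}
    removed = [k for k in base if k not in new]
    return {"set": changed, "del": removed}


def apply_delta(base, delta):
    """Rebuild the new record from `base` plus a delta produced by make_delta."""
    result = dict(base)
    result.update(delta["set"])
    for key in delta["del"]:
        result.pop(key, None)
    return result


base = {"type": "presence", "from": "alice@example.org",
        "to": "bob@example.org", "show": "away"}
new = {"type": "presence", "from": "alice@example.org",
       "to": "bob@example.org", "show": "chat"}

delta = make_delta(base, new)
assert apply_delta(base, delta) == new
```

Only the one field that changed has to travel; the receiver rebuilds
the full message from the copy it already holds.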

sdw

Mike Champion wrote:

>Yes, I just became aware of this approach yesterday at Michael Leventhal's talk at XML 2004.  IIRC he indicated that he knew of no studies applying it to XML.
>
>I can see this as very valuable in certain scenarios, but I need help envisioning how it could be a generic binary XML format.
>-----Original Message-----
>From:  Stephen D. Williams
>Date:  11/18/04 11:43 am
>To:  Mike Champion 
>Cc:  Silvia.De.Castro.Garcia@esa.int,  public-xml-binary@w3.org
>Subj:  Re: question: Increasing factor for XML vs Binary
>
>One thing that is missing from a lot of these analyses is what could be 
>saved by being able to do deltas.  In a situation where there is any 
>kind of repetition such as protocol messages (in XMPP), records of some 
>kind in a stream or file, or a request/response, the ability to send 
>only what's different efficiently may use less CPU and be more efficient 
>than even schema-based solutions.
>
>I plan to benchmark and demonstrate this kind of solution soon.  There 
>is a way to use the idea of a delta in a way that is very schema-like, 
>but isn't so firmly tied to a schema.  Use in a 'header compression' 
>style is even more powerful although it is somewhat more entangled in 
>the semantics of the application.
>
>sdw
>
>Mike Champion wrote:
>
>>Sigh, most of that was lost somewhere ... I'm on a handheld ...
>>
>>I'll interpret this as 'how much of a compression factor can be achieved by using a binary vs XML encoding of the same data.'  The usual answer, I'm afraid: it depends.  As best I recall from a literature survey:
>>
>>larger docs compress better than small,
>>
>>you can get more compression if you use more CPU (and hence battery) power,
>>
>>you can get very good compression if you assume that the schema is known to both sides and docs are valid instances.
>>
>>My recollection is that 5:1 compression is realistic for arbitrary XML and 10:1 and higher is feasible with shared schemas.
>>
>>
>>-----Original Message-----
>>From:  Silvia.De.Castro.Garcia@esa.int
>>Date:  11/4/04 8:56 am
>>To:  public-xml-binary@w3.org
>>Subj:  question: Increasing factor for XML vs Binary 
>>
>>Hi all,
>>       I would like to know the approximate size increase factor of 
>>the XML format with respect to the equivalent binary product.  That is, 
>>what order of magnitude of overhead should be expected from using XML 
>>instead of a binary format?
>>
>>Thank you very much,
>>Best regards,
>>
>>Silvia de Castro.
>>


-- 
swilliams@hpti.com http://www.hpti.com Per: sdw@lig.net http://sdw.st
Stephen D. Williams 703-724-0118W 703-995-0407Fax 20147-4622 AIM: sdw
Received on Friday, 19 November 2004 05:21:22 UTC