RE: Memory requirements of binary data

From: Biron,Paul V <Paul.V.Biron@kp.org>
Date: Tue, 15 Aug 2000 10:31:43 -0700
Message-Id: <376E771642C1D2118DC300805FEAAF4386DDA9@pars-exch-1.ca.kp.org>
To: www-xml-schema-comments@w3.org
> -----Original Message-----
> From:	Ace [SMTP:Ace@AceProgrammer.com]
> Sent:	Monday, August 14, 2000 7:45 PM
> To:	www-xml-schema-comments@w3.org
> Subject:	Memory requirements of binary data
> 
> I have a need to use the binary datatype (encrypting some information in
> the document). It seems that I can encode binary data in two formats (mime
> and hex). My question is this: How many bits does it take to pass 8 bits
> of binary data?
> 
> I've no experience with the mime type, so I can't even take a guess.
> In the case of hex, since the byte is translated to two hex characters,
> does it take 16 or 32 bits (because XML is unicode)?
> 
Sorry about not responding to your earlier message on this subject.

For hex, each byte of binary data is encoded as two 7bit characters.  For
base64, each group of 3 bytes of binary data is encoded as four 7bit
characters (a final partial group is padded out to four), so a single byte
on its own takes four characters.  As it says in Section 6.8 of RFC 2045
(where base64 is defined) [1]:

	The encoding and decoding algorithms are simple, but the encoded
	data are consistently only about 33 percent larger than the unencoded
	data.

So, in general, I'd say that base64 is more "efficient".
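
To make the comparison concrete, here is a minimal sketch in Python (my
choice of language for illustration, nothing the datatypes spec requires).
The helper name encoded_lengths is hypothetical.  It counts the characters
each form needs and ignores the line breaks that MIME transport would add
to long base64 bodies:

```python
import base64
import binascii


def encoded_lengths(data: bytes):
    """Return (hex_chars, base64_chars) needed to represent `data`."""
    hex_chars = len(binascii.hexlify(data))   # always 2 characters per byte
    b64_chars = len(base64.b64encode(data))   # 4 characters per 3-byte group, padded
    return hex_chars, b64_chars


# Compare a single byte, one full 3-byte group, and a larger payload.
for n in (1, 3, 300):
    print(n, encoded_lengths(bytes(n)))
```

For a single byte hex is actually shorter (2 characters against 4, because
of the padding); base64 only pulls ahead once there are a few bytes to
encode.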

However, the number of bytes necessary for either form depends on the
character encoding (e.g., UTF-8 vs. UTF-16) used for the XML entity in
question.  Since both hex and base64 use a restricted subset of ASCII,
sending them in a UTF-8 encoded XML entity requires only 1 byte for each
7bit character.  If the XML entity is encoded with UTF-16, then each 7bit
character requires 2 bytes.  So, to answer your question directly: 8 bits
of binary data in hex take 16 bits in UTF-8 and 32 bits in UTF-16.
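
The same arithmetic as a rough sketch in Python, again purely illustrative.
The helper name wire_bits and the sample byte 0xA5 are my own assumptions,
and I use the "utf-16-be" codec so that the 2-byte byte order mark, which
appears once per entity and not once per character, is left out of the
count:

```python
import base64


def wire_bits(text: str, charset: str) -> int:
    """Number of bits `text` occupies when serialized in `charset`."""
    return 8 * len(text.encode(charset))


payload = bytes([0xA5])                               # 8 bits of binary data
hex_text = payload.hex()                              # two hex characters
b64_text = base64.b64encode(payload).decode("ascii")  # four base64 characters

for label, text in (("hex", hex_text), ("base64", b64_text)):
    for charset in ("utf-8", "utf-16-be"):
        print(label, charset, wire_bits(text, charset))
```

Under these assumptions the single byte costs 16 or 32 bits as hex and 32
or 64 bits as base64, depending on whether the entity is UTF-8 or UTF-16.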

Hope this helps,

pvb
Received on Tuesday, 15 August 2000 13:45:29 UTC
