base64 encoding

XML Schema provides an XML representation for binary data using the base64
encoding, which enables, among other things, binary data to be conveyed in
SOAP messages (albeit with some overhead). Various clarifications have been
requested from the XML Schema WG regarding the exact nature of this
representation. In response to these requests, the XML Schema WG has
decided that:
i) the base64 binary encoding will _not_ require a newline or whitespace
character every 76 characters.
ii) the lexical space of characters for the base64 binary encoding will be
limited to the 65 characters in the base64 alphabet (a-z A-Z 0-9 +/=) and
XML 1.0 whitespace characters.

This decision will be published shortly in a document from the XML Schema
WG.
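
As an illustration only, decisions (i) and (ii) could be sketched as a
character-level check. The function name in_lexical_space is hypothetical,
and the sketch assumes that "XML 1.0 whitespace" means space, tab, carriage
return and line feed. It tests only which characters appear, not where "="
appears or how long the value is.

```python
import re

# The 65 characters named in (ii): a-z, A-Z, 0-9, "+", "/" and "=",
# plus the whitespace characters assumed here for XML 1.0
# (space, tab, carriage return, line feed).
_LEXICAL_SPACE = re.compile(r"[A-Za-z0-9+/= \t\r\n]*")


def in_lexical_space(text: str) -> bool:
    """Return True if every character of text is in the permitted set.

    Per (i), no line break is required at any interval, so a long
    unbroken run of base64 characters is accepted.
    """
    return _LEXICAL_SPACE.fullmatch(text) is not None
```

Under this sketch an unbroken 100-character value is accepted, and so is a
value split across lines, while a value containing "-" or "_" is rejected.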

Also, the XML Schema WG asks the XML Protocol WG to provide feedback on the
following:
1) A strict reading of RFC 2045 (Base64) is that if the "=" character
appears in a base64 encoded message, it must be in the last position, or in
both the last and the next-to-last positions, and the number of characters
in the message (including any "=" character) must be a multiple of 4. Given
this reading, should these rules be enforced as part of XML Schema lexical
validation?
2) Should there be a canonical form for XML Schema's base64 encoding?
Example canonical form #1: a blank is inserted after each 4 or 8 characters
(like hex). Example canonical form #2: a newline character after every 76
characters.
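
To make questions (1) and (2) concrete, here is a rough sketch of what the
strict rules and the two example canonical forms might look like. All
function names are hypothetical. Two assumptions are not settled by the
text above: whitespace is stripped before counting characters, and the
separator is placed between groups only, not after the final group.

```python
import re

_WHITESPACE = re.compile(r"[ \t\r\n]+")


def _strip_ws(text: str) -> str:
    # Assumption: whitespace does not count toward the length rule.
    return _WHITESPACE.sub("", text)


def passes_strict_padding(text: str) -> bool:
    """Question (1): "=" only in the last position, or in the last and
    next-to-last positions, and total length a multiple of 4.

    This checks the padding and length rules only, not the alphabet.
    """
    s = _strip_ws(text)
    if len(s) % 4 != 0:
        return False
    body = s.rstrip("=")
    padding = len(s) - len(body)
    return "=" not in body and padding <= 2


def _group(text: str, size: int, sep: str) -> str:
    # Separator goes between groups; none is added after the last group.
    s = _strip_ws(text)
    return sep.join(s[i:i + size] for i in range(0, len(s), size))


def canonical_blank_every_4(text: str) -> str:
    """Example canonical form #1, using groups of 4 characters."""
    return _group(text, 4, " ")


def canonical_newline_every_76(text: str) -> str:
    """Example canonical form #2: a newline after every 76 characters."""
    return _group(text, 76, "\n")
```

For instance, passes_strict_padding accepts "SGVsbG8=" but rejects both
"SGVsbG8" (length 7) and "SG=sbG8=" (an interior "="), and
canonical_blank_every_4 turns "SGVsbG8=" into "SGVs bG8=".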

If you (XML Protocol WG members and non-members alike) have feedback on (1)
and/or (2), especially if you have implementation experience with base64
encodings and SOAP, please send your comments to this list. The XML
Protocol WG will be responsible for collecting and evaluating this feedback
and reporting it back to the XML Schema WG.


............................................
David C. Fallside, IBM
Ext Ph: 530.477.7169
Int  Ph: 544.9665
fallside@us.ibm.com

Received on Thursday, 12 July 2001 22:40:15 UTC