Re: Base64 -- do we really want/need line breaks every 76 characters?

I spoke to Michael Sperberg-McQueen about this (as co-Chair of Schema, and 
as Chair of XMLCG on the canonical definition of DTD [a]) at EuropeXML, 
and he agreed to round up a response on both questions.

[a] http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0103.html

Michael, as discussed, can/should [b] be read as defined by SOAP?

>The SOAP 1.1 submission [2] removes the line length limitation in their
>use of Base64; Section 5.4.3 of SOAP reads as follows:
>
>    The recommended representation of an opaque array of bytes is the
>    'base64' encoding defined in XML Schemas [10][11], which uses the
>    base64 encoding algorithm defined in 2045 [13]. However, the line
>    length restrictions that normally apply to base64 data in MIME do
>    not apply in SOAP. A "SOAP-ENC:base64" subtype is supplied for use
>    with SOAP.

[b] http://www.w3.org/TR/xmlschema-2/#base64Binary

At 08:01 5/24/2001 +0900, Martin Duerst wrote:
>After seeing all the discussion, I'm okay with long lines as such.
>But there is still the problem that XML Schema doesn't allow that,
>because it references RFC 2045 directly, without anything else.
>This is a problem on both sides:
>
>- XML Signature cannot use the XML Schema datatype as it stands
>   (and extension or restriction won't work here)
>- XML Schema should consider changing their definition of
>   Base64 to include longer lines, because it seems that that's
>   widely used in practice. Whether that can be done as a corrigendum
>   to Schema 1.0 or whether that has to go into Schema 1.1, I don't
>   know.
>
>I have copied www-xml-schema-comments. Schema experts, please see
>the other messages in this thread
>(http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0183.html).
>
>Regards,   Martin.
>
>
>At 17:31 01/05/22 -0700, Brian LaMacchia wrote:
>>Folks--
>>
>>Currently, XMLDSIG references RFC 2045 (one of the MIME RFCs) for a
>>definition of Base64 encoding/decoding.  (See section 6.8 of [1].)  It
>>has been pointed out to me that RFC 2045 *requires* that Base64-encoded
>>content have line breaks at least every 76 characters.  Paragraph 6
>>reads as follows:
>>
>>    The encoded output stream must be represented in lines of no more
>>    than 76 characters each.  All line breaks or other characters not
>>    found in Table 1 must be ignored by decoding software.  In base64
>>    data, characters other than those in Table 1, line breaks, and other
>>    white space probably indicate a transmission error, about which a
>>    warning message or even a message rejection might be appropriate
>>    under some circumstances.
>>
>>I can't see any reason for XMLDSIG to inherit a line-length limitation
>>that appears to have been there for mail-specific reasons.  The SOAP 1.1
>>submission [2] removes the line length limitation in their use of
>>Base64; Section 5.4.3 of SOAP reads as follows:
>>
>>    The recommended representation of an opaque array of bytes is the
>>    'base64' encoding defined in XML Schemas [10][11], which uses the
>>    base64 encoding algorithm defined in 2045 [13]. However, the line
>>    length restrictions that normally apply to base64 data in MIME do
>>    not apply in SOAP. A "SOAP-ENC:base64" subtype is supplied for use
>>    with SOAP.
>>
>>I propose that XMLDSIG adopt language similar to SOAP and not require
>>applications to insert line breaks at least every 76 characters.
>>(Conforming implementations will still accept line-limited encodings
>>since they have to ignore any found whitespace in the Base64 string.)
>>
>>                                         --bal
>>
>>[1] http://www.ietf.org/rfc/rfc2045.txt
>>[2] http://www.w3.org/TR/SOAP/
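
To illustrate Brian's point that dropping the line-length rule need not
break interoperability, here is a rough Python sketch. It is not taken
from any XMLDSIG implementation; the sample payload and the helper name
decode_ignoring_whitespace are made up for the example. It encodes the
same bytes once with MIME-style 76-character lines and once as a single
long line, then decodes both after discarding whitespace.

```python
import base64

# Arbitrary sample data (an assumption for illustration, not from the thread).
payload = bytes(range(100))

# MIME-style output: a newline after at most 76 encoded characters.
wrapped = base64.encodebytes(payload)

# Single-line output with no line breaks at all.
unwrapped = base64.b64encode(payload)


def decode_ignoring_whitespace(encoded: bytes) -> bytes:
    """Hypothetical helper: drop all ASCII whitespace, then decode strictly."""
    stripped = b"".join(encoded.split())
    return base64.b64decode(stripped, validate=True)


# Both forms decode to the same bytes.
same = decode_ignoring_whitespace(wrapped) == decode_ignoring_whitespace(unwrapped)
print(same)
```

A decoder written this way accepts line-limited and unbroken encodings
alike, so only the encoder side is affected by the proposed change.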


--
Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature
W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/

Received on Wednesday, 30 May 2001 12:34:59 UTC