RE: Base64 -- do we really want/need line breaks every 76 characters?

There is an additional matter to consider: RFC 2045 says that
Base64-encoded data may contain characters not in the table below, and
that decoding software must ignore them.

It says:

                    Table 1: The Base64 Alphabet

     Value Encoding  Value Encoding  Value Encoding  Value Encoding
         0 A            17 R            34 i            51 z
         1 B            18 S            35 j            52 0
         2 C            19 T            36 k            53 1
         3 D            20 U            37 l            54 2
         4 E            21 V            38 m            55 3
         5 F            22 W            39 n            56 4
         6 G            23 X            40 o            57 5
         7 H            24 Y            41 p            58 6
         8 I            25 Z            42 q            59 7
         9 J            26 a            43 r            60 8
        10 K            27 b            44 s            61 9
        11 L            28 c            45 t            62 +
        12 M            29 d            46 u            63 /
        13 N            30 e            47 v
        14 O            31 f            48 w         (pad) =
        15 P            32 g            49 x
        16 Q            33 h            50 y

   The encoded output stream must be represented in lines of no more
   than 76 characters each.  All line breaks or other characters not
   found in Table 1 must be ignored by decoding software.
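
To make the consequence concrete, here is a minimal Python sketch of a
decoder that follows that last sentence. It is my own illustration, not
text from the RFC: the helper name lenient_b64decode is hypothetical,
and the standard-library base64 module is assumed only as a convenient
way to do the actual decoding. It drops every character outside Table 1
(other than the "=" pad) before decoding, so a 76-column wrapped
encoding and a single long line decode to the same bytes.

```python
import base64
import string

# Table 1 of RFC 2045: the 64 encoding characters for values 0-63.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)


def lenient_b64decode(text: str) -> bytes:
    """Hypothetical helper: ignore anything not in Table 1, then decode."""
    # Keep only alphabet characters and the "=" pad; line breaks, spaces
    # and any other stray characters are silently discarded.
    kept = "".join(ch for ch in text if ch in BASE64_ALPHABET or ch == "=")
    return base64.b64decode(kept)


payload = bytes(range(100))

# encodebytes() emits MIME-style output with a line break after at most
# 76 characters; b64encode() emits one unbroken line.
wrapped = base64.encodebytes(payload).decode("ascii")
unwrapped = base64.b64encode(payload).decode("ascii")

# Both forms decode to the original bytes.
assert lenient_b64decode(wrapped) == payload
assert lenient_b64decode(unwrapped) == payload
```

A decoder written this way accepts long lines without any change, which
is why dropping the 76-character rule costs receivers nothing. Note
that it also hides the "probable transmission error" case the RFC
mentions; a stricter implementation might want to warn instead.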




-----Original Message-----
From: Joseph M. Reagle Jr. [mailto:reagle@w3.org] 
Sent: Wednesday, May 30, 2001 9:35 AM
To: Martin Duerst; cmsmcq@w3.org
Cc: w3c-ietf-xmldsig@w3.org; www-xml-schema-comments@w3.org
Subject: Re: Base64 -- do we really want/need line breaks every 76 characters?


I spoke to Michael Sperberg-McQueen about this (as co-Chair of Schema,
and as Chair of XMLCG on the canonical definition of DTD [a]) at
EuropeXML, and he agreed to round up a response on both questions.

[a] http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0103.html

Michael, as discussed, can/should [b] be read as defined by SOAP?

> The SOAP 1.1 submission [2] removes the line length limitation in
> their use of Base64; Section 5.4.3 of SOAP reads as follows:
>
>    The recommended representation of an opaque array of bytes is the
>    'base64' encoding defined in XML Schemas [10][11], which uses the
>    base64 encoding algorithm defined in 2045 [13]. However, the line
>    length restrictions that normally apply to base64 data in MIME do
>    not apply in SOAP. A "SOAP-ENC:base64" subtype is supplied for use
>    with SOAP.

[b] http://www.w3.org/TR/xmlschema-2/#base64Binary

At 08:01 5/24/2001 +0900, Martin Duerst wrote:
>After seeing all the discussion, I'm okay with long lines as such. But 
>there is still the problem that XML Schema doesn't allow that, because 
>it references RFC 2045 directly, without anything else. This is a 
>problem on both sides:
>
>- XML Signature cannot use the XML Schema datatype as it stands
>   (and extension or restriction won't work here)
>- XML Schema should consider changing their definition of
>   Base64 to include longer lines, because it seems that that's
>   widely used in practice. Whether that can be done as a corrigendum
>   to Schema 1.0 or whether that has to go into Schema 1.1, I don't
>   know.
>
>I have copied www-xml-schema-comments. Schema experts, please see the 
>other messages in this thread 
>(http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0183.html).
>
>Regards,   Martin.
>
>
>At 17:31 01/05/22 -0700, Brian LaMacchia wrote:
>>Folks--
>>
>>Currently, XMLDSIG references RFC 2045 (one of the MIME RFCs) for a 
>>definition of Base64 encoding/decoding.  (See section 6.8 of [1].)  It
>>has been pointed out to me that RFC 2045 *requires* that 
>>Base64-encoded content have line breaks at least every 76 characters.
>>Paragraph 6 reads as follows:
>>
>>    The encoded output stream must be represented in lines of no more
>>    than 76 characters each.  All line breaks or other characters not
>>    found in Table 1 must be ignored by decoding software.  In base64
>>    data, characters other than those in Table 1, line breaks, and other
>>    white space probably indicate a transmission error, about which a
>>    warning message or even a message rejection might be appropriate
>>    under some circumstances.
>>
>>I can't see any reason for XMLDSIG to inherit a line-length limitation
>>that appears to have been there for mail-specific reasons.  The SOAP 
>>1.1 submission [2] removes the line length limitation in their use of 
>>Base64; Section 5.4.3 of SOAP reads as follows:
>>
>>    The recommended representation of an opaque array of bytes is the
>>    'base64' encoding defined in XML Schemas [10][11], which uses the
>>    base64 encoding algorithm defined in 2045 [13]. However, the line
>>    length restrictions that normally apply to base64 data in MIME do
>>    not apply in SOAP. A "SOAP-ENC:base64" subtype is supplied for use
>>    with SOAP.
>>
>>I propose that XMLDSIG adopt language similar to SOAP and not require 
>>applications to insert line breaks at least every 76 characters. 
>>(Conforming implementations will still accept line-limited encodings 
>>since they have to ignore any found whitespace in the Base64 string.)
>>
>>                                         --bal
>>
>>[1] http://www.ietf.org/rfc/rfc2045.txt
>>[2] http://www.w3.org/TR/SOAP/


--
Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature
W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/

Received on Wednesday, 30 May 2001 15:00:44 UTC