RE: Base64 -- do we really want/need line breaks every 76 characters?

It looks to me as if we need to respecify Base64 instead of relying on the
RFC 2045 spec.

I have thought for quite a while about writing a Base32 spec. There are many
circumstances in which a compact alphanumeric encoding is required but case
cannot be differentiated (e.g. reading over the telephone). It is also good
to get rid of numbers that can be mistaken for letters (0, 1) and the
corresponding letters (O, l, I).
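
To make the idea concrete, here is a rough Python sketch of such an encoding.
The alphabet is purely a hypothetical choice for illustration (upper-case
letters without I and O, plus the digits 2-9, which gives exactly 32 symbols);
nothing here comes from an existing spec, and the names b32_encode and
b32_decode are made up. Decoding folds case so the text can be read aloud.

```python
# Hypothetical 32-symbol alphabet: A-Z without I and O, digits 2-9.
# The digits 0 and 1 are left out because they resemble O and I/l.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


def b32_encode(data: bytes) -> str:
    """Encode bytes as 5-bit groups, most significant bits first, no padding."""
    bits = 0    # bit accumulator
    nbits = 0   # number of valid bits currently in the accumulator
    out = []
    for byte in data:
        bits = (bits << 8) | byte
        nbits += 8
        while nbits >= 5:
            nbits -= 5
            out.append(ALPHABET[(bits >> nbits) & 0x1F])
        bits &= (1 << nbits) - 1  # keep only the bits not yet emitted
    if nbits:
        # Left-align the remaining bits in a final 5-bit group (zero filled).
        out.append(ALPHABET[(bits << (5 - nbits)) & 0x1F])
    return "".join(out)


def b32_decode(text: str) -> bytes:
    """Decode case-insensitively; raises ValueError on an unknown symbol."""
    bits = 0
    nbits = 0
    out = bytearray()
    for ch in text.upper():
        bits = (bits << 5) | ALPHABET.index(ch)
        nbits += 5
        if nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
        bits &= (1 << nbits) - 1
    # Any leftover bits are the zero fill added by the encoder; drop them.
    return bytes(out)
```

A real draft would also have to decide on padding, on whether a lower-case
"l" in the input should be rejected or mapped, and on a check character.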

We could write a draft for B64 and B32. B64 would be identical to Base64 but
without the line break characters and B32 would be new.
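
For what it is worth, the difference between the two forms is easy to show
with the Python standard library, used here only as an illustration:
base64.encodebytes emits MIME-style lines of at most 76 characters, while
base64.b64encode emits the same characters with no line breaks, which is
what B64 would amount to. The sample data is arbitrary.

```python
import base64

data = bytes(range(100))  # arbitrary sample input

# MIME-style output: a newline after every 76 encoded characters.
mime_style = base64.encodebytes(data)

# Unbroken output: the same characters with no line breaks.
b64 = base64.b64encode(data)

# The two forms differ only in the line breaks.
assert b64 == mime_style.replace(b"\n", b"")

# With its default settings, b64decode discards characters outside the
# Base64 alphabet, so both forms decode to the same bytes.
assert base64.b64decode(mime_style) == data
assert base64.b64decode(b64) == data
```

So a decoder that ignores whitespace accepts both forms, and only the
encoder side needs to change.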

	Phill
 
Phillip Hallam-Baker FBCS C.Eng.
Principal Scientist
VeriSign Inc.
pbaker@verisign.com
781 245 6996 x227


> -----Original Message-----
> From: Andrew Layman [mailto:andrewl@microsoft.com]
> Sent: Wednesday, May 30, 2001 2:49 PM
> To: Joseph M. Reagle Jr.; Martin Duerst; cmsmcq@w3.org
> Cc: w3c-ietf-xmldsig@w3.org; www-xml-schema-comments@w3.org
> Subject: RE: Base64 -- do we really want/need line breaks every 76
> characters?
> 
> 
> There is an additional matter to consider, which is that RFC 2045 says
> that Base64 encoding may include characters not in the table below and
> that these should be ignored.  
> 
> It says:
> 
>                     Table 1: The Base64 Alphabet
> 
>      Value Encoding  Value Encoding  Value Encoding  Value Encoding
>          0 A            17 R            34 i            51 z
>          1 B            18 S            35 j            52 0
>          2 C            19 T            36 k            53 1
>          3 D            20 U            37 l            54 2
>          4 E            21 V            38 m            55 3
>          5 F            22 W            39 n            56 4
>          6 G            23 X            40 o            57 5
>          7 H            24 Y            41 p            58 6
>          8 I            25 Z            42 q            59 7
>          9 J            26 a            43 r            60 8
>         10 K            27 b            44 s            61 9
>         11 L            28 c            45 t            62 +
>         12 M            29 d            46 u            63 /
>         13 N            30 e            47 v
>         14 O            31 f            48 w         (pad) =
>         15 P            32 g            49 x
>         16 Q            33 h            50 y
> 
>    The encoded output stream must be represented in lines of no more
>    than 76 characters each.  All line breaks or other characters not
>    found in Table 1 must be ignored by decoding software.
> 
> 
> 
> 
> -----Original Message-----
> From: Joseph M. Reagle Jr. [mailto:reagle@w3.org] 
> Sent: Wednesday, May 30, 2001 9:35 AM
> To: Martin Duerst; cmsmcq@w3.org
> Cc: w3c-ietf-xmldsig@w3.org; www-xml-schema-comments@w3.org
> Subject: Re: Base64 -- do we really want/need line breaks every 76
> characters?
> 
> 
> I spoke to Michael Sperberg-McQueen about this (as co-Chair of Schema, and
> as Chair of XMLCG on the canonical definition of DTD [a]) at EuropeXML
> and he agreed to round up a response on both questions.
> 
> [a]
> http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0103.html
> 
> Michael, as discussed, can/should [b] be read as defined by SOAP?
> 
> >The SOAP 1.1 submission [2] removes the line length limitation in their
> >use of Base64; Section 5.4.3 of SOAP reads as follows:
> >
> >    The recommended representation of an opaque array of bytes is the
> >    'base64' encoding defined in XML Schemas [10][11], which uses the
> >    base64 encoding algorithm defined in 2045 [13]. However, the line
> >    length restrictions that normally apply to base64 data in MIME do
> >    not apply in SOAP. A "SOAP-ENC:base64" subtype is supplied for use
> >    with SOAP.
> 
> [b] http://www.w3.org/TR/xmlschema-2/#base64Binary
> 
> At 08:01 5/24/2001 +0900, Martin Duerst wrote:
> >After seeing all the discussion, I'm okay with long lines as such. But
> >there is still the problem that XML Schema doesn't allow that, because
> >it references RFC 2045 directly, without anything else. This is a
> >problem on both sides:
> >
> >- XML Signature cannot use the XML Schema datatype as it stands
> >   (and extension or restriction won't work here)
> >- XML Schema should consider changing their definition of
> >   Base64 to include longer lines, because it seems that that's
> >   widely used in practice. Whether that can be done as a corrigendum
> >   to Schema 1.0 or whether that has to go into Schema 1.1, I don't
> >   know.
> >
> >I have copied www-xml-schema-comments. Schema experts, please see the
> >other messages in this thread
> >(http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0183.html).
> >
> >Regards,   Martin.
> >
> >
> >At 17:31 01/05/22 -0700, Brian LaMacchia wrote:
> >>Folks--
> >>
> >>Currently, XMLDSIG references RFC 2045 (one of the MIME RFCs) for a
> >>definition of Base64 encoding/decoding.  (See section 6.8 of [1].)  It
> >>has been pointed out to me that RFC 2045 *requires* that
> >>Base64-encoded content have line breaks at least every 76 characters.
> >>Paragraph 6 reads as follows:
> >>
> >>    The encoded output stream must be represented in lines of no more
> >>    than 76 characters each.  All line breaks or other characters not
> >>    found in Table 1 must be ignored by decoding software.  In base64
> >>    data, characters other than those in Table 1, line breaks, and other
> >>    white space probably indicate a transmission error, about which a
> >>    warning message or even a message rejection might be appropriate
> >>    under some circumstances.
> >>
> >>I can't see any reason for XMLDSIG to inherit a line-length limitation
> >>that appears to have been there for mail-specific reasons.  The SOAP
> >>1.1 submission [2] removes the line length limitation in their use of
> >>Base64; Section 5.4.3 of SOAP reads as follows:
> >>
> >>    The recommended representation of an opaque array of bytes is the
> >>    'base64' encoding defined in XML Schemas [10][11], which uses the
> >>    base64 encoding algorithm defined in 2045 [13]. However, the line
> >>    length restrictions that normally apply to base64 data in MIME do
> >>    not apply in SOAP. A "SOAP-ENC:base64" subtype is supplied for use
> >>    with SOAP.
> >>
> >>I propose that XMLDSIG adopt language similar to SOAP and not require
> >>applications to insert line breaks at least every 76 characters.
> >>(Conforming implementations will still accept line-limited encodings
> >>since they have to ignore any found whitespace in the Base64 string.)
> >>
> >>                                         --bal
> >>
> >>[1] http://www.ietf.org/rfc/rfc2045.txt
> >>[2] http://www.w3.org/TR/SOAP/
> 
> 
> --
> Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
> W3C Policy Analyst                mailto:reagle@w3.org
> IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature
> W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
> 

Received on Wednesday, 30 May 2001 16:58:55 UTC