ISSUE: MIME boundaries not checked during XOP serialisation from Alex Danilo on 2004-06-23 (xmlp-comments@w3.org from June 2004)

From: Alex Danilo <alex@research.canon.com.au>
Date: Wed, 23 Jun 2004 10:56:36 +1000
To: xmlp-comments@w3.org
Message-Id: <20040623005636.A5229569F@ivory.research.canon.com.au>

ISSUE:

The creation of an XOP package, as described in section 3.1 and
illustrated in Example 1.2, converts the original base64-encoded
data into raw binary octets.

It is entirely likely that a large enough sample of binary data
will contain the same sequence of octets that marks a MIME
boundary for the output package.

Such aliasing of MIME boundary octets will result in a broken
XOP package that cannot be decoded back into the original XML
infoset.
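
A minimal sketch of the aliasing problem, assuming a receiver that
splits the package body purely on the MIME delimiter. The helper
names (build_body, split_on_boundary) and the boundary "B" are
hypothetical and chosen only for illustration; they do not come
from the XOP draft.

```python
def build_body(payloads: list[bytes], boundary: bytes) -> bytes:
    """Concatenate raw payloads into a multipart-style body (headers omitted)."""
    delimiter = b"\r\n--" + boundary
    body = b""
    for payload in payloads:
        # Each part: delimiter, empty header block, then the raw octets.
        body += delimiter + b"\r\n\r\n" + payload
    # Closing delimiter ends the package.
    return body + delimiter + b"--\r\n"


def split_on_boundary(body: bytes, boundary: bytes) -> list[bytes]:
    """Split on the delimiter only, dropping the preamble and the closing piece."""
    delimiter = b"\r\n--" + boundary
    return body.split(delimiter)[1:-1]


boundary = b"B"
clean = [b"<xml/>", b"\x00\xff"]
aliased = [b"<xml/>", b"\x00\r\n--B\xff"]  # binary octets that contain the delimiter

clean_parts = split_on_boundary(build_body(clean, boundary), boundary)
aliased_parts = split_on_boundary(build_body(aliased, boundary), boundary)
```

With the clean payloads the receiver recovers two parts; with the
aliased payload it sees three, so the binary part is cut short and
the original infoset cannot be rebuilt.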

Proposed solutions:
1) Scan the serialised binary data for the chosen MIME
   boundary separator and define an escaping mechanism to
   recode any binary data that contains it.
2) Scan the serialised binary data for the chosen MIME
   boundary separator and, on detection, choose a different
   MIME boundary string and rescan all binary attachments,
   repeating until no aliasing is detected.
3) Mandate use of the 'Content-Length:' header field in the
   binary part of the XOP package.
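
For comparison, option (2) could be sketched roughly as follows.
This is an illustration only: choose_boundary and make_candidate
are hypothetical names, and the random candidate generator is an
assumption, not something the XOP draft specifies. Note that every
payload must be held or re-read for each rescan.

```python
import secrets
from typing import Callable


def random_candidate() -> bytes:
    """Assumed candidate generator: 32 hex characters of random data."""
    return secrets.token_hex(16).encode("ascii")


def choose_boundary(
    payloads: list[bytes],
    make_candidate: Callable[[], bytes] = random_candidate,
) -> bytes:
    """Keep picking boundary strings until none of the payloads contains one."""
    while True:
        boundary = make_candidate()
        # Check conservatively for "--" + boundary anywhere in the payloads.
        marker = b"--" + boundary
        if not any(marker in payload for payload in payloads):
            return boundary


# Usage: the first candidate aliases with the payload, so the second is chosen.
candidates = iter([b"abc", b"xyz"])
chosen = choose_boundary([b"12--abc34"], lambda: next(candidates))
```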

Proposed resolution:
(3) - Mandate use of 'Content-Length:' in any binary encoded
parts of the XOP package.

Reasoning:
Content-Length is defined in an RFC, and parsing its decimal
value to read the required number of octets of binary data is
trivial.  Detecting the MIME separator that follows the declared
number of octets is similarly trivial.

Also, CIP4 use this encoding scheme in JDF aggregates for
sending print jobs to typesetters and the like; that is, it is
already used successfully in industry.

(1) is rejected as cumbersome to implement.  (2) is rejected
as requiring excessive buffering and being inefficient in
implementation.

Alex Danilo (SWG-WG member)

Received on Tuesday, 22 June 2004 20:57:18 UTC