Re: Add encoding to the output of p:data (proposal)

Per the minutes of the 2 July meeting, I propose the following changes:

1. In 5.14 p:data:

   Exactly how the data is encoded depends on the media type of the resource.
   If the resource has a content type associated with it (e.g., if the
   resource was retrieved with HTTP), then that content type must be used;
   otherwise, if the user specified a content-type on the p:data, then that
   content type should be assumed. If no content type was specified or is
   associated with the resource, the inferred content type is
   implementation-dependent.

     * If the media type of the resource is an XML media type or text type
       with a charset parameter that is a Unicode character encoding (per
       [Unicode TR#17]) or is recognized as a non-XML media type whose
       contents are encoded as a sequence of Unicode characters (e.g. it has
       a charset parameter or the definition of the media type is such that
       it requires Unicode), the data must be encoded as a Unicode character
       sequence.

     * If the media type is not an appropriate text type, or if the processor
       does not recognize the media type, the content is base64-encoded.

   The resulting data is wrapped in an element with the name specified in the
   wrapper attribute (or c:data if no wrapper is specified).

   The wrapper element should have a content-type attribute which indicates
   the specified or inferred media type of the resource. If the content was
   base64-encoded, it must have an encoding attribute which specifies
   “base64”.

   If a content-type or encoding attribute is specified on a c:data wrapper,
   it must not be in a namespace; if the user-specified wrapper is not
   c:data, then the attributes must be in the http://www.w3.org/ns/xproc-step
   namespace.

   Implementations may record additional details in extension
   attributes.
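
To make the wrapper rules concrete, here is a rough sketch of what three
results might look like under this proposal. The resources, their contents,
and the ex:blob wrapper with its namespace are made up for illustration;
only the attribute names, the "base64" value, and the c: namespace come
from the text above.

```xml
<!-- Text resource: characters appear directly, so no encoding attribute. -->
<c:data xmlns:c="http://www.w3.org/ns/xproc-step"
        content-type="text/plain; charset=utf-8">Hello, world.</c:data>

<!-- Unrecognized or binary resource (here the four bytes 00 01 02 03):
     base64 content, so the encoding attribute is required. -->
<c:data xmlns:c="http://www.w3.org/ns/xproc-step"
        content-type="application/octet-stream"
        encoding="base64">AAECAw==</c:data>

<!-- Same binary resource with a user-specified wrapper (hypothetical
     ex:blob): the attributes move into the xproc-step namespace. -->
<ex:blob xmlns:ex="http://example.com/ns"
         xmlns:c="http://www.w3.org/ns/xproc-step"
         c:content-type="application/octet-stream"
         c:encoding="base64">AAECAw==</ex:blob>
```

The point of the namespaced form is that a wrapper from another vocabulary
may already define its own unqualified content-type or encoding attributes.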

And as a new last paragraph in 5.14:

   Some steps, such as p:xquery and p:validate-with-relax-ng, are
   designed to process non-XML inputs. If a base64-encoded input
   occurs in such a context, it should be decoded before processing.
   In this way, for example, an XQuery document can be read with
   p:data and passed to the p:xquery step without regard to how the
   data was encoded by p:data.
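
As a sketch of the case this paragraph has in mind (a fragment, not a
complete pipeline): the file names are hypothetical, and I am assuming the
processor does not treat application/xquery as a text type, so the query
arrives base64-encoded and p:xquery decodes it before use.

```xml
<p:xquery xmlns:p="http://www.w3.org/ns/xproc">
  <p:input port="source">
    <!-- hypothetical input document -->
    <p:document href="input.xml"/>
  </p:input>
  <p:input port="query">
    <!-- hypothetical query file; may be delivered as
         <c:data encoding="base64" ...> if the media type is not
         recognized as text -->
    <p:data href="report.xq" content-type="application/xquery"/>
  </p:input>
</p:xquery>
```

If the processor does recognize the type as text, the query arrives as
plain characters instead, and the step behaves the same either way.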

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | Formal symbolic representation of
http://nwalsh.com/            | qualitative entities is doomed to its
                              | rightful place of minor significance in
                              | a world where flowers and beautiful
                              | women abound.--Albert Einstein

Received on Thursday, 6 August 2009 12:46:55 UTC