Re: MimeType and charset

Hi Martin,

See comments below.

From:  "Martin J. Duerst" <duerst@w3.org>
Resent-Date:  Tue, 14 Mar 2000 01:22:15 -0500 (EST)
Resent-Message-Id:  <200003140622.BAA13930@www19.w3.org>
Message-Id:  <200003140622.PAA26992@sh.w3.mag.keio.ac.jp>
X-Sender:  duerst@sh.w3.mag.keio.ac.jp
Date:  Tue, 14 Mar 2000 14:38:03 +0900
To:  w3c-ietf-xmldsig@w3.org

>Dear XMLSig WG,
>
>The W3C Internationalization WG and IG (Interest Group)
>are currently looking at the XMLSig spec to do their
>Last Call review.
>
>We have already come up with a number of issues and hope
>that we can send them to you soon.
>
>One point where we have difficulties because we don't
>know whether we understand the specification are the
>MimeType and Charset information on <Transform>.
>
>If you can answer or comment to the following questions,
>this will help us understand what the intent of these
>things is, so that we can comment on them.
>
>- There is MimeType and Charset. What about other
>  parameters in e.g. an http Content-Type header,
>  or information of similar function in other headers?

There was substantial discussion in the WG on how to handle type
information which may result from de-referencing Location (e.g., if it
is an HTTP URL) or which may be known by a Transform to be the type of
its output.  There was a suggestion that any type information from
de-referencing Location be passed to the initial Transform (if any)
and that each Transform be able to provide type information concerning
its output to the next Transform (if any) which accepted that output
as input.  However, the feeling in the WG was that this was
unnecessarily complex since it requires not just the data but also
this parallel type information to be passed along on each stage of
transformation. There was a feeling that someone signing data should
know what its type is.  Thus they should be able to specify that type,
if relevant, to the first Transform and they should similarly know the
type at each stage in the Transform pipeline (if there is one) and be
able to specify the type of input to each Transform where such type
information is needed.

An alternative Transfrom attribute to the MimeType and Charset
attributes would be one ContentType attribute.  This would be
more general but runs into encoding problems because a general
Content-Type header can be expected to include double quotes
and other special characters.  These can be escaped as entities
but it's still somewhat complex...

I do not think any HTTP headers outside of Content-Type would be
relevant.  Certainly Content-Transfer-Encoding is a mere artifact of
the channel over which the data is send, meant to be immediately
undone when the data is extracted by the receiver.
Content-Disposition dosen't seem relevant either.  In this case, the
disposition is always to get digested, possibly after transformation,
regardless of what the header says.  ...  Are you thinking of any
particular HTTP header other than Content-Type which you think would
provide data that a Transform would want to take into account?

>- Identifying/classifying the referenced object when it comes
>  into the Transform chain may be a problem and may warrant
>  such parameters, but if there is a need to carry any kind
>  of information between the transforms, shouldn't this be
>  handled as an implementation issue? So shouldn't these
>  parameters be on <Reference>, if anywhere, rather than
>  on <Transform>?

If signatures are to interoperate and a Transform can behave
differently depending on the type of its input and that is handled by
passing type information from one Transform to another, then this must
be done in all implementations in an operationally equivalent way or
they will not interoperate.

Transforms can change the type information for the data that they
transform.  So that type information can change with each stage of
processing and can not be specified just once at the Reference level.

>- What about replacing such parameters by Transforms with
>  a defined result? I don't know how this would apply to
>  MimeType, but it certainly makes sense for Charset.

This would seem to multiply Transform names almost infinitely for no
particulag gain over the orthogonal specification of the logical
Transform to perform and the CharSet or other type characteristic of
Transform's input.

>- If the resource is retrieved e.g. with HTTP, there will
>  be three places e.g. for 'charset' information:
>  - The HTTP header

The current WG intent is to discard HTTP header type info (except to
the extent it is automatically used in fragment processing).

>  - Some info inside the document
>  - The info in the Transform

A Transfrom may get type info from the MimeType and Charset attributes
and may be able to deduce type info from the document.  It's up to the
Transform how to resolve any conflicts.  We could provide default
guidance.  Note that, as has come up in a separate discussion, the XML
1.0 specification with corrections now states that external Charset
specifications should normally override the indications in an XML
document.

>  If they differ, what's the priority? Please note that
>  already for the first two, things are rather complicated,
>  and adding a third one won't make things easier.

See above.

>Looking forward to discussing these issues,
>
>Regards,   Martin.
>
>#-#-#  Martin J. Du"rst, I18N Activity Lead, World Wide Web Consortium
>#-#-#  mailto:duerst@w3.org   http://www.w3.org/People/D%C3%BCrst

Thanks,
Donald

PS: feel free to foward this to other mailing lists where you think
that is appropriate.
===================================================================
 Donald E. Eastlake 3rd                    dee3@torque.pothole.com
 65 Shindegan Hill Rd, RR#1                 lde008@dma.isg.mot.com
 Carmel, NY 10512 USA     +1 914-276-2668(h)    +1 508-261-5434(w)

Received on Wednesday, 15 March 2000 10:51:58 UTC