Re: MimeType and charset

Hello Donald,

Many thanks for your detailled answer. I have forwared
your mail to the W3C I18N IG. Some comments also below.

At 10:52 00/03/15 -0500, Donald E. Eastlake 3rd wrote:
> Hi Martin,
> 
> See comments below.
> 
> From:  "Martin J. Duerst" <duerst@w3.org>
> Resent-Date:  Tue, 14 Mar 2000 01:22:15 -0500 (EST)
> Resent-Message-Id:  <200003140622.BAA13930@www19.w3.org>
> Message-Id:  <200003140622.PAA26992@sh.w3.mag.keio.ac.jp>
> X-Sender:  duerst@sh.w3.mag.keio.ac.jp
> Date:  Tue, 14 Mar 2000 14:38:03 +0900
> To:  w3c-ietf-xmldsig@w3.org
> 
> >Dear XMLSig WG,
> >
> >The W3C Internationalization WG and IG (Interest Group)
> >are currently looking at the XMLSig spec to do their
> >Last Call review.
> >
> >We have already come up with a number of issues and hope
> >that we can send them to you soon.
> >
> >One point where we have difficulties because we don't
> >know whether we understand the specification are the
> >MimeType and Charset information on <Transform>.
> >
> >If you can answer or comment to the following questions,
> >this will help us understand what the intent of these
> >things is, so that we can comment on them.
> >
> >- There is MimeType and Charset. What about other
> >  parameters in e.g. an http Content-Type header,
> >  or information of similar function in other headers?
> 
> There was substantial discussion in the WG on how to handle type
> information which may result from de-referencing Location (e.g., if it
> is an HTTP URL) or which may be known by a Transform to be the type of
> its output.  There was a suggestion that any type information from
> de-referencing Location be passed to the initial Transform (if any)
> and that each Transform be able to provide type information concerning
> its output to the next Transform (if any) which accepted that output
> as input.  However, the feeling in the WG was that this was
> unnecessarily complex since it requires not just the data but also
> this parallel type information to be passed along on each stage of
> transformation. There was a feeling that someone signing data should
> know what its type is.  Thus they should be able to specify that type,
> if relevant, to the first Transform and they should similarly know the
> type at each stage in the Transform pipeline (if there is one) and be
> able to specify the type of input to each Transform where such type
> information is needed.

Well, yes, but this means a lot of work, doesn't it?
And a lot of chances to get it wrong. It looks like
the group is caring more about implementation than
users.


> An alternative Transfrom attribute to the MimeType and Charset
> attributes would be one ContentType attribute.  This would be
> more general but runs into encoding problems because a general
> Content-Type header can be expected to include double quotes
> and other special characters.  These can be escaped as entities
> but it's still somewhat complex...

I see.

> I do not think any HTTP headers outside of Content-Type would be
> relevant.  Certainly Content-Transfer-Encoding is a mere artifact of
> the channel over which the data is send, meant to be immediately
> undone when the data is extracted by the receiver.
> Content-Disposition dosen't seem relevant either.  In this case, the
> disposition is always to get digested, possibly after transformation,
> regardless of what the header says.  ...  Are you thinking of any
> particular HTTP header other than Content-Type which you think would
> provide data that a Transform would want to take into account?

Well, there is Content-Language. Not that a transform would have
to take it into account, but it is relevant because of content
negotiation. This is a different problem; I just want to mention
it here; the XMLSig draft doesn't seem to address it.

Also, I'm sure there are other parameters to Content-Type than
'charset' that are relevant, a search through the IANA archives
will easily bring some up.


> >- Identifying/classifying the referenced object when it comes
> >  into the Transform chain may be a problem and may warrant
> >  such parameters, but if there is a need to carry any kind
> >  of information between the transforms, shouldn't this be
> >  handled as an implementation issue? So shouldn't these
> >  parameters be on <Reference>, if anywhere, rather than
> >  on <Transform>?
> 
> If signatures are to interoperate and a Transform can behave
> differently depending on the type of its input and that is handled by
> passing type information from one Transform to another, then this must
> be done in all implementations in an operationally equivalent way or
> they will not interoperate.
> 
> Transforms can change the type information for the data that they
> transform.  So that type information can change with each stage of
> processing and can not be specified just once at the Reference level.
> 
> >- What about replacing such parameters by Transforms with
> >  a defined result? I don't know how this would apply to
> >  MimeType, but it certainly makes sense for Charset.
> 
> This would seem to multiply Transform names almost infinitely for no
> particulag gain over the orthogonal specification of the logical
> Transform to perform and the CharSet or other type characteristic of
> Transform's input.

No, it doesn't multiply transform names. You just need one transform
with two parameters, the character encoding of the input and the
character encoding of the output. (or even just one, the character
encoding of the input, and have the output be UTF-8).

> >- If the resource is retrieved e.g. with HTTP, there will
> >  be three places e.g. for 'charset' information:
> >  - The HTTP header
> 
> The current WG intent is to discard HTTP header type info (except to
> the extent it is automatically used in fragment processing).

This is definitely a bad idea. We are telling everybody to get
their HTTP headers straight, and you just ignore it.


> >  - Some info inside the document
> >  - The info in the Transform
> 
> A Transfrom may get type info from the MimeType and Charset attributes
> and may be able to deduce type info from the document.  It's up to the
> Transform how to resolve any conflicts.

Did you say that you want to have transforms defined in an interoperable
way? If you want that, don't leave it to transforms.

> We could provide default
> guidance.  Note that, as has come up in a separate discussion, the XML
> 1.0 specification with corrections now states that external Charset
> specifications should normally override the indications in an XML
> document.

Well, yes. But if you look at the details, they are not all that
easy. And please note that the XML spec not only says 'external
Charset information' but uses words such as 'delivered with'
or 'acompagnied by', which clearly applies to Mime headers,
but not some info far away somewhere in a different place
such as a signature.


Regards,   Martin.





#-#-#  Martin J. Du"rst, I18N Activity Lead, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org/People/D%C3%BCrst

Received on Wednesday, 15 March 2000 21:39:11 UTC