Does p:data preserve charset value for text content types? from Toman_Vojtech@emc.com on 2009-07-17 (xproc-dev@w3.org from July 2009)

From: <Toman_Vojtech@emc.com>
Date: Fri, 17 Jul 2009 03:53:50 -0400
To: <xproc-dev@w3.org>
Message-ID: <6E216CCE0679B5489A61125D0EFEC78710367CA4@CORPUSMX10A.corp.emc.com>

Hi,

When answering the "charset with unescape-markup" question, I came
across the following issue with p:data. What is the expected result of
the following:

<p:data href="file.txt" content-type="text/plain;charset=windows-1252"/>

Is it (1):

<c:result content-type="text/plain;charset=windows-1252">...</c:result>

Or (2):

<c:result content-type="text/plain">...</c:result>

I am thinking that perhaps (2) (charset information removed) is more
correct *for text types* because the text has been converted to a
sequence of Unicode characters and is not in windows-1252 any more. 
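
To make the reasoning behind (2) concrete, here is a minimal Python sketch (not XProc, and not taken from any processor or from the spec). The helper names split_content_type and load_text are hypothetical, and the UTF-8 fallback when no charset is given is an assumption made only for this illustration. It decodes the bytes using the declared charset and then reports a content type without the charset parameter, since the decoded text is no longer in that encoding:

```python
def split_content_type(content_type):
    """Split 'type/subtype;k=v;...' into (media_type, params dict)."""
    parts = [p.strip() for p in content_type.split(";")]
    media_type = parts[0]
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip().lower()] = value.strip().strip('"')
    return media_type, params


def load_text(raw, content_type):
    """Decode raw bytes per the charset parameter; drop charset afterwards.

    Hypothetical helper: mimics interpretation (2), where the resulting
    content type no longer carries the charset.
    """
    media_type, params = split_content_type(content_type)
    # Assumption: fall back to UTF-8 when no charset parameter is present.
    charset = params.pop("charset", "utf-8")
    text = raw.decode(charset)  # now a sequence of Unicode characters
    remaining = "".join(f";{k}={v}" for k, v in params.items())
    return text, media_type + remaining


# Bytes as they would be stored in a windows-1252 encoded file.txt
raw = b"caf\xe9 \x80"
text, result_type = load_text(raw, "text/plain;charset=windows-1252")
```

Under interpretation (1), load_text would simply return the original content_type string unchanged alongside the decoded text.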

If you look at the test data-002 in the XProc test suite, it expects the
charset to be there, but I wonder whether that is really correct.

But perhaps this is not a problem at all: for text types the result is
not base64-encoded, so the charset information will always be ignored
(or will it?). Plus, you may want to present the original charset in the
content type.

Regards,
Vojtech

Received on Friday, 17 July 2009 07:55:38 UTC