RE: encoding and charset

From: <Toman_Vojtech@emc.com>
Date: Fri, 11 Dec 2009 08:51:06 -0500
Message-ID: <997C307BEB90984EBE935699389EC41C4F3AEE@CORPUSMX70C.corp.emc.com>
To: <public-xml-processing-model-wg@w3.org>
Norm,

Regarding your question about encoding/missing charset (as discussed in
the last conference call), I also wonder whether the new multipart tests
are correct.

The tests specify the multipart bodies for making the request like this:

<c:body content-type="text/plain" encoding="utf-8" description="Some
descriptive text">Hello World</c:body>

I wonder if using utf-8 in the @encoding attribute makes any sense. As
far as I understand, c:body/@encoding controls how to decode the c:body
data *before* formulating the request, and not which encoding to use
when sending the data. Section 7.1.10.2 says:

"The encoding attribute controls the decoding of the element content for
formulating the body. A value of base64 indicates the element's content
is a base64 encoded string whose byte stream should be sent as the
message body"

So, if I understand the above correctly, I don't see how specifying
utf-8 as the encoding can work. I mean, c:body always contains a
sequence of characters, and these characters have already been decoded
(by the parser) using the encoding of the owner XML document. What would
be the meaning of:

<?xml version="1.0" encoding="iso-8859-1"?>
...
<c:body content-type="text/plain" encoding="utf-8">Hello World</c:body>
...

?
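
To make the concern concrete, here is a minimal Python sketch (not taken
from the spec or from any XProc implementation). It assumes that a
processor reads c:body through an ordinary XML parser, and the helper
names body_text and body_bytes, as well as the charset parameter, are
hypothetical. It shows that the parser has already applied the
document's iso-8859-1 declaration by the time the c:body content is
visible, so only a value such as base64 leaves anything to decode:

```python
import base64
import xml.etree.ElementTree as ET


def body_text(xml_bytes):
    """Parse the document and return the c:body content as characters.

    The parser decodes the bytes using the encoding named in the XML
    declaration, so the result is a Unicode string, not a byte stream.
    """
    return ET.fromstring(xml_bytes).text


def body_bytes(text, encoding=None, charset="utf-8"):
    """Hypothetical helper: turn c:body characters into message-body bytes.

    encoding="base64" decodes the content, as section 7.1.10.2 describes.
    Otherwise the characters still have to be serialized with some
    charset; taking it from a separate parameter is an assumption here.
    """
    if encoding == "base64":
        return base64.b64decode(text)
    return text.encode(charset)


# An iso-8859-1 document whose c:body contains a non-ASCII character.
doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?>'
    '<c:body xmlns:c="http://www.w3.org/ns/xproc-step" '
    'content-type="text/plain">caf\u00e9</c:body>'
).encode("iso-8859-1")

text = body_text(doc)                      # characters, already decoded
wire = body_bytes(text, charset="utf-8")   # bytes chosen at send time
decoded = body_bytes("SGVsbG8gV29ybGQ=", encoding="base64")
```

In this sketch, text is a plain character string regardless of the
document encoding, so there is nothing left for encoding="utf-8" to
decode; the choice of utf-8 only matters when the characters are turned
back into bytes for the request.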

Regards,
Vojtech
Received on Friday, 11 December 2009 13:51:49 GMT
