Re: encoding and charset from Norman Walsh on 2009-12-11 (public-xml-processing-model-wg@w3.org from December 2009)

From: Norman Walsh <ndw@nwalsh.com>
Date: Fri, 11 Dec 2009 18:38:28 -0500
To: public-xml-processing-model-wg@w3.org
CC: Alex Milowski <alex@milowski.org>
Message-ID: <m2fx7hqdxn.fsf@nwalsh.com>

"Toman_Vojtech@emc.com" <Toman_Vojtech@emc.com> writes:
> I also wonder whether the new multipart tests are correct.

So do I! Alex, we still need you to weigh in here with the results of
your research.

> <c:body content-type="text/plain" encoding="utf-8"
>         description="Some descriptive text">Hello World</c:body>
>
> I wonder if using utf-8 in the @encoding attribute makes any sense. As
> far as I understand, c:body/@encoding controls how to decode the c:body
> data *before* formulating the request, and not which encoding to use
> when sending the data. Section 7.1.10.2 says:
>
> "The encoding attribute controls the decoding of the element content for
> formulating the body. A value of base64 indicates the element's content
> is a base64 encoded string whose byte stream should be sent as the
> message body"
>
> So, if I understand the above correctly, I don't see how specifying
> utf-8 as the encoding can work. I mean, c:body always contains a
> sequence of characters, and these characters have already been decoded
> (by the parser) using the encoding of the owner XML document. What would
> be the meaning of:
>
> <?xml version="1.0" encoding="iso-8859-1"?>
> ...
> <c:body content-type="text/plain" encoding="utf-8">Hello World</c:body>
> ...

You're right. I've clearly got something wrong in the encoding part.
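
To make the point concrete, here is a minimal sketch in Python. It is
not taken from the spec or the test suite. The helper name body_bytes
and its charset parameter are hypothetical, and the idea that the
outgoing charset is chosen separately from @encoding is an assumption.
It only shows that the parser has already turned the document's bytes
into characters, so @encoding can only describe a content decoding such
as base64:

```python
import base64
import xml.etree.ElementTree as ET


def body_bytes(body, charset="utf-8"):
    """Hypothetical helper: the bytes that would be sent for a c:body."""
    text = body.text or ""
    if body.get("encoding") == "base64":
        # Per the quoted section: the content is a base64 string whose
        # decoded byte stream is the message body.
        return base64.b64decode(text)
    # Assumption: otherwise the content is plain characters, serialized
    # with a charset that is chosen independently of @encoding.
    return text.encode(charset)


# A document stored as iso-8859-1 bytes; the parser decodes it to characters.
plain = ET.fromstring(
    (
        '<?xml version="1.0" encoding="iso-8859-1"?>'
        '<c:body xmlns:c="http://www.w3.org/ns/xproc-step"'
        ' content-type="text/plain">caf\u00e9</c:body>'
    ).encode("iso-8859-1")
)

# The same element shape, but with base64 content.
encoded = ET.fromstring(
    '<c:body xmlns:c="http://www.w3.org/ns/xproc-step"'
    ' content-type="text/plain" encoding="base64">SGVsbG8gV29ybGQ=</c:body>'
)

print(repr(plain.text))       # characters, no longer iso-8859-1 bytes
print(body_bytes(plain))      # serialized as utf-8 by default
print(body_bytes(encoded))    # base64 decoded to the raw byte stream
```

In this sketch the document's own encoding declaration has no effect on
what body_bytes returns, which is the reason encoding="utf-8" on c:body
has nothing left to decode.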

And just FYI: XML Calabash doesn't pass most of these tests, so it's safe
to assume that I've made other mistakes.

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | Everything the same; everything
http://nwalsh.com/            | distinct.

Received on Friday, 11 December 2009 23:39:15 UTC