Re: media types encoded as unicode in c:http-request response conversion

On Fri, Jul 18, 2008 at 6:02 AM, Norman Walsh <ndw@nwalsh.com> wrote:
> / "Alex Milowski" <alex@milowski.org> was heard to say:
> | I think this sentence in 7.1.9.4:
> |
> | "If the media type of the response is a text type with a charset parameter
> | that is a Unicode character encoding, the content of the constructed c:body
> | element is the translation of the text into a Unicode character sequence"
> |
> | should read:
> |
> | "If the media type of the response is a text type with a charset parameter
> | that is a Unicode character encoding or is recognized as a non-XML
> | media type whose contents are encoded as a sequence of Unicode characters
> | (e.g. it has a charset parameter or the definition of the media type is such
> | that it requires Unicode), the content of the constructed c:body element is the
> | translation of the text into a Unicode character sequence."
>
> I'm fine with that (in fact, I made the change :-), but can you
> clarify what "a charset parameter that is a Unicode character
> encoding" means?
>
> Do you mean something that conforms to the Unicode Character Encoding
> Model, http://unicode.org/reports/tr17/ ?

Yes.  Unicode character encodings have standard names, so we shouldn't
expect an implementation to understand arbitrary charset parameter values.
If an implementation does understand other values, I don't think we should
preclude that, as long as the result is a sequence of Unicode characters.
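
To make that concrete, here is a rough sketch of the rule as I read it, not
anything the spec mandates.  The set of names in UNICODE_CHARSETS and the
helper names charset_of and body_text are my own assumptions for
illustration; the real list of names would come from whatever reference we
cite.  Python's codec registry stands in for "other charsets an
implementation happens to understand".

```python
import codecs
from email.message import Message

# Assumption: an illustrative set of standard Unicode encoding names.
# The specification would point at a reference instead of listing these.
UNICODE_CHARSETS = {
    "utf-8", "utf-16", "utf-16be", "utf-16le",
    "utf-32", "utf-32be", "utf-32le",
}


def charset_of(content_type):
    """Return the lowercased charset parameter of a media type, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def body_text(content_type, payload):
    """Decode payload bytes to a Unicode string, or return None.

    None means the charset is missing or not understood, so the caller
    would have to fall back to some other treatment of the body.
    """
    charset = charset_of(content_type)
    if charset is None:
        return None
    if charset in UNICODE_CHARSETS:
        # The case every implementation is expected to handle.
        return payload.decode(charset)
    try:
        # Optional: other charsets this implementation happens to know.
        codecs.lookup(charset)
    except LookupError:
        return None
    return payload.decode(charset)
```

For example, body_text("text/plain; charset=UTF-8", data) gives back the
text as a Unicode string, while an unrecognized charset value gives None
rather than an error.  Either way, what ends up in c:body is a sequence of
Unicode characters.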

>
> If so, shouldn't that be referenced explicitly?

Yes.  I'd like to add a reference rather than try to enumerate the values in our
specification.

We should probably also say that UTF-8 is required to be supported.


-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics

Received on Friday, 18 July 2008 14:07:15 UTC