
Re: Charsets, encodings, http-request, unescape-markup, and convenience, oh my!

From: Norman Walsh <ndw@nwalsh.com>
Date: Mon, 10 Oct 2011 08:57:27 -0400
To: public-xml-processing-model-comments@w3.org
Message-ID: <m2listovgo.fsf@nwalsh.com>
"vojtech.toman@emc.com" <vojtech.toman@emc.com> writes:
>> 2. If the charset parameter isn't known/specified, we default to...ISO
>>    Latin 1, or whatever the Internet tells us the default is for text/*
>>    documents that don't specify a charset.
> I think so. You already get this behavior when you read text data,
> except that the applied default charset is not available anywhere in
> the constructed c:body.

I was doing some "totally off the reservation" hacking this morning
and I think we have to be a little more careful about the wording.
Consider application/json, for example: even if the charset isn't
specified, the charset is always UTF-8.

I still think we should allow implementations to guess/know/infer the
encoding if it isn't specified, but we have to be a little careful.
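
To make that concrete, here is a rough sketch of the kind of lookup I
have in mind. It is only an illustration: the function name, the
defaults table, and the returned "was it declared" flag are all my own
assumptions, not anything the spec says today. It consults the charset
parameter first, then a per-media-type default, and only then falls
back to ISO Latin 1 for text/* types.

```python
from email.message import Message

# Hypothetical table of media types whose encoding is implied even when
# no charset parameter is present. Only application/json is listed here.
DEFAULT_CHARSETS = {
    "application/json": "utf-8",
}

# Assumed fallback for text/* documents that don't specify a charset.
TEXT_FALLBACK = "iso-8859-1"


def effective_charset(content_type):
    """Return (charset, declared) for a Content-Type header value.

    `declared` is True only when the charset came from the header itself,
    so a caller can tell a stated encoding from an inferred one.
    """
    msg = Message()
    msg["Content-Type"] = content_type

    declared = msg.get_content_charset()  # lowercased, or None if absent
    if declared:
        return declared, True

    media_type = msg.get_content_type()   # e.g. "application/json"
    if media_type in DEFAULT_CHARSETS:
        return DEFAULT_CHARSETS[media_type], False
    if media_type.startswith("text/"):
        return TEXT_FALLBACK, False

    # Nothing known: leave the guess to the implementation.
    return None, False
```

Returning the flag alongside the charset is one way to address the
earlier point that the applied default is not visible anywhere in the
constructed c:body.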

                                        Be seeing you,

Norman Walsh
Lead Engineer
MarkLogic Corporation
Phone: +1 413 624 6676

Received on Monday, 10 October 2011 12:58:04 UTC
