- From: Alex Milowski <alex@milowski.org>
- Date: Thu, 17 Jun 2010 16:15:43 +0100
- To: public-xml-processing-model-wg@w3.org
On Thu, Jun 17, 2010 at 4:07 PM, Henry S. Thompson <ht@inf.ed.ac.uk> wrote:
> 2) When addressed via <p:document href="..."/> or <p:load
> href="..."/>? Hmmm. We sort of blew that, in that the spec. is
> silent as to how the Content-Type header plays wrt the
> requirement that the retrieved representation be "a well-formed XML
> document" [1]. What if the Content-Type were image/jpeg? Should
> you go ahead and try to parse it as XML anyway?
>
> Assuming the answer is 'yes', then I think the situation is clear
> -- RFC3023 [2] says explicitly that in the case of text/xml, if
> there is no Charset, then you _must_ assume US-ASCII:
>
> "This example shows text/xml with the charset parameter omitted.
> In this case, MIME and XML processors MUST assume the charset is
> "us-ascii", the default charset value for text media types
> specified in [RFC2046]. The default of "us-ascii" holds even if
> the text/xml entity is transported using HTTP.
...and then it says:

(Note: There is an inconsistency between this specification and
HTTP/1.1, which uses ISO-8859-1[ISO8859] as the default for a
historical reason. Since XML is a new format, a new default should be
chosen for better I18N. US-ASCII was chosen, since it is the
intersection of UTF-8 and ISO-8859-1 and since it is already used by
MIME.)
So, they arm wrestle?
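
To make the disagreement concrete, here is a rough Python sketch of the
two defaults for a text/xml response that carries no charset parameter.
The function name, the `follow` switch and the constant names are made
up for illustration; nothing here comes from the XProc spec or from any
implementation, and media types other than text/xml are deliberately
left undecided.

```python
from email.message import Message
from typing import Optional

# The two competing defaults quoted above (constant names are illustrative).
RFC3023_TEXT_XML_DEFAULT = "us-ascii"
HTTP11_TEXT_DEFAULT = "iso-8859-1"


def effective_charset(content_type: str, follow: str = "rfc3023") -> Optional[str]:
    """Pick a charset for a Content-Type header value.

    `follow` is a hypothetical switch: "rfc3023" applies the RFC 3023
    default for text/xml, anything else applies the HTTP/1.1 default.
    Returns None when this sketch has no opinion (any other media type
    without an explicit charset).
    """
    msg = Message()
    msg["Content-Type"] = content_type

    # An explicit charset parameter always wins; the value is lowercased.
    charset = msg.get_content_charset()
    if charset is not None:
        return charset

    if msg.get_content_type() == "text/xml":
        if follow == "rfc3023":
            return RFC3023_TEXT_XML_DEFAULT
        return HTTP11_TEXT_DEFAULT

    # e.g. image/jpeg: whether to attempt an XML parse at all is the
    # open question in the quoted message, so nothing is decided here.
    return None


if __name__ == "__main__":
    for header in ("text/xml", 'text/xml; charset="UTF-8"', "image/jpeg"):
        print(header, "->", effective_charset(header),
              "|", effective_charset(header, follow="http11"))
```

With no charset parameter the two readings give different answers for
the same header, which is the whole problem.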
--
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."
Bertrand Russell in a footnote of Principles of Mathematics
Received on Thursday, 17 June 2010 15:16:23 UTC