- From: Alex Milowski <alex@milowski.org>
- Date: Thu, 17 Jun 2010 16:15:43 +0100
- To: public-xml-processing-model-wg@w3.org
On Thu, Jun 17, 2010 at 4:07 PM, Henry S. Thompson <ht@inf.ed.ac.uk> wrote:
>
> 2) When addressed via <p:document href="..."/> or <p:load
> href="..."/>? Hmmm. We sort of blew that, in that the spec. is
> silent as to how the Content-Type header plays wrt the
> requirement that the retrieved representation be "a well-formed XML
> document" [1]. What if the Content-Type were image/jpeg? Should
> you go ahead and try to parse it as XML anyway?
>
> Assuming the answer is 'yes', then I think the situation is clear
> -- RFC3023 [2] says explicitly that in the case of text/xml, if
> there is no Charset, then you _must_ assume US-ASCII:
>
> "This example shows text/xml with the charset parameter omitted.
> In this case, MIME and XML processors MUST assume the charset is
> "us-ascii", the default charset value for text media types
> specified in [RFC2046]. The default of "us-ascii" holds even if
> the text/xml entity is transported using HTTP.

...and then it says:

(Note: There is an inconsistency between this specification and
HTTP/1.1, which uses ISO-8859-1[ISO8859] as the default for a
historical reason. Since XML is a new format, a new default should be
chosen for better I18N. US-ASCII was chosen, since it is the
intersection of UTF-8 and ISO-8859-1 and since it is already used by
MIME.)

So, they arm wrestle?

--
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
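
To make the disagreement above concrete, here is a minimal sketch of how the two defaults diverge for the same header. It is an illustration only, not anything the XProc spec or either RFC prescribes as code: the function name `default_charset` and the rule labels "rfc3023" and "http11" are hypothetical, and it only models the case of a missing charset parameter.

```python
from email.message import Message


def default_charset(content_type, rule):
    """Return the charset implied by a Content-Type header value.

    `rule` is a hypothetical label: "rfc3023" or "http11".
    Returns None when neither rule supplies a default.
    """
    msg = Message()
    msg["Content-Type"] = content_type

    # An explicit charset parameter wins under both rules.
    explicit = msg.get_param("charset")
    if isinstance(explicit, str) and explicit:
        return explicit.lower()

    # Both defaults discussed here apply only to text/* media types.
    if msg.get_content_maintype() != "text":
        return None

    # RFC 3023: text/xml with no charset MUST be treated as us-ascii.
    if rule == "rfc3023" and msg.get_content_type() == "text/xml":
        return "us-ascii"

    # HTTP/1.1: text/* with no charset defaults to ISO-8859-1.
    if rule == "http11":
        return "iso-8859-1"

    return None


for rule in ("rfc3023", "http11"):
    print(rule, default_charset("text/xml", rule))
```

The same `text/xml` header yields a different answer depending on which rule is applied, which is the arm wrestle in question. For `image/jpeg` the sketch returns None, leaving open the separate question of whether to attempt an XML parse at all.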
Received on Thursday, 17 June 2010 15:16:23 UTC