- From: Paul Grosso <pgrosso@arbortext.com>
- Date: Wed, 23 Mar 2005 12:39:38 -0500
- To: <public-xml-core-wg@w3.org>
PEX5 XML encoding detection in parse="text"
-------------------------------------------

The original comment is at
http://lists.w3.org/Archives/Public/www-xml-xinclude-comments/2004Dec/0006
wherein the commenter says:

http://www.w3.org/TR/2004/PR-xinclude-20040930/ states in section 4.3:

  [...]
  * if the media type of the resource is text/xml, application/xml, or matches the conventions text/*+xml or application/*+xml as described in XML Media Types [IETF RFC 3023], the encoding is recognized as specified in XML, otherwise
  [...]

It is not clear whether this also applies to other media types such as "message" or "image", e.g. for Message/Email+XML or image/svg+xml. Please clearly indicate to which types this applies.

I am concerned that future revisions of RFC 3023, or the registration of MIME types that differ from the types registered so far, might contradict the requirements of the document. For example, it has been proposed that there be no charset parameter for image/svg+xml; thus, without special knowledge of the image/svg+xml MIME type, XInclude processors would seem to be required to consider an illegal charset parameter for image/svg+xml resources, which would render them non-conforming to the image/svg+xml registration. RFC 3023 might also be revised to make it a fatal error if, e.g., an application/xml resource carries a charset parameter that differs from the encoding that would be determined by XML rules; XInclude would then seem to contradict such a requirement. Please include a discussion of how such events will be handled for XInclude.

As indicated in
http://lists.w3.org/Archives/Public/www-xml-xinclude-comments/2004Dec/0005.html
the processing of the encoding attribute seems not well-defined.
For example, HTTP/1.1 requires implementations to assume ISO-8859-1 encoding for all text/* resources without a charset parameter. This means that for all text/* resources an encoding can be determined without further processing of the content; thus, from the first definition of the encoding attribute, it would seem that the encoding attribute is ignored for all text/* types. One could, however, read this section so that this default is not considered external encoding information, and thus the encoding attribute would apply to e.g. text/plain resources. A good first step to improve the definition of the attribute would be to reference section 4.3 for the definition of the attribute rather than defining it in two places.

It is not clear how text/xml resources without a charset parameter are to be processed; the text is, again,

  [...]
  * if the media type of the resource is text/xml, application/xml, or matches the conventions text/*+xml or application/*+xml as described in XML Media Types [IETF RFC 3023], the encoding is recognized as specified in XML, otherwise
  [...]

Processing text/xml resources according to XML would mean processing the resource as if it were application/xml, which would be inconsistent with RFC 3023. Please state clearly what the actual processing requirements are and indicate clearly whether this is consistent with MIME, HTTP/1.1, and RFC 3023. Note that RFC 3023 contradicts HTTP/1.1, as described in RFC 3023. This would include providing a more precise definition of what is considered "external encoding information". Please include a strong warning that this processing can result in choosing the wrong encoding for many resources, as inline encoding information or type-specific defaults are ignored.
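
To make the two readings in the comment concrete, here is a minimal Python sketch; it is not taken from the XInclude specification or from any implementation. The function names and the `http_default_counts` flag are hypothetical. The step order is an assumption, since the quoted text elides the neighbouring bullets with "[...]". It is assumed to be: external encoding information, then the XML media type rule, then the encoding attribute, then a UTF-8 default. The media type test reads the quoted bullet literally, so only text/* and application/* top-level types qualify.

```python
import re

# Literal reading of the quoted section 4.3 bullet: text/xml, application/xml,
# text/*+xml or application/*+xml. Other top-level types do not match.
_XML_TYPE = re.compile(r"^(text|application)/(xml|[^/;\s]+\+xml)$", re.IGNORECASE)


def is_xml_media_type(media_type):
    """True if the media type (parameters ignored) matches the quoted rule."""
    essence = media_type.split(";", 1)[0].strip()
    return _XML_TYPE.match(essence) is not None


def charset_param(media_type):
    """Return the charset parameter of a Content-Type value, or None."""
    for part in media_type.split(";")[1:]:
        name, _, value = part.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"') or None
    return None


def resolve_text_encoding(media_type, encoding_attr=None, http_default_counts=False):
    """Return (deciding_step, encoding) under the ASSUMED step order.

    http_default_counts is a hypothetical switch: when True, the HTTP/1.1
    ISO-8859-1 default for text/* without charset is treated as
    "external encoding information".
    """
    essence = media_type.split(";", 1)[0].strip().lower()
    charset = charset_param(media_type)
    if charset is not None:
        return ("external", charset)
    if http_default_counts and essence.startswith("text/"):
        return ("external", "ISO-8859-1")
    if is_xml_media_type(media_type):
        # Encoding would be detected from the content by XML rules.
        return ("xml-rules", None)
    if encoding_attr is not None:
        return ("encoding-attribute", encoding_attr)
    return ("default", "UTF-8")
```

Under this sketch, `resolve_text_encoding("text/plain", "Shift_JIS")` is decided by the encoding attribute, while the same call with `http_default_counts=True` never consults it. That is the ambiguity the commenter describes. Likewise, `is_xml_media_type("image/svg+xml")` is false under the literal reading, so such a resource would fall through to the attribute or the default.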
Received on Wednesday, 23 March 2005 18:06:41 UTC