- From: Tim Bray <tbray@textuality.com>
- Date: Mon, 21 Jan 2002 11:27:21 -0800
- To: www-tag@w3.org
At the TAG telecon this morning, I agreed to do a write-up on some of the issues around [nsMediaType-3], including core issues and problematic corner cases.

I believe the TAG probably has consensus on these principles:

P1. Media types are an important part of the web architecture; dispatching on them, when possible, is efficient and robust and well-understood.

P2. When processing XML resources, dispatching to software modules on the basis of namespaces is desirable and correct behavior.

P3. Clearly in many cases context matters: you can't in the general case reach into the middle of a resource and safely process some element based only on its namespace.

P4. The namespace of the root element of an XML resource has a special status, if only because it provides the outermost level of context.

Agreeing on all this doesn't make the problems go away. Here are some that arise - maybe they're corner/pathological cases that can be overlooked, but they should be considered:

C1. As demonstrated by the example of XSLT on this list, the namespace of a root element can be misleading. It has been suggested that the same problem is likely to show up in XQuery.

C2. Namespace processing obviously becomes more relevant in the case where the resource is served as text/xml or application/xml. There is currently no consensus as to whether or when it's desirable to serve resources with either of these media types.

C3. The issue has been raised of whether MIME headers or media types are useful in signaling the makeup of XML resources which contain markup from multiple namespaces. There's no consensus on this issue.

C4. There is the possibility of inconsistency between the media type and what the namespace says. This is a specific case of a more general problem of what happens when there's an inconsistency between any of the MIME headers and anything about the document content.
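To make P1, P2, P4, C2 and C4 concrete, here is a minimal sketch in Python of one way a receiver might combine the two signals. It is an illustration, not a proposal: the function names, the return convention and the small table pairing media types with expected root namespaces are all assumptions made for this example. It dispatches on the media type when that is specific, falls back to the root element's namespace only for the generic XML types, and reports a disagreement rather than silently overriding the MIME header.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: specific media types and the root namespace one
# would expect to find in a resource served with that type.
EXPECTED_ROOT_NS = {
    "application/xhtml+xml": "http://www.w3.org/1999/xhtml",
    "image/svg+xml": "http://www.w3.org/2000/svg",
}

# Generic XML types, where the header says nothing about the vocabulary (C2).
GENERIC_XML = {"text/xml", "application/xml"}


def root_namespace(body):
    """Return the namespace URI of the root element, or None if unqualified."""
    tag = ET.fromstring(body).tag  # ElementTree spells it "{uri}local"
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def choose_dispatch(media_type, body):
    """Return (basis, key), where basis names the signal that was used.

    "media-type": the header is specific and not contradicted (P1)
    "namespace":  the header is generic, so use the root namespace (P2, P4)
    "conflict":   header and root namespace disagree (C4); the caller decides
    """
    ns = root_namespace(body)
    if media_type in GENERIC_XML:
        return ("namespace", ns)
    expected = EXPECTED_ROOT_NS.get(media_type)
    if expected is not None and ns != expected:
        return ("conflict", ns)
    return ("media-type", media_type)
```

Note that this sketch does nothing about C1 or P3: it trusts the root namespace whenever the media type is generic, and it never looks below the root element.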
Here are some examples that illustrate both the general and specific problem:

- simple obvious inconsistency, e.g. a server sends a resource with media type text/xhtml+xml, but the root element has a namespace declaration saying it's SVG

- a slight variation where the resource in the SVG namespace is sent with a media type of application/xml.

- certain browsers have been known to sniff into resource content and decide to render as HTML [or not] based on whether there's an internal subset, or whether the first few hundred bytes have tags that "look like" HTML.

- there is the whole issue of the charset header. This has spawned huge volumes of debate that I won't reproduce here - the basic problem comes from the fact that a conformant XML processor can with very high probability determine the correct encoding of a resource by reading it. What then if the server (a) sends an incorrect charset header, or (b) transcodes the resource so that the XML self-description is wrong (allowed for text/* resources)? Case (b) is particularly nasty when the XML processor uses the charset parameter to read the doc, but then breaks it by saving it in its non-self-describing form.

It should be pointed out that IETF considers (correctly) that there are security issues raised whenever a software module steps outside the bounds set by the MIME headers.

Cheers, Tim
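The charset case can be sketched in the same spirit. The Python below is a rough illustration only, and its function names are invented for this example. It reads the encoding a document claims for itself (byte-order mark first, then the encoding pseudo-attribute of the XML declaration, else the UTF-8 default) and compares it with the charset parameter from the header. It assumes an ASCII-compatible encoding when there is no byte-order mark and does not attempt the full autodetection procedure described in the XML Recommendation.

```python
import codecs
import re

# Byte-order marks checked before looking for an XML declaration.
_BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16"),
    (b"\xfe\xff", "utf-16"),
]

# Encoding pseudo-attribute of an XML declaration at the very start of the
# document. Only usable when the bytes are ASCII-compatible.
_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def _normalize(name):
    """Map aliases such as 'latin1' and 'ISO-8859-1' to one spelling."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return name.lower()


def self_described_encoding(body):
    """Best-effort guess at the encoding the document declares for itself."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return _normalize(name)
    match = _DECL.match(body)
    if match:
        return _normalize(match.group(1).decode("ascii"))
    return _normalize("utf-8")  # XML default without BOM or declaration


def charset_conflict(header_charset, body):
    """True if the charset parameter contradicts the XML self-description."""
    if header_charset is None:
        return False  # nothing in the header to disagree with
    return _normalize(header_charset) != self_described_encoding(body)
```

A check like this only detects the disagreement. Which side should win when the two differ is exactly the question the charset debate has not settled.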
Received on Monday, 21 January 2002 14:27:27 UTC