Content negotiation issues (was XInclude)

[I sent this yesterday morning but it doesn't appear to have made it to
the list.  Apologies to anyone who receives duplicates.]

Murata Makoto writes:
>1) XInclude ignores the media type (and probably the charset
>   parameter) associated with resources
>
>When parse = "xml" is specified, XInclude always assumes that the
>media type is text/xml and the fragment id is an XPointer.  When parse
>= "text" is specified, XInclude always uses text/plain in interpreting
>fragment identifiers.  It is unclear whether or not XInclude respects
>the charset parameter of the original media type.  I would thus argue
>that XInclude conflicts with media type RFCs and architectural
>principles of W3C (2.4. Fragment identifiers of "Architecture of the 
>World Wide Web", Draft 15 November 2002).

There seems to be a general issue with XML specifications and
content-negotiation.  While the HTML object element includes a type
attribute that permits such negotiation [1], as do some other aspects of
HTML, SMIL, and SVG, most of the XML-based specifications - including
XInclude and XLink - seem to take a "we provide the URI reference, you
do with it as you like" approach.

The Web Architecture document seems to take a somewhat more conservative
approach, with its "consistent representations" [2] and particularly
"Coneg with fragments" [3] principles.  The title "Consistent
representations" might be better phrased "Consistently represent the
same resource"; on its face it sounds like "representations should be
consistent", which is very different.  "Coneg with fragments" is
less-developed but seems likely to conflict quickly with pretty much all
of the more advanced features proposed by the various XPointer
specifications whenever content negotiation is possible - which I
suspect many Web architects would agree should be "all of the time".

While these principles don't explicitly reject the more explicit
specification approach of HTML etc., the combination of their (generally
worthwhile) conservatism and the lack of specification of any explicit
mechanisms in the XML specifications to handle content-negotiation
should effectively provide a straitjacket for conscientious XML
developers.

Developers who want to process XInclude generically should not have to
understand the surrounding vocabulary to any great semantic depth, but
they may well not be able to process the XIncludes without basic context
for content-negotiation.   Just "xml" is not enough.  
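
To make the gap concrete, here is a rough Python sketch of what a
generic processor can learn from an include element on its own.  The
sample document and its example.org URIs are invented for illustration,
and the namespace URI is the one I understand the XInclude drafts to
use.  All the processor gets back is a URI reference and a coarse
"xml" or "text" flag, with nothing to say which media type to request.

```python
import xml.etree.ElementTree as ET

# XInclude namespace (assumed to match the draft in question).
XI_NS = "http://www.w3.org/2001/XInclude"

# Hypothetical SVG document containing two includes.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="http://example.org/parts/assembly"/>
  <xi:include href="http://example.org/parts/notes" parse="text"/>
</svg>"""


def include_hints(document):
    """Return (href, parse) for each include element in the document."""
    root = ET.fromstring(document)
    hints = []
    for element in root.iter("{%s}include" % XI_NS):
        # parse falls back to "xml" when the attribute is absent.
        hints.append((element.get("href"), element.get("parse", "xml")))
    return hints


if __name__ == "__main__":
    for href, parse in include_hints(SAMPLE):
        print(href, parse)
```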

For example, application/xml or text/xml may not be the representation
to which an XInclude in an SVG document was actually pointing, and
perhaps quite reasonably.  It's not very difficult to imagine an XML
description of a parts assembly sharing a URI with an SVG rendition of
that parts assembly, and knowing whether to ask for application/xml or
image/svg+xml can make a substantial difference in the final results.
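
A minimal Python sketch of that scenario follows.  The URI, the two
representation bodies, and the helper names are all hypothetical, and
the Accept parsing is deliberately simplified (it honours q values but
ignores wildcards such as */*).  It is only meant to show how the same
URI can yield different results depending on the media type requested.

```python
from urllib.request import Request

# Hypothetical URI shared by both representations of the parts assembly.
ASSEMBLY_URI = "http://example.org/parts/assembly"

# Hypothetical server-side table: media type -> representation body.
REPRESENTATIONS = {
    "application/xml": "<assembly><part id='p1'/></assembly>",
    "image/svg+xml": "<svg xmlns='http://www.w3.org/2000/svg'/>",
}


def build_request(uri, media_type):
    """Build (but do not send) a GET request asking for one media type."""
    return Request(uri, headers={"Accept": media_type})


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((parts[0].lower(), q))
    # sorted() is stable, so ties keep the order the client listed them in.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(accept_header, available):
    """Pick the first acceptable media type that is available, or None."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in available:
            return media_type
    return None


if __name__ == "__main__":
    for wanted in ("application/xml", "image/svg+xml"):
        request = build_request(ASSEMBLY_URI, wanted)
        chosen = negotiate(request.get_header("Accept"), REPRESENTATIONS)
        print(wanted, "->", REPRESENTATIONS[chosen])
```

An including processor that always sends the first Accept value would
never see the SVG rendition, even if that was what the author of the
include had in mind.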

It is important to note that this is particularly a problem for cases
(like XInclude) where the semantics of embedding make it much harder to
punt unexpected content to a different handler completely.  Outward
pointers tend to have fewer problems with this set of issues.

Content negotiation is one of the most powerful features made possible
by the Web's separation of identifier from resource but also one which
seems to be given short shrift by XML specifications.  I would strongly
encourage the TAG to at least use content-negotiation as an important
lens for evaluating compatibility with the Web Architecture and perhaps
spend more time thinking about content negotiation as a central pillar
of the Web rather than as a problem-causing corner case.


[1] - http://www.w3.org/TR/xhtml2/mod-object.html#sec_14.1.3. and
http://www.w3.org/TR/html401/struct/objects.html

[2] - http://www.w3.org/2001/tag/2002/webarch-20021206#pr-rep-ambiguity

[3] - http://www.w3.org/2001/tag/2002/webarch-20021206#http-coneg-frag

-- 
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com -- http://monasticxml.org

Received on Monday, 30 December 2002 09:29:46 UTC