Thinking about media types in XML

Since we've started talking about using media types in XML as something
separate from and possibly more generic than just MTOM (or whatever it's
to be called), I think we should step back and consider where this fits in
architecturally. It seems to me that there are a number of approaches one
could take to the problem of associating media types with XML content,
each with different tradeoffs.

1) With a New Attribute
This is the approach taken in the PASWA proposal, and more recently in
Jacek's text. We would define a new attribute (e.g.,
x:mediaType="image/jpeg") that indicates the media type of the content
it's associated with; in instance documents, it would be associated with
an element's content, whilst in schemas it could be associated with
arbitrary content, much as xs:type is.

  * Simple, direct
  * This would require extension of XML Schema and/or WSDL. Current Schema
implementations would not work.
  * Specific to media types; if other type systems need to be used,
they'll need different attributes
  * May be harder to reuse as a datatype system in other schema languages
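
As a rough illustration of the instance-document side of this approach,
the Python sketch below picks up a media type attribute on an element and
decodes the element's content. The namespace URI, the element names, and
the assumption that typed content is base64 text are all invented for the
example; none of them come from PASWA or from Jacek's text.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical namespace for the x:mediaType attribute; not defined by any spec.
X_NS = "http://example.org/2003/media-type"

DOC = f"""\
<photo xmlns:x="{X_NS}">
  <thumbnail x:mediaType="image/jpeg">/9j/4AAQ</thumbnail>
  <caption>No media type here</caption>
</photo>"""


def typed_content(root):
    """Return (element name, media type, decoded bytes) for each child
    element that carries the hypothetical x:mediaType attribute."""
    results = []
    for child in root:
        media_type = child.get(f"{{{X_NS}}}mediaType")
        if media_type is None:
            continue  # untyped element: leave it alone
        # Assumption: typed content is base64 text, as with xs:base64Binary.
        data = base64.b64decode(child.text.strip())
        results.append((child.tag, media_type, data))
    return results


if __name__ == "__main__":
    for name, media_type, data in typed_content(ET.fromstring(DOC)):
        print(name, media_type, len(data), "bytes")
```

A consumer that doesn't know the attribute just sees an ordinary
attribute in a foreign namespace, which is part of why this approach is
simple on the instance side and only gets awkward once Schema or WSDL has
to describe it.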

2) Reusing XML Schema Types
An alternative is to reuse the XML Schema type system by mapping media
types to Schema types, thereby allowing one to specify a media type
without any changes to Schema itself (e.g., "xs:type='video:mpeg'"). I
wrote up a straw-man proposal (see attached) along these lines a while
ago.

  * Leverages existing XML Schema machinery; no change in Schema specs.
  * Able to be used as a datatype system by other schema languages as
well
  * Some media types cannot be expressed as QNames, because QName syntax
is relatively constrained (illegal characters, prohibition of numbers as
first characters, etc.)
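
To make the QName limitation concrete, here is a small Python sketch that
tries to turn a media type into a "type:subtype" QName, following the
"video:mpeg" example above. The function name and the mapping rule are
assumptions for illustration, not the straw-man proposal itself, and the
NCName check is an ASCII-only simplification of the real production.

```python
import re

# ASCII-only approximation of the XML NCName production; the real
# production also admits many non-ASCII letters, which media types never use.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def media_type_to_qname(media_type):
    """Map 'type/subtype' to the string 'type:subtype', or return None
    if either part is not a legal NCName (hypothetical mapping)."""
    top, sep, sub = media_type.partition("/")
    if not sep:
        return None
    if NCNAME.match(top) and NCNAME.match(sub):
        return f"{top}:{sub}"
    return None


if __name__ == "__main__":
    for mt in ["video/mpeg", "application/vnd.ms-excel",
               "application/xhtml+xml", "video/3gpp"]:
        print(mt, "->", media_type_to_qname(mt))
```

Under this mapping, video/mpeg and application/vnd.ms-excel go through,
while application/xhtml+xml fails on the "+" and video/3gpp fails on the
leading digit.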

3) Defining URI-Based Types
David pointed out that the problems with #2 were caused by the use of
QNames to identify types in XML Schema, and that URIs (e.g.,
"foo:type=''") would not have the same
problem. This approach also has the advantage that it could coexist with
xs:type during a transition period, and eventually stand on its own.

  * Possible to map any type that can be expressed as a URI into XML
  * Might be able to leverage CTURI work
  * Schema types are context-dependent (i.e., no namespace prefix)
  * Would require extension of XML Schema and/or WSDL; current Schema
implementations would not work.

Are there any other approaches we could add?

To me, #2 seems like a nice hack for the short term if we have to work
with the current flavour of Schema. Both #1 and #3 require a new
attribute and therefore more fundamental changes, so some thought should
be given as to which is better architecturally, and not just for Web
services.


Mark Nottingham

Received on Friday, 18 July 2003 20:18:42 UTC