Re: Slightly Off topic - Dublin core value

> > Q: When declaring the value of DC.format for "traditional" web pages, two
 
> > Choices: http://www.iana.org/assignments/media-types/

Traditional web pages should have an IANA text type because they should
be easily readable as plain text.  Modern web pages might more
accurately be described as application/javascript+html (an unassigned
type) because they are really JavaScript programs using HTML as an
output formatting mechanism.

A non-HTML, but MIME-aware, email program will display text-type
documents but is forced to save application-type ones to disk.  For
that reason, I'd suggest that there is a need for text/XHTML, for
documents that are pure XHTML 2.0.  (Note that best practice is to
send a text/plain alternative or, in many cases, to send text/plain
as the only version.)
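
To make the fallback behaviour above concrete, here is a minimal sketch
using Python's standard email package.  The addresses, the message text
and the handle_part() policy are illustrative assumptions, not the
behaviour of any particular mail program.

```python
from email.message import EmailMessage


def build_message() -> EmailMessage:
    """Build a message with a text/plain body and an HTML alternative."""
    msg = EmailMessage()
    msg["Subject"] = "Example"
    msg["From"] = "author@example.org"   # placeholder address
    msg["To"] = "reader@example.org"     # placeholder address
    # The plain-text version comes first, as the lowest common denominator.
    msg.set_content("Hello, plain-text readers.\n")
    # Adding an alternative turns the message into multipart/alternative.
    msg.add_alternative(
        "<html><body><p>Hello, HTML readers.</p></body></html>\n",
        subtype="html",
    )
    return msg


def handle_part(part) -> str:
    """Hypothetical policy of a non-HTML, MIME-aware mail reader:
    text/* parts are shown as text, everything else is saved to disk."""
    if part.get_content_maintype() == "text":
        return "display"
    return "save-to-disk"


def first_plain_text(msg):
    """Return the body of the first text/plain part, or None."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            return part.get_content()
    return None
```

Under this assumed policy a text/html part would still be displayed
(as raw markup), whereas an application/xhtml+xml part would only be
offered for saving.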

As this is the accessibility list, I'm not sure that we should be
encouraging formats that don't gracefully degrade to plain text
handling.

multipart is generally only used for special formats defined by the
MIME standards themselves.

> Application is most appropriate, and this is why application/xhtml+xml 
> is now the MIME type used. text/html was used up until XHTML1.0 which is 

The +xml indicates that the document is more than XHTML, which means
it is unlikely to be suitable for direct human consumption.  That is
not true of well-written traditional HTML.

> browsers that react to application/xhtml+xml by considering in a non-XML 
> XML document can be made to behave well by assigning an XSLT 
> transformation that does an identity transformation - as far as those 
> browsers are concerned it received an XML document and then used it to 
> produce an HTML document, which coincidentally looks exactly the same as 
> the XML document it received :)

But presumably they then generate an HTML DOM, so the parse tree is not
the same, in the general case, as would have been produced by an XHTML
browser.
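
As a rough analogy only, the following sketch feeds the same markup to
Python's HTML tokenizer and to its XML parser.  These standard-library
parsers merely stand in for browser parsers, and the sample document is
made up; the point is just that the XML route yields namespace-qualified
element names while the HTML route does not.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Made-up sample document, well-formed as XML.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>one<br/>two</p></body></html>"
)


class TagCollector(HTMLParser):
    """Record element names in the order the HTML tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(doc: str) -> list:
    """Element names as seen by the HTML tokenizer (no namespaces)."""
    collector = TagCollector()
    collector.feed(doc)
    collector.close()
    return collector.tags


def xml_tags(doc: str) -> list:
    """Element names as seen by the XML parser, in document order."""
    return [element.tag for element in ET.fromstring(doc).iter()]
```

Here html_tags(DOC) gives bare names such as "p", while xml_tags(DOC)
gives names of the form "{http://www.w3.org/1999/xhtml}p".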

> Mixed is not for cases like HTML were a document references other items 
> which are then mixed together in rendering, but for document types where 
> other items are included in the document itself (e.g. a word processor 
> document with images embedded into them, rather than linked to from them).

Multipart formats are generally de-interleaved, and a matrix document and
its images would be multipart/related.  multipart/mixed is the simple
mail attachment case.
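
The distinction can be sketched with Python's standard email package.
The image bytes, the "logo" content ID and the file name are
placeholders chosen for illustration.

```python
from email.message import EmailMessage

# Placeholder bytes (just the PNG signature), not a real image.
PNG_STUB = b"\x89PNG\r\n\x1a\n"


def related_message() -> EmailMessage:
    """HTML document plus an image it refers to: multipart/related."""
    msg = EmailMessage()
    msg.set_content(
        '<html><body><img src="cid:logo"></body></html>\n',
        subtype="html",
    )
    # The image is a separate body part, referenced by its Content-ID.
    msg.add_related(PNG_STUB, maintype="image", subtype="png", cid="<logo>")
    return msg


def mixed_message() -> EmailMessage:
    """Plain message with an ordinary attachment: multipart/mixed."""
    msg = EmailMessage()
    msg.set_content("See the attached image.\n")
    msg.add_attachment(
        PNG_STUB, maintype="image", subtype="png", filename="logo.png"
    )
    return msg
```

In both cases the document and the image travel as separate body parts;
what differs is whether the container declares them to be related.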

(HTML documents with images are in a funny category.  From a hypertext
point of view, I would consider the images to be a special form of 
link - and text-only browsers treat them exactly like that - but most
authors actually treat the HTML and all the images as a single compound
document (actually the whole site is often treated as one).)

Received on Friday, 2 February 2007 08:36:11 UTC