Re: let authors choose text/html or application/xhtml+xml (detailed review of section 1. Introduction) from Robert Burns on 2007-08-31 (public-html@w3.org from August 2007)

From: Robert Burns <rob@robburns.com>
Date: Fri, 31 Aug 2007 12:59:33 -0500
To: Roy T.Fielding <fielding@gbiv.com>
Cc: Dean Edridge <dean@55.co.nz>, Dan Connolly <connolly@w3.org>, "public-html@w3.org WG" <public-html@w3.org>
Message-Id: <045760C9-1E75-4275-B6B7-741A5C1111C3@robburns.com>

Hi Roy,

On Aug 31, 2007, at 12:31 PM, Roy T. Fielding wrote:

>
> On Aug 31, 2007, at 8:01 AM, Robert Burns wrote:
>>> One of the main reasons for this is because the W3C hasn't made  
>>> it clear to developers and browser manufacturers that it's the  
>>> media-type ("application/xhtml+xml") that people need to get used  
>>> to, not just the XML syntax of XHTML, and it's the media-type  
>>> that makes the document XHTML.
>>
>> We've been discussing this at length on the "review of content  
>> type rules by IETF/HTTP community"  thread (see also the wiki page  
>> [1]). I think a more accurate way to think of it is that a file's  
>> type is determined by the internals of the file and the authoring  
>> tool.
>
> No, that is the completely wrong way to think of it.  Media types
> define how a given sequence of bytes are intended to be processed
> by the recipient.  I can author dozens of types in vim.  It is
> impossible to determine the media type of content by sniffing.
> It is sometimes possible to determine a range of possible media
> types and pick one based on configuration, but there are always
> exceptions that will cause such a pick to be wrong.

I'm not sure that what I said conflicts with what you're saying. My
point is that an author and the tool the author uses create a file of
a certain type (even before it reaches an HTTP server). No sniffing is
necessary at this stage because the author and authoring tool
combination already know the type of file they're creating. As you
said, "I can author dozens of types in vim". And you are the one in
charge of deciding what type you're authoring. You may be saving an
HTML file to disk with each edit while a misconfigured HTTP daemon
makes it available as a PNG file each time. Does that misconfigured
server say anything about the file type you're authoring in vim? No,
and it shouldn't.

However, the other issue is how we can efficiently store and transmit
the information that you and your authoring tool know first hand.
That's a separate issue. On Mac OS X, the type of the file you
authored is expressed as a UTI (Uniform Type Identifier) like
"public.png". However, Mac OS X supports alternate means of "tagging"
the file as a public.png. It can be done through a filename extension
like '.png', it can be done through a type code (aka OSType), or it
can be done through a media type tag such as "image/png". The UTI
represents the abstract name of a file's type, while the filename
extension, the media type and the OSType all represent different ways
of tagging a file as that type.
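
To make the tagging idea concrete, here is a minimal sketch in Python.
It is not Apple's API: the TypeTags class, the TAGS_BY_UTI table and
the uti_from_tag function are all hypothetical names, and the OSType
value is given as I recall it, so treat it as an assumption. The
sketch only shows that three different tags can resolve to the same
abstract type name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TypeTags:
    """The concrete tags that can mark a file as one abstract type."""
    extension: str   # filename extension, without the dot
    media_type: str  # Internet media type
    os_type: str     # four-character type code (OSType)


# Hypothetical lookup table for illustration only. The OSType value is
# an assumption, not something stated in this thread.
TAGS_BY_UTI = {
    "public.png": TypeTags(extension="png",
                           media_type="image/png",
                           os_type="PNGf"),
}


def uti_from_tag(kind: str, value: str) -> Optional[str]:
    """Return the abstract type name that a given tag points to.

    kind is one of "extension", "media_type" or "os_type".
    Returns None when no known type carries that tag.
    """
    for uti, tags in TAGS_BY_UTI.items():
        if getattr(tags, kind) == value:
            return uti
    return None
```

Calling uti_from_tag("extension", "png") and
uti_from_tag("media_type", "image/png") both yield the same abstract
name, which is the point: the tag is a way of carrying the type, not
the type itself.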

> If you are going to make rules for sniffing, you need to be honest
> about the nature of that beast -- no matter what you define, it will
> be wrong some percentage of the time.  It is the user's choice to
> determine when that is acceptable, not the choice of a standard.

Sniffing is certainly a problem. However, browser vendors are
finding sniffing to be more reliable than Content-Type headers. So
there are problems with sniffing, and there are problems in the
process of affixing and retaining the media type that the author and
authoring tool intended for a file.

Finally, I think the other issue that needs to be separated (and
again Mac OS X's UTIs develop this distinction nicely) is that a file
may be of one specific type, but an author or user may want it
handled as a different type that the file also conforms to. So a file
may be an XHTML1 file, but the author or the user wants that file
handled as a text file, a type it also conforms to. It doesn't cease
to be an XHTML conforming file simply because the author or user
wants it handled as a text file. Up to now the only way we've had to
convey an author's wishes to treat a file as a different type than
the type it was authored as is to change the media type. However, if
the media type is also the only place where we store metadata about
the file's actual type, then we have to obliterate that metadata (the
file's "actual type" metadata) to set the other metadata (the
"handle as" type metadata).
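
As a rough illustration of keeping the two pieces of metadata apart,
here is a small Python sketch. Everything in it is hypothetical: the
CONFORMS_TO chain is an assumed conformance hierarchy chosen for the
example, and FileMetadata, conforms_to and effective_type are names
made up for this message, not part of any existing system.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed conformance chain, for illustration only: each type maps to
# the more general type it conforms to.
CONFORMS_TO = {
    "public.xhtml": "public.xml",
    "public.xml": "public.text",
    "public.text": "public.data",
}


def conforms_to(actual: str, candidate: str) -> bool:
    """True if 'actual' is 'candidate' or conforms to it via the chain."""
    current: Optional[str] = actual
    while current is not None:
        if current == candidate:
            return True
        current = CONFORMS_TO.get(current)
    return False


@dataclass
class FileMetadata:
    actual_type: str                 # what the author created
    handle_as: Optional[str] = None  # how the author or user wants it treated

    def effective_type(self) -> str:
        """Type to process the file as, without losing the actual type."""
        if self.handle_as and conforms_to(self.actual_type, self.handle_as):
            return self.handle_as
        return self.actual_type
```

With FileMetadata("public.xhtml", handle_as="public.text"), the file
is processed as text while actual_type still records that it is XHTML.
A request to handle it as a type it does not conform to is ignored in
this sketch; a real design might report an error instead.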

Take care,
Rob

Received on Friday, 31 August 2007 18:00:10 UTC