ACTION-308 (part 2) Updates to 'The Self-Describing Web' from John Kemp on 2010-01-04 (www-tag@w3.org from January 2010)

From: John Kemp <john@jkemp.net>
Date: Mon, 4 Jan 2010 15:46:33 -0500
To: "www-tag@w3.org WG" <www-tag@w3.org>
Message-Id: <CBD3C035-E06C-43C5-8D28-C6EDB3D40DC4@jkemp.net>

Hello,

As the second part of ACTION-308, I propose the following updates to the 'The Self-Describing Web' finding [SelfDescWeb] to acknowledge the reality of content-type sniffing. I will now mark ACTION-308 as 'pending review'.

Regards,

- johnk

[SelfDescWeb] - http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html
[ACTION-308] - http://www.w3.org/2001/tag/group/track/actions/308
[F2FMinutesSep2009] - http://www.w3.org/2001/tag/2009/09/24-minutes#item03

(begin proposed changes)

Section 1: Introduction

After bullet point:

Each representation should include standard machine-readable indications, such as HTTP Content-type headers, XML encoding declarations, etc., of the standards and conventions used to encode it.

Add:

... and every effort should be made to ensure that the intentions of the content author and publisher regarding interpretation of the content are accurately conveyed in such indications.

Section 2: The Web's Standard Retrieval Algorithm

After paragraph:

Consider instead a different example, in which Bob clicks on a link to ftp://example.com/todaysnews. Although Bob's browser can easily open an FTP connection to retrieve a file, there is no way for the browser to reliably determine the nature of the information received. Even if the URI were ftp://example.com/todaysnews.html the browser would be guessing if it assumed that the file's contents were HTML, since no normative specification ensures that data from ftp URIs ending in .html is in any particular format.

Add:

For the reasons noted above, and for others such as content aggregation, a browser may be unable to determine the intended interpretation of Web content reliably from a Content-Type HTTP header or other external metadata alone. In such cases, a browser may inspect the content directly (commonly known as "sniffing"). The consequences of doing so are described in [AuthoritativeMetadata]. In particular, Web content should be sniffed only with an accepted and secure algorithm, such as [BarthSniff].

References:

Add:

[BarthSniff] http://tools.ietf.org/html/draft-abarth-mime-sniff-03
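
To make the sniffing fallback proposed for Section 2 more concrete, here is a minimal Python sketch. It is an illustration only and is not the [BarthSniff] algorithm: the function name effective_type, the small signature table, and the rule for when a declared type counts as unusable are all assumptions made for this example. The point it tries to show is the ordering: the declared Content-Type is respected whenever it carries usable information, and the content is inspected only when it does not.

```python
from typing import Optional

# Hypothetical signature table: (leading bytes, media type).
# A real algorithm defines a much larger, carefully ordered table.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]

# Declared types treated here as carrying no usable information
# (an assumption made for this sketch).
UNKNOWN_TYPES = {"", "unknown/unknown", "application/unknown", "*/*"}


def effective_type(declared: Optional[str], body: bytes) -> str:
    """Return the media type to use for a response body."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    essence = (declared or "").split(";", 1)[0].strip().lower()

    # Respect the author's declared type whenever one is given.
    if essence not in UNKNOWN_TYPES:
        return essence

    # Otherwise fall back to inspecting the first bytes of the content.
    for magic, media_type in SIGNATURES:
        if body.startswith(magic):
            return media_type

    # Nothing matched: make no claim about the format.
    return "application/octet-stream"
```

With this sketch, effective_type("text/html; charset=utf-8", b"%PDF-1.4") keeps the declared text/html, while effective_type(None, b"%PDF-1.4") falls back to inspection and yields application/pdf.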

Received on Monday, 4 January 2010 20:47:03 UTC