Re: ACTION-308 (part 2) Updates to 'The Self-Describing Web'

From: Eric J. Bowman <eric@bisonsystems.net>
Date: Tue, 5 Jan 2010 19:33:27 -0700
To: John Kemp <john@jkemp.net>
Cc: "www-tag@w3.org WG" <www-tag@w3.org>
Message-Id: <20100105193327.7a3c4631.eric@bisonsystems.net>
Is ACTION-308 saying that the common implementation of XSLT on the Web
is wrong, and that all the major browsers are broken?  Most XSLT-based
Web systems out there follow the longstanding model described here:

http://www.w3schools.com/xsl/
http://www.w3schools.com/xsl/cdcatalog_with_xsl.xml

Take an .xml document, embed an XML PI referencing the XSLT
transformation, serve it as text/xml, and voilà!  Works in all major
browsers, none of which consult the user about the privilege-escalation
consequences of treating the XSLT output as text/html, right down to
allowing JavaScript to execute, despite the authoritative type being
text/xml.
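To make the model concrete, here is a minimal sketch of the pattern in
question (the filenames and catalog content are illustrative, loosely
modeled on the w3schools example above, not taken from any real site):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- cdcatalog.xml: served as text/xml; the PI below tells the browser
     to fetch cdcatalog.xsl and render the transform result -->
<?xml-stylesheet type="text/xsl" href="cdcatalog.xsl"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
  </cd>
</catalog>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- cdcatalog.xsl: the transform emits HTML, including a script
     element, even though the source's authoritative type is text/xml -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <script>alert('script runs, despite text/xml');</script>
        <xsl:for-each select="catalog/cd">
          <p><xsl:value-of select="title"/></p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Browsers render the transform output as an HTML document, scripts and
all, with no prompt to the user.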

If this is so, does serving the .xml document as application/xml suffer
the same fate?  Consider:

http://www.w3.org/TR/MathML2/overview.xml

Not that they're using JavaScript, but there's no reason it wouldn't
work if they did.  Is the TAG saying that HTML content must be served
with an HTML media type?

-Eric

John Kemp wrote:
>
> Hello,
> 
> As the second part of ACTION-308, I propose the following updates to
> 'The Self-Describing Web' finding [SelfDescWeb], to acknowledge the
> reality of content-type sniffing. I shall now mark ACTION-308 to be
> 'pending review'.
> 
> Regards,
> 
> - johnk
> 
> [SelfDescWeb] -
> http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html
> [ACTION-308] - http://www.w3.org/2001/tag/group/track/actions/308
> [F2FMinutesSep2009] -
> http://www.w3.org/2001/tag/2009/09/24-minutes#item03
> 
> (begin proposed changes)
> 
> 1.
> 
> Section 1: Introduction
> 
> After bullet point:
> 
> Each representation should include standard machine-readable
> indications, such as HTTP Content-type headers, XML encoding
> declarations, etc., of the standards and conventions used to encode
> it. 
> 
> Add:
> 
> ... and every effort should be made to ensure that the intentions of
> the content author and publisher regarding interpretation of the
> content are accurately conveyed in such indications.
> 
> 2.
> 
> Section 2: The Web's Standard Retrieval Algorithm
> 
> After paragraph:
> 
> Consider instead a different example, in which Bob clicks on a link
> to ftp://example.com/todaysnews. Although Bob's browser can easily
> open an FTP connection to retrieve a file, there is no way for the
> browser to reliably determine the nature of the information received.
> Even if the URI were ftp://example.com/todaysnews.html the browser
> would be guessing if it assumed that the file's contents were HTML,
> since no normative specification ensures that data from ftp URIs
> ending in .html is in any particular format. 
> 
> Add:
> 
> As noted above, and for other reasons (such as content aggregation),
> it may not be possible for a browser to reliably determine, via
> inspection of a Content-Type HTTP header or other external metadata
> alone, the intended interpretation of Web content. In such cases, a
> browser may inspect the content directly (commonly known as
> "sniffing"). The consequences of such an action are described in
> [AuthoritativeMetadata]. In particular, sniffing Web content should
> only be done using an accepted and secure algorithm, such as
> [BarthSniff].
> 
> 3.
> 
> References:
> 
> Add:
> 
> [BarthSniff] http://tools.ietf.org/html/draft-abarth-mime-sniff-03
Received on Wednesday, 6 January 2010 02:34:26 GMT