Re: draft finding on Authoritative Metadata from Roy T. Fielding on 2006-03-08 (www-tag@w3.org from March 2006)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Tue, 7 Mar 2006 18:24:58 -0800
To: W3C TAG <www-tag@w3.org>
Message-Id: <066A9493-4DCF-416C-8592-764A774055B1@gbiv.com>
I have updated the finding on Authoritative Metadata to reflect all of
the comments received and finished the section on distributed authoring.

This version:
     http://www.w3.org/2001/tag/doc/mime-respect-20060307
Latest version:
     http://www.w3.org/2001/tag/doc/mime-respect

I also took the liberty to reformat it a bit to use the style of
constraints found in webarch and the "rule of least power" finding.
A content-only diff from the 5 Dec 2005 version is enclosed below.

Note that the TAG has not yet approved this version as a finding,
though I expect that will happen at the next teleconference.

This completes my final two action items as a TAG member.

Have fun,

Roy T. Fielding                            <http://roy.gbiv.com/>
Chief Scientist, Day Software              <http://www.day.com/>

===================================================================

diff -u -r1.52 mime-respect.xml
--- mime-respect.xml	5 Dec 2005 08:19:42 -0000	1.52
+++ mime-respect.xml	8 Mar 2006 01:49:31 -0000
@@ -68,19 +68,22 @@
media type of the representation, which influences the dispatching
of handlers and security-related decisions made by recipients of the
message.  In this finding, we review the architectural design choice
-that metadata provided in a received message be considered authoritative.
-We examine why recipient behavior that fails to respect authoritative
-metadata can be harmful and under what conditions (user consent) such
-behavior is allowed.  Finally, we consider how specification authors
-should incorporate these design constraints into their work.</p>
+that metadata provided in an encapsulating container, such as the metadata
+provided in the header fields of a received message, be considered
+authoritative.  We examine why recipient behavior that fails to respect
+authoritative metadata can be harmful and under what conditions such
+behavior is allowed.  Finally, we consider how specification authors and
+implementers should incorporate these design constraints into their work.</p>
</abstract>
<status>
-<p>This DRAFT document has been developed for discussion by the <loc
+<p>This document has been developed for discussion by the <loc
href="/2001/tag/">W3C Technical Architecture Group</loc> as a finding to
address the TAG issues
<loc href="http://www.w3.org/2001/tag/ilist#contentTypeOverride-24">contentTypeOverride-24</loc>,
-<loc href="http://www.w3.org/2001/tag/ilist#putMediaType-38">putMediaType-38</loc>, and portions of
+<loc href="http://www.w3.org/2001/tag/ilist#putMediaType-38">putMediaType-38</loc>,
+<loc href="http://www.w3.org/2001/tag/ilist.html#RFC3023Charset-21">RFC3023Charset-21</loc>,
+and portions of
<loc href="http://www.w3.org/2001/tag/ilist.html#errorHandling-20">errorHandling-20</loc>.
It is an update to the
<loc href="http://www.w3.org/2004/02/23-tag-summary.html#contentTypeOverride-24">previously approved</loc> finding of
@@ -126,19 +129,20 @@
<p>The following are the key architectural points of this finding:</p>
<olist>
-<item><p>Representation metadata received in an encapsulating container,
-such as within the header fields of a message, is authoritative in defining
-the nature of the representation received.</p></item>
+<item><p>Metadata received in an encapsulating container, such as the
+metadata within the header fields of a message that describe the data
+enclosed within that message, is authoritative in defining
+the nature of the data received.</p></item>
<item><p>Inconsistency between representation data and metadata is an
error that should be discovered and corrected rather than silently
ignored.</p></item>
-<item><p>It is an error for an agent to ignore or override authoritative
-metadata without the consent of the party the agent represents.</p></item>
+<item><p>An agent MUST NOT ignore or override authoritative
+metadata without the consent of the party employing the agent.</p></item>
<item><p>Specifications MUST NOT work against the Web architecture
-by requiring or suggesting that a recipient override authoritatve
+by requiring or suggesting that a recipient override authoritative
metadata without user consent.</p></item>
</olist>
</div1>
@@ -154,7 +158,7 @@
<p>Metadata is simply defined as data about other data.
Metadata can be expressed while referencing data externally, while
-encapsuling data in a container, and by embedding metadata within the
+encapsulating data in a container, and by embedding metadata within the
data being described.  The following table provides examples of how
various forms of metadata can be expressed during Web interactions:</p>
@@ -231,13 +235,18 @@
various forms.  The representation media type <bibref ref="rfc2046"/>,
in particular, plays such an important role in the Web architecture that its
value can be described in many different locations.  Given multiple sources
-of metadata and the possibility that those sources may be inconsistent, the
-architect must decide what source of metadata has the highest priority and
-thus shall be considered authoritive in determining the desired behavior of the
+of metadata and the possibility that those sources may be inconsistent, an
+architect must decide what source of metadata has the highest priority and thus
+shall be considered authoritative in determining the desired behavior of the
recipient.  Furthermore, given the presence of self-descriptive data formats, a
decision must be made on whether to respect the declared metadata over whatever
might be learned by inspecting the data itself.</p>
+<p role="constraint">Metadata received in an encapsulating container MUST
+be considered authoritative and used in preference to metadata found by
+inspection of the data, declared by embedded metadata, or provided by
+external reference.</p>
+
<p>For Web architecture, a design choice has been made that metadata
received in an encapsulating container MUST be considered authoritative
and used in preference to metadata found by inspection of the data,
@@ -279,9 +288,10 @@
<div2 id="media-type">
<head>Role of Internet Media Types</head>
-<p>An Internet media type <bibref ref="rfc2046"/> is a short name, such
-as "text/html", that is associated with a data format specification and
-processing model through registration in the
+<p>An Internet media type <bibref ref="rfc2046"/> is metadata in the form
+of a short name (e.g., "text/html") that associates the data with a
+specific format specification and preferred interpretation. The association
+is formally accomplished through registration of the media type in the
<loc href="http://www.iana.org/assignments/media-types/index.html">IANA
media type registry</loc>.
For example, "text/html" in the IANA registry is associated with
@@ -292,13 +302,24 @@
the latest published version is [HTML401].</sitem>
</slist>
-<p>The media type indicates the intended processing model for a
-representation, including such issues as whether the data should be
-rendered, stored, or executed. In practice, media types are thus usable
-for selecting handlers to implement those functions.  A media type, therefore,
-is not simply an indication of data format; it also refers to a standardized
-interpretation of that data format.  In fact, many different media types
-share a single data format, while others represent a superset of formats.</p>
+<p>A media type is not simply an indication of data format; it also
+refers to a preferred interpretation of that data format.  This preferred
+interpretation may impact the recipient's functional decisions, such as
+whether the data is rendered, stored, or executed.  In practice,
+media types are often used as the key for selecting an appropriate
+handler to interpret the data received.  It is possible for a single
+data format to be associated with multiple media types and for a single
+media type to describe a superset of many different data formats.</p>
+
+<p>As explained above for representation metadata in general, we refer to
+the media type as describing the sender's preferred, intended, and
+definitive <emph>interpretation</emph> of the data, rather than as defining a
+specific processing model for the recipient.  Each agent will interpret
+received data according to its own function and configuration, perhaps
+informed by the media type, and all that is required for Web interaction
+is that the intention be faithfully communicated.  It is assumed that the
+recipient software will follow those intentions, when appropriate, to the
+extent that it has been instructed to do so by the agent's user.</p>
</div2>
@@ -307,17 +328,19 @@
<p>If the authoritative media type of a representation were to be determined
by inspection of embedded metadata in a self-descriptive format, then a sender
-could not choose different interpretations for a single representation
+could not indicate different interpretations for a single representation
based on the declared media type.  For example, an owner might want to provide
links to separate resources that differ only in how a given HTML representation
-should be rendered. A message containing the header field
-<code>Content-Type: text/html</code> would indicate that standard HTML
-processing is desired, whereas the header field <code>Content-Type: text/plain</code>
-would indicate that the data should be viewed as plain text without
-HTML rendering.  Since the representation data in both messages are
-identical, this functionality is only possible if metadata of the
-containing message is considered more authoritative in describing the data
-than whatever could be learned from inspection of the data itself.</p>
+is intended to be rendered. A message containing the header field
+<code>Content-Type: text/html</code> would indicate that the sender intends
+the recipient to interpret the representation as hypertext, using the rendering
+process defined by the HTML standard, whereas the header field
+<code>Content-Type: text/plain</code> would indicate that the sender intends
+the recipient to treat the data as plain text without HTML rendering.
+Since the representation data is the same in both messages, this
+functionality is only possible if metadata of the containing message is
+considered more authoritative in describing the data than whatever could be
+learned from inspection of the data itself.</p>
<p>Placing authoritative metadata in message fields also enables more
efficient processing of messages.  It is far easier to dispatch behavior
@@ -385,17 +408,17 @@
<head>Overriding authoritative metadata</head>
<p>Recognition of authoritative metadata is important because it
-influences the default processing behavior for Web interactions.  However,
+influences the default behavior for Web interactions.  However,
representation metadata is also susceptible to misconfiguration, and
user agents frequently try to &quot;simplify&quot; the Web by automatically
&quot;correcting&quot; perceived &quot;errors&quot; in those configurations.
</p>
-<p>Choosing to ignore or override authoritative metadata is only allowed
-within the Web architecture when the user has given consent.
-Recipients SHOULD detect inconsistencies between representation
-data and metadata but MUST NOT resolve them without the
-<loc href="#consent">consent of the user</loc>.</p>
+<p>Recipients SHOULD detect inconsistencies between representation data
+and metadata but MUST NOT resolve them without the
+<loc href="#consent">consent of the user</loc>.
+Choosing to ignore or override authoritative metadata is only allowed
+within the Web architecture when the user has given consent.</p>
<div2 id="inconsistency">
<head>Inconsistency between representation data and metadata</head>
@@ -404,8 +427,12 @@
from data, there are risks as well. In particular, the resource owner
may create inconsistencies by misconfiguring resources or by failing to
reassign metadata after a change of representation.
-Inconsistency between representation data and metadata is an error.
-Examples of inconsistencies between metadata and representation data
+Inconsistency between representation data and metadata is an error.</p>
+
+<p role="practice">Recipients SHOULD detect inconsistencies between
+representation data and metadata.</p>
+
+<p>Examples of inconsistencies between metadata and representation data
that are frequently observed on the Web include:</p>
<ulist>
@@ -427,36 +454,63 @@
<p>Web software developers, webmasters, and resource owners can help
reduce inconsistency through careful assignment of representation metadata.
-In particular:</p>
+</p>
-<ulist>
-<item><p>Server software designers SHOULD NOT specify default representation
-metadata, such as media type, character encoding, or content language,
-within the standard configuration shipped with the server.</p></item>
-
-<item><p>Server software designers SHOULD provide a means to set representation
-metadata at the same level of granularity and permission that is needed
-to author those representations.</p></item>
-
-<item><p>Server managers SHOULD NOT specify an arbitrary Internet
-media type (e.g., "text/plain" or "application/octet-stream") when the
-representation media type is unknown.</p></item>
-
-<item><p>Server managers SHOULD provide each author with the means and
-permission to set the configuration of metadata for any representations
-under the author's control.</p></item>
-
-<item><p>Resource owners SHOULD test for correct metadata and
-inform server managers of metadata misconfigurations.</p></item>
-
-<item><p>Authoritative metadata SHOULD NOT be provided external to the
-representation if it does not add clarity to that communication.
-For example, the character encoding of XML data formats is self-descriptive
+<p role="practice">Server software designers (implementers) SHOULD provide
+a means to set representation metadata at the same level of granularity and
+permission that is needed to author those representations.</p>
+
+<p>Metadata configuration needs to be authored by the same people who have
+the ability to change the data being described. If all of the authoring is
+done by the webmaster, then it makes sense to have one central location for
+defining the metadata configuration.  In contrast, if the right to author
+representations has been delegated, such as through varying ownership within
+the server's hierarchical URI space, then the ability to author metadata
+configuration should be delegated as well.</p>
+
+<p role="practice">Server managers (webmasters) SHOULD provide each resource
+owner (author) with the means and permission to set the configuration of
+metadata for any representations under the author's control.</p>
+
+<p>For example, the Apache httpd has a configuration directive,
+<loc href="http://httpd.apache.org/docs/2.2/mod/core.html#allowoverride">AllowOverride FileInfo</loc>,
+which delegates the authority to define metadata to the owners of each
+directory.  It follows, therefore, that "AllowOverride FileInfo" should be set
+for any directory containing representations that are authored by people who
+do not have permission to change the central server configuration.</p>
+
+<p role="practice">Resource owners (authors) SHOULD test for correct metadata
+and inform server managers of metadata misconfigurations.</p>
+
+<p>This requires that authors be able to detect errors, which will be
+discussed below.</p>
+
+<p role="practice">Server software designers (implementers) SHOULD NOT specify
+default representation metadata, such as media type, character encoding, or
+content language, within the standard configuration shipped with the server.
+</p>
+
+<p>Instead of specifying a default for metadata, it is better for
+representations to be sent without that metadata.  That allows the recipient
+to guess the metadata instead of being forced to either accept incorrect
+metadata or be tempted to violate Web architecture by ignoring it.</p>
+
+<p role="practice">Server managers (webmasters) SHOULD NOT specify an arbitrary
+Internet media type (e.g., "text/plain" or "application/octet-stream") when the
+media type is unknown.</p>
+
+<p>It is better to send no media type if the resource owner has failed to
+define one for a given representation.</p>
+
+<p role="practice">Authoritative metadata SHOULD NOT be provided external to
+the representation if it does not add clarity to that communication.</p>
+
+<p>For example, the character encoding of XML data formats is self-descriptive
within the data and SHOULD NOT be included in a charset parameter of the
media type unless that distinction is significant to the resource (e.g., for
comparison during content negotiation of multiple XML representations
-that differ only by character encoding).</p></item>
-</ulist>
+that differ only by character encoding).</p>
+
</div2>
<div2 id="silent-recovery">
@@ -468,13 +522,16 @@
from error perpetuates what could be easily fixed if the resource owner
is simply informed of that error during their own testing of the resource.</p>
-<p>Web agents SHOULD have a configuration option that enables
-the display or logging of detected errors. Such a display need not be
-disruptive of the user experience; for example, a graphical browser
-might display a small "bug" button in the user interface to indicate a
-detected error so that an interested user (i.e., the resource owner)
-can select the button, inspect the error, and perhaps modify the
-agent's choice on how to recover from that error.</p>
+<p role="practice">Web agents SHOULD have a configuration option that enables
+the display or logging of detected errors.</p>
+
+<p>Revealing errors when they occur need not be disruptive of the user
+experience. For example, a graphical browser might display a small "bug"
+button in the user interface to indicate a detected error so that an
+interested user (i.e., the resource owner) can select the button, inspect
+the error, and perhaps modify the agent's choice on how to recover from
+that error.  Naturally, the appropriate mechanism will be unique to each type
+of receiving agent and application context.</p>
<p>Some applications of the Web cannot tolerate error.  For example,
medical information systems must be designed so as to detect errors that
@@ -500,9 +557,8 @@
agent violates those expectations, it violates the protections that may
have been put in place for the user's self-protection.</p>
-<p>Because of those risks, it is an error for an agent to ignore or
-override authoritative metadata without the consent of the party
-employing the agent.</p>
+<p role="constraint">An agent MUST NOT ignore or override authoritative
+metadata without the consent of the party employing the agent.</p>
<p>Consent does not imply that the receiving agent must interrupt
the user and require selection of one option or another.
@@ -515,6 +571,16 @@
errors and ways in which interface designers might obtain user
feedback to address them.</p>
+<p>Likewise, consent may be implied by the nature or type of interaction
+being performed by the agent.  For example, a script that "mirrors" content
+from the Web into files on an FTP server is probably going to ignore
+metadata.  Similarly, XInclude <bibref ref="XInclude"/> processing has the
+implied consent of the user to transform data from one source to another
+and thus should only result in errors when the transformation is unsuccessful.
+Note, however, that this functionality imposes a social burden on XInclude
+processors to be sure that the resulting composed document does not violate
+the user's security constraints.</p>
+
</div2>
</div1>
@@ -538,38 +604,19 @@
that gives clients a hint about the likely media type if one were to
retrieve a representation of the identified resource.</p>
-<example>
-<head>Format specifications cannot redefine authoritative metadata</head>
-
-<p>The MyFormat specification specifies a <code>type</code> attribute
-with external references that supposedly takes precedence over any other
-media type received as authoritative metadata. When <code>type</code> is
-present, receiving agents are instructed to use its value and ignore
-any conflicting metadata provided by the sender.</p>
-
-<p>The MyFormat specification designers rationale for this departure from
-Web architecture is that such a definition of the <code>type</code> attribute
-allows content authors to work around misconfigured servers. They contend
-that this is necessary because, in many environments, content authors may
-not have sufficient access to the server configuration to assign the
-correct media type where it belongs.</p>
-
-<p>Should the MyFormat specification designers be allowed to ignore a
-principle of Web architecture and define <code>type</code> in this way
-just to remedy a potential configuration problem?</p>
-
-<p>Answer: Errors involving inconsistent metadata cannot be "fixed"
-by adding metadata to external references --- the metadata is inconsistent
-for all recipients of the message, not just the user agent.
-An agent that silently overrides server-provided metadata can create
-security risks and prevent errors from being detected and corrected.</p>
-</example>
+<p role="constraint">Specifications MUST NOT work against the Web architecture
+by requiring or suggesting that a recipient override authoritative metadata
+without user consent.</p>
<p>A format specification that includes metadata hints for clients
must make clear that, when these hints interact with server metadata,
-they are advisory only. Format specifications MUST NOT include
-requirements for clients to override server metadata without user
-consent.</p>
+they are advisory only. These hints provide metadata by external reference
+and thus will not be known to all of the other (intermediary) recipients
+of the representation. Errors involving inconsistent metadata cannot be
+"fixed" by adding metadata to external references, since the metadata
+is inconsistent for all recipients of the message (not just the user agent).
+An agent that silently overrides server-provided metadata can create
+security risks and prevent errors from being detected and corrected.</p>
<p>An architecturally sound description of an advisory attribute might
read:</p>
@@ -598,11 +645,12 @@
attribute in
<loc href="http://www.w3.org/TR/2005/REC-SMIL2-20050107/extended-media-object.html#adef-media-type">section 7.3.1</loc>
specifies that the value of <code>type</code> takes precedence over
-authoritative metadata for some protocols.  The specification is in error.
+authoritative metadata for some protocols.  That specification is in error.
Under no circumstances can a format specification change the meaning of
protocol interaction on the Web. Implementers MUST disregard that statement
in SMIL 2.0 and treat the type attribute as merely a means for
-content selection or for when authoritative metadata is unavailable.</p>
+content selection or for when authoritative metadata is unavailable.
+The error has been corrected in SMIL 2.1 <bibref ref="SMIL21"/>.</p>
</div1>
@@ -627,23 +675,24 @@
according to the HTML and CSS specifications.
</p>
-<p>Which party has neglected a principle of Web architecture: Stuart
+<p>Which party has neglected a constraint of Web architecture: Stuart
for the server misconfiguration, Tim's browser for silently overriding
the HTTP headers from the server, or Janet's browser for not detecting
that the content looked like HTML?</p>
-<p>Answer: By silently overriding metadata from the representation
-provider in the HTTP headers, Tim's browser did not respect Web
-architecture principles that promote shared understanding and
-security.</p>
-
-<p>Misconfiguration of the server is a fixable error.  If Stuart was
-using Janet's browser, he would see that error immediately and fix it.
-However, if Stuart uses the same browser as Tim for his testing, Stuart
-would not be informed of the error.  Tim's browser is the culprit here because
-it misrepresents the resource owner by ignoring the authoritative metadata
-without Tim's consent. Janet's browser respected the "Content-Type" header
-field and, in doing so, helps Janet detect a server misconfiguration.</p>
+<p>Answer: By silently overriding the authoritative metadata from the
+HTTP headers, Tim's browser did not respect Web architecture constraints
+that promote shared understanding and security.</p>
+
+<p>Misconfiguration of the server is a fixable error.  If Stuart had been
+using Janet's browser to test, he would have seen the error immediately and
+fixed it long before either Tim or Janet made their requests.  However, if
+Stuart used the same browser as Tim for his testing, Stuart would not have
+been informed of the error.  The software developers of Tim's browser are
+the culprit here because the product misrepresents the resource owner by
+ignoring the authoritative metadata without Tim's consent. Janet's browser
+respected the "Content-Type" header field and, in doing so, helps Janet
+detect a server misconfiguration.</p>
</div2>
@@ -660,14 +709,13 @@
executes it, promptly sending a rude message to everyone on
Tim's address list (including Tim's mom).</p>
-<p>Which party has neglected a principle of Web architecture: Stuart
+<p>Which party has neglected a constraint of Web architecture: Stuart
for serving content about a vulnerability or Tim's browser for silently
overriding the HTTP headers from the server?</p>
-<p>Answer: By silently overriding metadata from the representation
-provider in the HTTP headers, Tim's browser did not respect Web
-architecture principles that promote shared understanding and
-security.</p>
+<p>Answer: By silently overriding the authoritative metadata from the
+HTTP headers, Tim's browser did not respect Web architecture constraints
+that promote shared understanding and security.</p>
<p>Authoritative metadata is an important aspect of Web architecture.
Agents that ignore authoritative metadata are broken, sometimes
@@ -677,7 +725,7 @@
</div2>
<div2 id="hint-scenario">
-<head>Misconfiguration and metadata hints</head>
+<head>Inconsistent metadata hints</head>
<p>Norm publishes an XHTML document that includes this link:</p>
@@ -687,18 +735,17 @@
<p>Although the link refers to an XSLT style sheet, Norm has set the
<code>type</code> attribute to "text/css". Stuart has configured the
-Web server so that the style sheet is served via HTTP/1.1 as
-"application/xslt+xml". With a user agent that understands XSLT but
-not CSS, Janet requests the content that includes this link. As it
-interprets the representation data, Janet's user agent reads the
-<code>type</code> hint and does not fetch the style sheet."</p>
+Web server so that representations of the resource "cool-style" are
+served via HTTP/1.1 as "application/xslt+xml". With a user agent that
+understands XSLT but not CSS, Janet requests the content that includes
+this link. As it interprets the representation data, Janet's user agent
+reads the <code>type</code> hint and does not fetch the style sheet.</p>
<p>Which party is responsible for the fact that Janet did not receive
content she should have: Stuart for the server configuration, Norm for
-stating that the style sheet is served as "text/css"
-when in fact it's served with a different media type, or Janet's
-user agent for not double-checking the media type with the
-server?</p>
+stating that the style sheet is served as "text/css" when in fact it's
+served with a different media type, or Janet's user agent for not
+double-checking the media type with the server?</p>
<p>Answer: Norm's mislabeling of content deprived
Janet of content she should have received.</p>
@@ -709,23 +756,20 @@
responsibility to manage the risk that it may become inconsistent with
the content available at the link target address.</quote> Janet's
client could have done more than merely read the <code>type</code>
-hint and decide to skip the style sheet. Users benefit from clients
-that allow different configurations for handling hints, including:</p>
+hint and decide to skip the style sheet, but the specific purpose of
+that hint is to reduce unnecessary requests and the associated latency.</p>
-<ulist>
-<item><p>Query the server, and when there is an inconsistency,
-choose the authoritative metadata, or</p></item>
-<item><p>Query the server, and when there is an inconsistency, prompt
-the user for instructions on how to proceed.</p></item>
-</ulist>
+<p>Users often benefit from agents that perform metadata consistency
+checks as part of special "authoring" or "testing" modes.  Such checks
+might query the server and check for inconsistency, thus allowing the
+metadata to be tested by authors without incurring overhead during
+operation by normal users.</p>
</div2>
<div2 id="dav-scenario">
<head>Conflicting metadata during distributed authoring</head>
-<p>[unfinished]</p>
-
<p>The meaning of any HTTP message is defined by the contents of that
message as interpreted according to the HTTP standard.  If a client requests
that a server store a representation at a given URI and the server's
@@ -733,7 +777,7 @@
from what has been provided by the client, then the server should reject
the request using an appropriate HTTP status code.</p>
-<p>In other words, if a webdav client performs a</p>
+<p>For example, if a WebDAV client performs a</p>
<eg><![CDATA[
     PUT /something.html HTTP/1.1
@@ -744,45 +788,63 @@
<p>and example.org knows that it has been configured such that all
resources with identifiers ending in ".html" are represented
-in the "text/html" format, then the server has four choices:</p>
+in the "text/html" format (i.e., the server has been configured not
+to simply accept whatever the client wants for any given identifier),
+then the server could choose one of four potential choices for handling
+the request:</p>
<olist>
-<item><p>ignore the "application/pdf" metadata provided by the client, store the
-representation as-is, and serve it later as "text/html".</p></item>
+<item><p>ignore the "application/pdf" metadata provided by the client, store
+the representation as-is, and serve it later as "text/html".</p></item>
<item><p>change the configuration such that future 200 responses to
<code>GET /something.html</code> will be served as "application/pdf",
thus preserving the client's stated intent.</p></item>
<item><p>accept the request only in the sense of it being a requested change of
-resource state, resulting in the PDF representation being "converted" to HTML
-for later responses.</p></item>
+resource state, meaning that the PDF representation is automatically converted
+to HTML for use by later responses.</p></item>
<item><p>respond with "415 Unsupported Media Type" and a message stating why the
request is inconsistent with the resource.</p></item>
</olist>
-<p>(1) is clearly a bad idea because the inconsistency is an error
-and failing to report an error is bad design.</p>
-
-<p>(2) may be feasible on some HTTP servers that combine configuration
-for both authoring and read-only services, but most production HTTP
-servers do not work that way, and automatically overriding a server
-configuration is more likely to hide pilot-error rather than do what
-the user actually wants.</p>
-
-<p>(3) is a complicated option that preserves REST semantics but not
-those of a dumb filesystem.  It is one of those server-side magic
-tricks that tends to annoy people who think HTTP is a file protocol,
-which suits me just fine provided that it isn't mandatory.</p>
-
-<p>(4) properly informs the user of the inconsistency (enabling them
-to choose the right workaround), works in all cases, but wastes
-some bandwidth.</p>
-
-<p>Answer: (1) is a bug, (2) is bad implementation,
-(3) is a nifty feature when the user is making an informed
-request, and (4) is the right answer in all other cases.</p>
+<p>Ignoring the "application/pdf" metadata provided by the client (1) is
+clearly a bad idea because the inconsistency is an error and failing to
+report an error is bad design.</p>
+
+<p>Automatically changing the conflicting configuration (2) is appropriate
+if and only if the author has the ability to selectively override the server's
+configuration on a per-representation basis, has configured their Web space
+to do so, and the result of accepting the PUT does change that configuration.
+The primary use-case for this style of override is to continue supporting
+well-known "cool" URIs even though the identifier appears to contain metadata
+that is inconsistent with the current media type.  A better solution, though,
+is to simply redirect the old identifier to a new URI that does not contain
+an apparent file extension.  Unfortunately, the main problem with accepting
+the override is that the inconsistency may have been due to pilot error
+rather than user intention.  A good rule of thumb is to provide this behavior
+as a configuration option that is not the default.</p>
+
+<p>Performing on-the-fly type conversion, as in (3), is a complicated option
+that preserves Web semantics but can lead to unexpected results for authors
+that consider the Web interface to be just another dumb filesystem.  This
+should only be done when the resource owner has specifically configured the
+resource (or space of resources) to process state changes in this manner.
+A better solution is to redirect the user to a codependent resource that
+provides "application/pdf" views of the shared state; the user can then
+choose whether or not to apply the state change to that resource, which
+will have a metadata configuration consistent with the representation
+being PUT, and thus preserve both Web and filesystem semantics.</p>
+
+<p>Responding with a "415 Unsupported Media Type" error (4) is, in most
+cases, the right answer unless the server has been specifically configured
+for options (2) or (3).  Although it costs time and bandwidth, responding
+with an informative error message allows the user to inspect both the
+request being made and the server's current configuration, change
+whichever one is incorrect, and thereby establish the correct metadata
+for the resource's representations <emph>before</emph> allowing the PUT
+to succeed.</p>
</div2>
</div1>
@@ -792,7 +854,12 @@
<p>The TAG is working with the authors of <bibref ref="rfc3023"/>
to revise section 7.1 of that RFC, which suggests behavior regarding
-character encoding metadata that is inconsistent with this finding.</p>
+character encoding metadata that is inconsistent with this finding.
+More information on that issue
+(<loc href="http://www.w3.org/2001/tag/ilist.html#RFC3023Charset-21">RFC3023Charset-21</loc>)
+can be found in the TAG finding on
+<loc href="http://www.w3.org/2001/tag/2002/0129-mime">Internet Media Type
+registration, consistency of use</loc> <bibref ref="TAG-21"/>.</p>
</div1>
@@ -805,7 +872,8 @@
Assigned Numbers Authority (IANA)</bibl>
<bibl id="rfc1866" href="http://www.ietf.org/rfc/rfc1866"
-key="RFC1866">T. Berners-Lee, D. Connolly. <titleref>Hypertext Markup Language - 2.0</titleref>, RFC1866, November 1995.</bibl>
+key="RFC1866">T. Berners-Lee, D. Connolly. <titleref>Hypertext Markup
+Language - 2.0</titleref>, RFC1866, November 1995.</bibl>
<bibl id="rfc2046" href="http://www.ietf.org/rfc/rfc2046.txt"
key="RFC2046">N. Freed, N. Borenstein. <titleref>Multipurpose Internet
@@ -827,17 +895,29 @@
<bibl id="rfc3023" href="http://www.ietf.org/rfc/rfc3023.txt"
key="RFC3023">M. Murata, S. St. Laurent, D. Kohn. <titleref>XML Media
-      Types</titleref>, RFC3023, January 2001.</bibl>
+Types</titleref>, RFC3023, January 2001.</bibl>
<bibl id="SMIL20" href="http://www.w3.org/TR/2005/REC-SMIL2-20050107/"
key="SMIL20">J. Ayars et al. <titleref>Synchronized Multimedia Integration
Language (SMIL 2.0), Second Edition</titleref>, W3C Recommendation, 7 January
2005.</bibl>
+<bibl id="SMIL21" href="http://www.w3.org/TR/2005/REC-SMIL2-20051213/"
+key="SMIL21">D. Bulterman et al., eds. <titleref>Synchronized Multimedia Integration
+Language (SMIL 2.1)</titleref>, W3C Recommendation, 13 December 2005.</bibl>
+
<bibl id="SRGS10" href="http://www.w3.org/TR/2004/REC-speech-grammar-20040316/"
-key="SRGS10">A. Hunt, S. McGlashan eds. <titleref>Speech Recognition Grammar
+key="SRGS10">A. Hunt, S. McGlashan, eds. <titleref>Speech Recognition Grammar
Specification Version 1.0</titleref>, W3C Recommendation, 16 March 2004.</bibl>
+<bibl id="TAG-21" href="http://www.w3.org/2001/tag/2002/0129-mime"
+key="TAG-21">T. Bray, ed. <titleref>Internet Media Type registration,
+consistency of use</titleref>, W3C TAG Finding, 4 September 2002.</bibl>
+
+<bibl id="XInclude" href="http://www.w3.org/TR/xinclude/"
+key="XInclude">J. Marsh, D. Orchard, eds. <titleref>XML Inclusions (XInclude)
+Version 1.0</titleref>, W3C Recommendation, 20 December 2004.</bibl>
+
</blist>
</div1>
@@ -849,7 +929,8 @@
included substantial input from Roy T. Fielding, Stuart Williams, and
Dan Connolly.  Martin Dürst, Philipp Hoschka, Rob Lanphier, and Norman Walsh
provided reviews of prior drafts that improved this finding. This second
-edition has additionally benefited from the comments of Noah Mendelsohn.</p>
+edition has additionally benefited from the comments of Noah Mendelsohn,
+Mark Baker, and Julian Reschke.</p>
</div1>
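
===================================================================

For illustration only (not part of the finding or the diff): a minimal
Python sketch of how a server might choose among options (2), (3), and
(4) when the Content-Type of a PUT conflicts with the configured media
type of the resource.  The names MismatchPolicy, Resource, base_type, and
handle_put are hypothetical, the 204 success status is an assumption, and
the actual conversion step for option (3) is left out.  Option (1) is
deliberately not offered, and (4) is the default, as the revised text
recommends.

```python
from dataclasses import dataclass
from enum import Enum


class MismatchPolicy(Enum):
    """Hypothetical per-resource setting; the names are not from the finding."""
    REJECT = 4       # option (4): report the inconsistency (default)
    RECONFIGURE = 2  # option (2): let the PUT override the configured type
    CONVERT = 3      # option (3): convert the body to the configured type


@dataclass
class Resource:
    configured_type: str
    policy: MismatchPolicy = MismatchPolicy.REJECT


def base_type(content_type: str) -> str:
    """Strip parameters such as charset and normalise case (a simplification)."""
    return content_type.split(";", 1)[0].strip().lower()


def handle_put(resource: Resource, content_type: str) -> tuple[int, str]:
    """Return (status code, media type used for later GET responses)."""
    sent = base_type(content_type)
    if sent == base_type(resource.configured_type):
        return 204, resource.configured_type
    if resource.policy is MismatchPolicy.RECONFIGURE:
        # Only reached when the owner has opted in to per-representation override.
        resource.configured_type = sent
        return 204, sent
    if resource.policy is MismatchPolicy.CONVERT:
        # A real server would run a converter here (not shown in this sketch).
        return 204, resource.configured_type
    # Default: 415 Unsupported Media Type, so the user can fix whichever side is wrong.
    return 415, resource.configured_type
```

With the default policy, handle_put(Resource("text/html"),
"application/pdf") yields a 415 and leaves the configuration untouched;
the other two behaviors occur only when the resource owner has selected
them explicitly.
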
Received on Wednesday, 8 March 2006 02:25:12 UTC