Re: draft finding on Authoritative Metadata

I've just reread the whole thing and it (still) looks fine to me.

> This completes my final two action items as a TAG member.

...and rereading this reminds me how much you'll be missed.  Truly nice 
work.

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

"Roy T. Fielding" <fielding@gbiv.com>
Sent by: www-tag-request@w3.org
03/07/2006 09:24 PM
 
        To:     W3C TAG <www-tag@w3.org>
        cc:     (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        Re: draft finding on Authoritative Metadata



I have updated the finding on Authoritative Metadata to reflect all of
the comments received and finished the section on distributed authoring.

This version:
     http://www.w3.org/2001/tag/doc/mime-respect-20060307
Latest version:
     http://www.w3.org/2001/tag/doc/mime-respect

I also took the liberty to reformat it a bit to use the style of
constraints found in webarch and the "rule of least power" finding.
A content-only diff from the 5 Dec 2005 version is enclosed below.

Note that the TAG has not yet approved this version as a finding,
though I expect that will happen at the next teleconference.

This completes my final two action items as a TAG member.

Have fun,

Roy T. Fielding                            <http://roy.gbiv.com/>
Chief Scientist, Day Software              <http://www.day.com/>

===================================================================

diff -u -r1.52 mime-respect.xml
--- mime-respect.xml             5 Dec 2005 08:19:42 -0000 1.52
+++ mime-respect.xml             8 Mar 2006 01:49:31 -0000
@@ -68,19 +68,22 @@
media type of the representation, which influences the dispatching
of handlers and security-related decisions made by recipients of the
message.  In this finding, we review the architectural design choice
-that metadata provided in a received message be considered authoritative.
-We examine why recipient behavior that fails to respect authoritative
-metadata can be harmful and under what conditions (user consent) such
-behavior is allowed.  Finally, we consider how specification authors
-should incorporate these design constraints into their work.</p>
+that metadata provided in an encapsulating container, such as the metadata
+provided in the header fields of a received message, be considered
+authoritative.  We examine why recipient behavior that fails to respect
+authoritative metadata can be harmful and under what conditions such
+behavior is allowed.  Finally, we consider how specification authors and
+implementers should incorporate these design constraints into their work.</p>
</abstract>
<status>
-<p>This DRAFT document has been developed for discussion by the <loc
+<p>This document has been developed for discussion by the <loc
href="/2001/tag/">W3C Technical Architecture Group</loc> as a finding to
address the TAG issues
<loc href="http://www.w3.org/2001/tag/ilist#contentTypeOverride-24">contentTypeOverride-24</loc>,
-<loc href="http://www.w3.org/2001/tag/ilist#putMediaType-38">putMediaType-38</loc>, and portions of
+<loc href="http://www.w3.org/2001/tag/ilist#putMediaType-38">putMediaType-38</loc>,
+<loc href="http://www.w3.org/2001/tag/ilist.html#RFC3023Charset-21">RFC3023Charset-21</loc>,
+and portions of
<loc href="http://www.w3.org/2001/tag/ilist.html#errorHandling-20">errorHandling-20</loc>.
It is an update to the
<loc href="http://www.w3.org/2004/02/23-tag-summary.html#contentTypeOverride-24">previously approved</loc> finding of
@@ -126,19 +129,20 @@
<p>The following are the key architectural points of this finding:</p>
<olist>
-<item><p>Representation metadata received in an encapsulating container,
-such as within the header fields of a message, is authoritative in defining
-the nature of the representation received.</p></item>
+<item><p>Metadata received in an encapsulating container, such as the
+metadata within the header fields of a message that describe the data
+enclosed within that message, is authoritative in defining
+the nature of the data received.</p></item>
<item><p>Inconsistency between representation data and metadata is an
error that should be discovered and corrected rather than silently
ignored.</p></item>
-<item><p>It is an error for an agent to ignore or override authoritative
-metadata without the consent of the party the agent represents.</p></item>
+<item><p>An agent MUST NOT ignore or override authoritative
+metadata without the consent of the party employing the agent.</p></item>
<item><p>Specifications MUST NOT work against the Web architecture
-by requiring or suggesting that a recipient override authoritatve
+by requiring or suggesting that a recipient override authoritative
metadata without user consent.</p></item>
</olist>
</div1>
@@ -154,7 +158,7 @@
<p>Metadata is simply defined as data about other data.
Metadata can be expressed while referencing data externally, while
-encapsuling data in a container, and by embedding metadata within the
+encapsulating data in a container, and by embedding metadata within the
data being described.  The following table provides examples of how
various forms of metadata can be expressed during Web interactions:</p>
@@ -231,13 +235,18 @@
various forms.  The representation media type <bibref ref="rfc2046"/>,
in particular, plays such an important role in the Web architecture that its
value can be described in many different locations.  Given multiple sources
-of metadata and the possibility that those sources may be inconsistent, the
-architect must decide what source of metadata has the highest priority and
-thus shall be considered authoritive in determining the desired behavior of the
+of metadata and the possibility that those sources may be inconsistent, an
+architect must decide what source of metadata has the highest priority and thus
+shall be considered authoritative in determining the desired behavior of the
recipient.  Furthermore, given the presence of self-descriptive data formats, a
decision must be made on whether to respect the declared metadata over whatever
might be learned by inspecting the data itself.</p>
+<p role="constraint">Metadata received in an encapsulating container MUST
+be considered authoritative and used in preference to metadata found by
+inspection of the data, declared by embedded metadata, or provided by
+external reference.</p>
+
<p>For Web architecture, a design choice has been made that metadata
received in an encapsulating container MUST be considered authoritative
and used in preference to metadata found by inspection of the data,
@@ -279,9 +288,10 @@
<div2 id="media-type">
<head>Role of Internet Media Types</head>
-<p>An Internet media type <bibref ref="rfc2046"/> is a short name, such
-as "text/html", that is associated with a data format specification and
-processing model through registration in the
+<p>An Internet media type <bibref ref="rfc2046"/> is metadata in the form
+of a short name (e.g., "text/html") that associates the data with a
+specific format specification and preferred interpretation. The association
+is formally accomplished through registration of the media type in the
<loc href="http://www.iana.org/assignments/media-types/index.html">IANA
media type registry</loc>.
For example, "text/html" in the IANA registry is associated with
@@ -292,13 +302,24 @@
the latest published version is [HTML401].</sitem>
</slist>
-<p>The media type indicates the intended processing model for a
-representation, including such issues as whether the data should be
-rendered, stored, or executed. In practice, media types are thus usable
-for selecting handlers to implement those functions.  A media type, therefore,
-is not simply an indication of data format; it also refers to a standardized
-interpretation of that data format.  In fact, many different media types
-share a single data format, while others represent a superset of formats.</p>
+<p>A media type is not simply an indication of data format; it also
+refers to a preferred interpretation of that data format.  This preferred
+interpretation may impact the recipient's functional decisions, such as
+whether the data is rendered, stored, or executed.  In practice,
+media types are often used as the key for selecting an appropriate
+handler to interpret the data received.  It is possible for a single
+data format to be associated with multiple media types and for a single
+media type to describe a superset of many different data formats.</p>
+
+<p>As explained above for representation metadata in general, we refer to
+the media type as describing the sender's preferred, intended, and
+definitive <emph>interpretation</emph> of the data, rather than as defining a
+specific processing model for the recipient.  Each agent will interpret
+received data according to its own function and configuration, perhaps
+informed by the media type, and all that is required for Web interaction
+is that the intention be faithfully communicated.  It is assumed that the
+recipient software will follow those intentions, when appropriate, to the
+extent that it has been instructed to do so by the agent's user.</p>
</div2>
@@ -307,17 +328,19 @@
<p>If the authoritative media type of a representation were to be determined
by inspection of embedded metadata in a self-descriptive format, then a sender
-could not choose different interpretations for a single representation
+could not indicate different interpretations for a single representation
based on the declared media type.  For example, an owner might want to provide
links to separate resources that differ only in how a given HTML representation
-should be rendered. A message containing the header field
-<code>Content-Type: text/html</code> would indicate that standard HTML
-processing is desired, whereas the header field <code>Content-Type: text/plain</code>
-would indicate that the data should be viewed as plain text without
-HTML rendering.  Since the representation data in both messages are
-identical, this functionality is only possible if metadata of the
-containing message is considered more authoritative in describing the data
-than whatever could be learned from inspection of the data itself.</p>
+is intended to be rendered. A message containing the header field
+<code>Content-Type: text/html</code> would indicate that the sender intends
+the recipient to interpret the representation as hypertext, using the rendering
+process defined by the HTML standard, whereas the header field
+<code>Content-Type: text/plain</code> would indicate that the sender intends
+the recipient to treat the data as plain text without HTML rendering.
+Since the representation data is the same in both messages, this
+functionality is only possible if metadata of the containing message is
+considered more authoritative in describing the data than whatever could be
+learned from inspection of the data itself.</p>
<p>Placing authoritative metadata in message fields also enables more
efficient processing of messages.  It is far easier to dispatch behavior
@@ -385,17 +408,17 @@
<head>Overriding authoritative metadata</head>
<p>Recognition of authoritative metadata is important because it
-influences the default processing behavior for Web interactions. However,
+influences the default behavior for Web interactions.  However,
representation metadata is also susceptible to misconfiguration, and
user agents frequently try to &quot;simplify&quot; the Web by automatically
&quot;correcting&quot; perceived &quot;errors&quot; in those configurations.
</p>
-<p>Choosing to ignore or override authoritative metadata is only allowed
-within the Web architecture when the user has given consent.
-Recipients SHOULD detect inconsistencies between representation
-data and metadata but MUST NOT resolve them without the
-<loc href="#consent">consent of the user</loc>.</p>
+<p>Recipients SHOULD detect inconsistencies between representation data
+and metadata but MUST NOT resolve them without the
+<loc href="#consent">consent of the user</loc>.
+Choosing to ignore or override authoritative metadata is only allowed
+within the Web architecture when the user has given consent.</p>
<div2 id="inconsistency">
<head>Inconsistency between representation data and metadata</head>
@@ -404,8 +427,12 @@
from data, there are risks as well. In particular, the resource owner
may create inconsistencies by misconfiguring resources or by failing to
reassign metadata after a change of representation.
-Inconsistency between representation data and metadata is an error.
-Examples of inconsistencies between metadata and representation data
+Inconsistency between representation data and metadata is an error.</p>
+
+<p role="practice">Recipients SHOULD detect inconsistencies between
+representation data and metadata.</p>
+
+<p>Examples of inconsistencies between metadata and representation data
that are frequently observed on the Web include:</p>
<ulist>
@@ -427,36 +454,63 @@
<p>Web software developers, webmasters, and resource owners can help
reduce inconsistency through careful assignment of representation metadata.
-In particular:</p>
+</p>
-<ulist>
-<item><p>Server software designers SHOULD NOT specify default representation
-metadata, such as media type, character encoding, or content language,
-within the standard configuration shipped with the server.</p></item>
-
-<item><p>Server software designers SHOULD provide a means to set representation
-metadata at the same level of granularity and permission that is needed
-to author those representations.</p></item>
-
-<item><p>Server managers SHOULD NOT specify an arbitrary Internet
-media type (e.g., "text/plain" or "application/octet-stream") when the
-representation media type is unknown.</p></item>
-
-<item><p>Server managers SHOULD provide each author with the means and
-permission to set the configuration of metadata for any representations
-under the author's control.</p></item>
-
-<item><p>Resource owners SHOULD test for correct metadata and
-inform server managers of metadata misconfigurations.</p></item>
-
-<item><p>Authoritative metadata SHOULD NOT be provided external to the
-representation if it does not add clarity to that communication.
-For example, the character encoding of XML data formats is self-descriptive
+<p role="practice">Server software designers (implementers) SHOULD provide
+a means to set representation metadata at the same level of granularity and
+permission that is needed to author those representations.</p>
+
+<p>Metadata configuration needs to be authored by the same people who have
+the ability to change the data being described. If all of the authoring is
+done by the webmaster, then it makes sense to have one central location for
+defining the metadata configuration.  In contrast, if the right to author
+representations has been delegated, such as through varying ownership within
+the server's hierarchical URI space, then the ability to author metadata
+configuration should be delegated as well.</p>
+
+<p role="practice">Server managers (webmasters) SHOULD provide each resource
+owner (author) with the means and permission to set the configuration of
+metadata for any representations under the author's control.</p>
+
+<p>For example, the Apache httpd has a configuration directive,
+<loc href="http://httpd.apache.org/docs/2.2/mod/core.html#allowoverride">AllowOverride FileInfo</loc>,
+which delegates the authority to define metadata to the owners of each
+directory.  It follows, therefore, that "AllowOverride FileInfo" should be set
+for any directory containing representations that are authored by people who
+do not have permission to change the central server configuration.</p>
+
+<p role="practice">Resource owners (authors) SHOULD test for correct metadata
+and inform server managers of metadata misconfigurations.</p>
+
+<p>This requires that authors be able to detect errors, which will be
+discussed below.</p>
+
+<p role="practice">Server software designers (implementers) SHOULD NOT specify
+default representation metadata, such as media type, character encoding, or
+content language, within the standard configuration shipped with the server.
+</p>
+
+<p>Instead of specifying a default for metadata, it is better for
+representations to be sent without that metadata.  That allows the recipient
+to guess the metadata instead of being forced to either accept incorrect
+metadata or be tempted to violate Web architecture by ignoring it.</p>
+
+<p role="practice">Server managers (webmasters) SHOULD NOT specify an arbitrary
+Internet media type (e.g., "text/plain" or "application/octet-stream") when the
+media type is unknown.</p>
+
+<p>It is better to send no media type if the resource owner has failed to
+define one for a given representation.</p>
+
+<p role="practice">Authoritative metadata SHOULD NOT be provided external to
+the representation if it does not add clarity to that communication.</p>
+
+<p>For example, the character encoding of XML data formats is self-descriptive
within the data and SHOULD NOT be included in a charset parameter of the
media type unless that distinction is significant to the resource (e.g., for
comparison during content negotiation of multiple XML representations
-that differ only by character encoding).</p></item>
-</ulist>
+that differ only by character encoding).</p>
+
</div2>
<div2 id="silent-recovery">
@@ -468,13 +522,16 @@
from error perpetuates what could be easily fixed if the resource owner
is simply informed of that error during their own testing of the resource.</p>
-<p>Web agents SHOULD have a configuration option that enables
-the display or logging of detected errors. Such a display need not be
-disruptive of the user experience; for example, a graphical browser
-might display a small "bug" button in the user interface to indicate a
-detected error so that an interested user (i.e., the resource owner)
-can select the button, inspect the error, and perhaps modify the
-agent's choice on how to recover from that error.</p>
+<p role="practice">Web agents SHOULD have a configuration option that enables
+the display or logging of detected errors.</p>
+
+<p>Revealing errors when they occur need not be disruptive of the user
+experience. For example, a graphical browser might display a small "bug"
+button in the user interface to indicate a detected error so that an
+interested user (i.e., the resource owner) can select the button, inspect
+the error, and perhaps modify the agent's choice on how to recover from
+that error.  Naturally, the appropriate mechanism will be unique to each type
+of receiving agent and application context.</p>
<p>Some applications of the Web cannot tolerate error.  For example,
medical information systems must be designed so as to detect errors that
@@ -500,9 +557,8 @@
agent violates those expectations, it violates the protections that may
have been put in place for the user's self-protection.</p>
-<p>Because of those risks, it is an error for an agent to ignore or
-override authoritative metadata without the consent of the party
-employing the agent.</p>
+<p role="constraint">An agent MUST NOT ignore or override authoritative
+metadata without the consent of the party employing the agent.</p>
<p>Consent does not imply that the receiving agent must interrupt
the user and require selection of one option or another.
@@ -515,6 +571,16 @@
errors and ways in which interface designers might obtain user
feedback to address them.</p>
+<p>Likewise, consent may be implied by the nature or type of interaction
+being performed by the agent.  For example, a script that "mirrors" content
+from the Web into files on an FTP server is probably going to ignore
+metadata.  Similarly, XInclude <bibref ref="XInclude"/> processing has the
+implied consent of the user to transform data from one source to another
+and thus should only result in errors when the transformation is unsuccessful.
+Note, however, that this functionality imposes a social burden on XInclude
+processors to be sure that the resulting composed document does not violate
+the user's security constraints.</p>
+
</div2>
</div1>
@@ -538,38 +604,19 @@
that gives clients a hint about the likely media type if one were to
retrieve a representation of the identified resource.</p>
-<example>
-<head>Format specifications cannot redefine authoritative metadata</head>
-
-<p>The MyFormat specification specifies a <code>type</code> attribute
-with external references that supposedly takes precedence over any other
-media type received as authoritative metadata. When <code>type</code> is
-present, receiving agents are instructed to use its value and ignore
-any conflicting metadata provided by the sender.</p>
-
-<p>The MyFormat specification designers rationale for this departure from
-Web architecture is that such a definition of the <code>type</code> attribute
-allows content authors to work around misconfigured servers. They contend
-that this is necessary because, in many environments, content authors may
-not have sufficient access to the server configuration to assign the
-correct media type where it belongs.</p>
-
-<p>Should the MyFormat specification designers be allowed to ignore a
-principle of Web architecture and define <code>type</code> in this way
-just to remedy a potential configuration problem?</p>
-
-<p>Answer: Errors involving inconsistent metadata cannot be "fixed"
-by adding metadata to external references --- the metadata is inconsistent
-for all recipients of the message, not just the user agent.
-An agent that silently overrides server-provided metadata can create
-security risks and prevent errors from being detected and corrected.</p>
-</example>
+<p role="constraint">Specifications MUST NOT work against the Web architecture
+by requiring or suggesting that a recipient override authoritative metadata
+without user consent.</p>
<p>A format specification that includes metadata hints for clients
must make clear that, when these hints interact with server metadata,
-they are advisory only. Format specifications MUST NOT include
-requirements for clients to override server metadata without user
-consent.</p>
+they are advisory only. These hints provide metadata by external reference
+and thus will not be known to all of the other (intermediary) recipients
+of the representation. Errors involving inconsistent metadata cannot be
+"fixed" by adding metadata to external references, since the metadata
+is inconsistent for all recipients of the message (not just the user agent).
+An agent that silently overrides server-provided metadata can create
+security risks and prevent errors from being detected and corrected.</p>
<p>An architecturally sound description of an advisory attribute might
read:</p>
@@ -598,11 +645,12 @@
attribute in
<loc href="http://www.w3.org/TR/2005/REC-SMIL2-20050107/extended-media-object.html#adef-media-type">section 7.3.1</loc>
specifies that the value of <code>type</code> takes precedence over
-authoritative metadata for some protocols.  The specification is in error.
+authoritative metadata for some protocols.  That specification is in error.
Under no circumstances can a format specification change the meaning of
protocol interaction on the Web. Implementers MUST disregard that statement
in SMIL 2.0 and treat the type attribute as merely a means for
-content selection or for when authoritative metadata is unavailable.</p>
+content selection or for when authoritative metadata is unavailable.
+The error has been corrected in SMIL 2.1 <bibref ref="SMIL21"/>.</p>
</div1>
@@ -627,23 +675,24 @@
according to the HTML and CSS specifications.
</p>
-<p>Which party has neglected a principle of Web architecture: Stuart
+<p>Which party has neglected a constraint of Web architecture: Stuart
for the server misconfiguration, Tim's browser for silently overriding
the HTTP headers from the server, or Janet's browser for not detecting
that the content looked like HTML?</p>
-<p>Answer: By silently overriding metadata from the representation
-provider in the HTTP headers, Tim's browser did not respect Web
-architecture principles that promote shared understanding and
-security.</p>
-
-<p>Misconfiguration of the server is a fixable error.  If Stuart was
-using Janet's browser, he would see that error immediately and fix it.
-However, if Stuart uses the same browser as Tim for his testing, Stuart
-would not be informed of the error.  Tim's browser is the culprit here because
-it misrepresents the resource owner by ignoring the authoritative metadata
-without Tim's consent. Janet's browser respected the "Content-Type" header
-field and, in doing so, helps Janet detect a server misconfiguration.</p>
+<p>Answer: By silently overriding the authoritative metadata from the
+HTTP headers, Tim's browser did not respect Web architecture constraints
+that promote shared understanding and security.</p>
+
+<p>Misconfiguration of the server is a fixable error.  If Stuart had been
+using Janet's browser to test, he would have seen the error immediately and
+fixed it long before either Tim or Janet made their requests. However, if
+Stuart used the same browser as Tim for his testing, Stuart would not have
+been informed of the error.  The software developers of Tim's browser are
+the culprit here because the product misrepresents the resource owner by
+ignoring the authoritative metadata without Tim's consent. Janet's browser
+respected the "Content-Type" header field and, in doing so, helps Janet
+detect a server misconfiguration.</p>
</div2>
@@ -660,14 +709,13 @@
executes it, promptly sending a rude message to everyone on
Tim's address list (including Tim's mom).</p>
-<p>Which party has neglected a principle of Web architecture: Stuart
+<p>Which party has neglected a constraint of Web architecture: Stuart
for serving content about a vulnerability or Tim's browser for silently
overriding the HTTP headers from the server?</p>
-<p>Answer: By silently overriding metadata from the representation
-provider in the HTTP headers, Tim's browser did not respect Web
-architecture principles that promote shared understanding and
-security.</p>
+<p>Answer: By silently overriding the authoritative metadata from the
+HTTP headers, Tim's browser did not respect Web architecture constraints
+that promote shared understanding and security.</p>
<p>Authoritative metadata is an important aspect of Web architecture.
Agents that ignore authoritative metadata are broken, sometimes
@@ -677,7 +725,7 @@
</div2>
<div2 id="hint-scenario">
-<head>Misconfiguration and metadata hints</head>
+<head>Inconsistent metadata hints</head>
<p>Norm publishes an XHTML document that includes this link:</p>
@@ -687,18 +735,17 @@
<p>Although the link refers to an XSLT style sheet, Norm has set the
<code>type</code> attribute to "text/css". Stuart has configured the
-Web server so that the style sheet is served via HTTP/1.1 as
-"application/xslt+xml". With a user agent that understands XSLT but
-not CSS, Janet requests the content that includes this link. As it
-interprets the representation data, Janet's user agent reads the
-<code>type</code> hint and does not fetch the style sheet."</p>
+Web server so that representations of the resource "cool-style" are
+served via HTTP/1.1 as "application/xslt+xml". With a user agent that
+understands XSLT but not CSS, Janet requests the content that includes
+this link. As it interprets the representation data, Janet's user agent
+reads the <code>type</code> hint and does not fetch the style sheet.</p>
<p>Which party is responsible for the fact that Janet did not receive
content she should have: Stuart for the server configuration, Norm for
-stating that the style sheet is served as "text/css"
-when in fact it's served with a different media type, or Janet's
-user agent for not double-checking the media type with the
-server?</p>
+stating that the style sheet is served as "text/css" when in fact it's
+served with a different media type, or Janet's user agent for not
+double-checking the media type with the server?</p>
<p>Answer: Norm's mislabeling of content deprived
Janet of content she should have received.</p>
@@ -709,23 +756,20 @@
responsibility to manage the risk that it may become inconsistent with
the content available at the link target address.</quote> Janet's
client could have done more than merely read the <code>type</code>
-hint and decide to skip the style sheet. Users benefit from clients
-that allow different configurations for handling hints, including:</p>
+hint and decide to skip the style sheet, but the specific purpose of
+that hint is to reduce unnecessary requests and the associated latency.</p>
-<ulist>
-<item><p>Query the server, and when there is an inconsistency,
-choose the authoritative metadata, or</p></item>
-<item><p>Query the server, and when there is an inconsistency, prompt
-the user for instructions on how to proceed.</p></item>
-</ulist>
+<p>Users often benefit from agents that perform metadata consistency
+checks as part of special "authoring" or "testing" modes.  Such checks
+might query the server and check for inconsistency, thus allowing the
+metadata to be tested by authors without incurring overhead during
+operation by normal users.</p>
</div2>
<div2 id="dav-scenario">
<head>Conflicting metadata during distributed authoring</head>
-<p>[unfinished]</p>
-
<p>The meaning of any HTTP message is defined by the contents of that
message as interpreted according to the HTTP standard.  If a client requests
that a server store a representation at a given URI and the server's
@@ -733,7 +777,7 @@
from what has been provided by the client, then the server should reject
the request using an appropriate HTTP status code.</p>
-<p>In other words, if a webdav client performs a</p>
+<p>For example, if a WebDAV client performs a</p>
<eg><![CDATA[
     PUT /something.html HTTP/1.1
@@ -744,45 +788,63 @@
<p>and example.org knows that it has been configured such that all
resources with identifiers ending in in ".html" are represented
-in the "text/html" format, then the server has four choices:</p>
+in the "text/html" format (i.e., the server has been configured not
+to simply accept whatever the client wants for any given identifier),
+then the server could choose one of four potential choices for handling
+the request:</p>
<olist>
-<item><p>ignore the "application/pdf" metadata provided by the client, store the
-representation as-is, and serve it later as "text/html".</p></item>
+<item><p>ignore the "application/pdf" metadata provided by the client, store
+the representation as-is, and serve it later as "text/html".</p></item>
<item><p>change the configuration such that future 200 responses to
<code>GET /something.html</code> will be served as "application/pdf",
thus preserving the client's stated intent.</p></item>
<item><p>accept the request only in the sense of it being a requested change of
-resource state, resulting in the PDF representation being "converted" to HTML
-for later responses.</p></item>
+resource state, meaning that the PDF representation is automatically converted
+to HTML for use by later responses.</p></item>
<item><p>respond with "415 Unsupported Media Type" and a message stating why the
request is inconsistent with the resource.</p></item>
</olist>
-<p>(1) is clearly a bad idea because the inconsistency is an error
-and failing to report an error is bad design.</p>
-
-<p>(2) may be feasible on some HTTP servers that combine configuration
-for both authoring and read-only services, but most production HTTP
-servers do not work that way, and automatically overriding a server
-configuration is more likely to hide pilot-error rather than do what
-the user actually wants.</p>
-
-<p>(3) is a complicated option that preserves REST semantics but not
-those of a dumb filesystem.  It is one of those server-side magic
-tricks that tends to annoy people who think HTTP is a file protocol,
-which suits me just fine provided that it isn't mandatory.</p>
-
-<p>(4) properly informs the user of the inconsistency (enabling them
-to choose the right workaround), works in all cases, but wastes
-some bandwidth.</p>
-
-<p>Answer: (1) is a bug, (2) is bad implementation,
-(3) is a nifty feature when the user is making an informed
-request, and (4) is the right answer in all other cases.</p>
+<p>Ignoring the "application/pdf" metadata provided by the client (1) is
+clearly a bad idea because the inconsistency is an error and failing to
+report an error is bad design.</p>
+
+<p>Automatically changing the conflicting configuration (2) is appropriate
+if and only if the author has the ability to selectively override the server's
+configuration on a per-representation basis, has configured their Web space
+to do so, and the result of accepting the PUT does change that configuration.
+The primary use-case for this style of override is to continue supporting
+well-known "cool" URIs even though the identifier appears to contain metadata
+that is inconsistent with the current media type.  A better solution, though,
+is to simply redirect the old identifier to a new URI that does not contain
+an apparent file extension.  Unfortunately, the main problem with accepting
+the override is that the inconsistency may have been due to pilot error
+rather than user intention.  A good rule of thumb is to provide this behavior
+as a configuration option that is not the default.</p>
+
+<p>Performing on-the-fly type conversion, as in (3), is a complicated option
+that preserves Web semantics but can lead to unexpected results for authors
+that consider the Web interface to be just another dumb filesystem. This
+should only be done when the resource owner has specifically configured the
+resource (or space of resources) to process state changes in this manner.
+A better solution is to redirect the user to a codependent resource that
+provides "application/pdf" views of the shared state; the user can then
+choose whether or not to apply the state change to that resource, which
+will have a metadata configuration consistent with the representation
+being PUT, and thus preserve both Web and filesystem semantics.</p>
+
+<p>Responding with a "415 Unsupported Media Type" error (4) is, in most
+cases, the right answer unless the server has been specifically configured
+for options (2) or (3).  Although it costs time and bandwidth, responding
+with an informative error message allows the user to inspect both the
+request being made and the server's current configuration, change
+whichever one is incorrect, and thereby establish the correct metadata
+for the resource's representations <emph>before</emph> allowing the PUT
+to succeed.</p>
</div2>
</div1>
@@ -792,7 +854,12 @@
<p>The TAG is working with the authors of <bibref ref="rfc3023"/>
to revise section 7.1 of that RFC, which suggests behavior regarding
-character encoding metadata that is inconsistent with this finding.</p>
+character encoding metadata that is inconsistent with this finding.
+More information on that issue
+(<loc href="http://www.w3.org/2001/tag/ilist.html#RFC3023Charset-21">RFC3023Charset-21</loc>)
+can be found in the TAG finding on
+<loc href="http://www.w3.org/2001/tag/2002/0129-mime">Internet Media Type
+registration, consistency of use</loc> <bibref ref="TAG-21"/>.</p>
</div1>
@@ -805,7 +872,8 @@
Assigned Numbers Authority (IANA)</bibl>
<bibl id="rfc1866" href="http://www.ietf.org/rfc/rfc1866"
-key="RFC1866">T. Berners-Lee, D. Connolly. <titleref>Hypertext Markup Language - 2.0</titleref>, RFC1866, November 1995.</bibl>
+key="RFC1866">T. Berners-Lee, D. Connolly. <titleref>Hypertext Markup
+Language - 2.0</titleref>, RFC1866, November 1995.</bibl>
<bibl id="rfc2046" href="http://www.ietf.org/rfc/rfc2046.txt"
key="RFC2046">N. Freed, N. Borenstein. <titleref>Multipurpose Internet
@@ -827,17 +895,29 @@
<bibl id="rfc3023" href="http://www.ietf.org/rfc/rfc3023.txt"
key="RFC3023">M. Murata, S. St. Laurent, D. Kohn. <titleref>XML Media
-      Types</titleref>, RFC3023, January 2001.</bibl>
+Types</titleref>, RFC3023, January 2001.</bibl>
<bibl id="SMIL20" href="http://www.w3.org/TR/2005/REC-SMIL2-20050107/"
key="SMIL20">J. Ayars et al. <titleref>Synchronized Multimedia Integration
Language (SMIL 2.0), Second Edition</titleref>, W3C Recommendation, 7 January
2005.</bibl>
+<bibl id="SMIL21" href="http://www.w3.org/TR/2005/REC-SMIL2-20051213/"
+key="SMIL21">D. Bulterman et al., eds. <titleref>Synchronized Multimedia Integration
+Language (SMIL 2.1)</titleref>, W3C Recommendation, 13 December 2005.</bibl>
+
<bibl id="SRGS10" href="http://www.w3.org/TR/2004/REC-speech-grammar-20040316/"
-key="SRGS10">A. Hunt, S. McGlashan eds. <titleref>Speech Recognition Grammar
+key="SRGS10">A. Hunt, S. McGlashan, eds. <titleref>Speech Recognition Grammar
Specification Version 1.0</titleref>, W3C Recommendation, 16 March 2004.</bibl>
+<bibl id="TAG-21" href="http://www.w3.org/2001/tag/2002/0129-mime"
+key="TAG-21">T. Bray, ed. <titleref>Internet Media Type registration,
+consistency of use</titleref>, W3C TAG Finding, 4 September 2002.</bibl>
+
+<bibl id="XInclude" href="http://www.w3.org/TR/xinclude/"
+key="XInclude">J. Marsh, D. Orchard, eds. <titleref>XML Inclusions (XInclude)
+Version 1.0</titleref>, W3C Recommendation, 20 December 2004.</bibl>
+
</blist>
</div1>
@@ -849,7 +929,8 @@
included substantial input from Roy T. Fielding, Stuart Williams, and
Dan Connolly.  Martin Dürst, Philipp Hoschka, Rob Lanphier, and Norman Walsh
provided reviews of prior drafts that improved this finding. This second
-edition has additionally benefited from the comments of Noah Mendelsohn.</p>
+edition has additionally benefited from the comments of Noah Mendelsohn,
+Mark Baker, and Julian Reschke.</p>
</div1>
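
Illustrative note (not part of the diff above): the handling of a PUT whose
Content-Type conflicts with the server's configuration, as described in the
revised distributed authoring text, might be sketched roughly as follows.
This is only a sketch under stated assumptions. The names Policy,
ResourceConfig, base_type, and handle_put are hypothetical and do not come
from the finding or from any real server; the 204 success status and the
comparison of media types with parameters stripped are simplifications chosen
for the example. Option (1), silently ignoring the client's metadata, is
deliberately left out because the text calls it a bad idea.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    """Hypothetical per-resource setting chosen by the resource owner."""
    REJECT = "reject"      # option (4): report the inconsistency (default)
    OVERRIDE = "override"  # option (2): let the PUT change the configuration
    CONVERT = "convert"    # option (3): treat the PUT as a state change


@dataclass
class ResourceConfig:
    media_type: str                 # type the server is configured to serve
    policy: Policy = Policy.REJECT  # options (2)/(3) must be opted into


def base_type(value: str) -> str:
    """Strip parameters such as charset and normalize case (a simplification)."""
    return value.split(";", 1)[0].strip().lower()


def handle_put(config: ResourceConfig, content_type: str):
    """Return (status, media type served on later GETs, explanation)."""
    sent = base_type(content_type)
    if sent == base_type(config.media_type):
        return 204, config.media_type, "stored; metadata is consistent"
    if config.policy is Policy.OVERRIDE:
        # Option (2): the configuration itself changes to match the request.
        config.media_type = sent
        return 204, sent, "configuration changed to match the request"
    if config.policy is Policy.CONVERT:
        # Option (3): a real server would convert the body here; this sketch
        # only records that later responses keep the configured type.
        return 204, config.media_type, "representation converted on the server"
    # Option (4): refuse and explain, so the user can fix whichever side is wrong.
    reason = (f"resource is configured as {config.media_type}, "
              f"but the request was labeled {sent}")
    return 415, config.media_type, reason
```

With the default policy, a PUT to /something.html labeled "application/pdf"
yields 415 and leaves the configured "text/html" untouched; the other two
behaviors occur only when the owner has explicitly configured them, which
matches the rule of thumb in the revised text.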

Received on Wednesday, 8 March 2006 14:45:50 UTC