RE: Type Attribute from Ernest Cline on 2003-11-24 (www-html@w3.org from November 2003)

From: Ernest Cline <ernestcline@mindspring.com>
Date: Sun, 23 Nov 2003 22:06:07 -0500
To: "Oskar Welzl" <oskar.welzl@pan.at>
Cc: "W3C HTML List" <www-html@w3.org>, "Lachlan Hunt" <lhunt07@postoffice.csu.edu.au>
Message-ID: <410-2200311124367187@mindspring.com>

Once again, I have a long message: a brief summary here,
followed by a more detailed explanation of my reasoning for those
who are interested.

1) Metainfo attributes such as type should be used to decide
   whether to retrieve a resource.
2) They should not be used to determine what the user agent asks
   a server for.
3) A returned resource should not be rejected because HTTP
   (or any other protocol that does so) indicates that its metainfo
   does not match what attributes say it should be.
4) When there is a discrepancy between metainfo provided
   by attributes and that provided by a protocol, that provided
   by the protocol should be used preferentially, but if the user
   agent can determine that it is incorrect, it should try the value
   given by the attribute.

> [Original Message]
> From: Oskar Welzl <oskar.welzl@pan.at>
>
> Lachlan,
>
> > >i would like to add a fourth question:
> > >4. should the UA identify a remote resource using only the URI 
> > >given in @src (@data, @whatsoever), or should it be computed from 
> > >this URI plus @type. (in other words: if you strip @type from an 
> > >existing tag, will the UA in any case fetch the same file as before?)
> > 
> >    I'm not quite sure what you mean here, but the resource should be 
> > identified by the URI, and @type simply advises the UA what type(s)
> > the resource is available in, and thus allows the UA to decide:
> > 1. Whether or not to request the resource; and
> > 2. Which types, from the available list, are acceptable, and thus
> > 3. Which type to request.
> >   If @type is omitted, then of course the UA can still request the 
> > resource as they currently do.
>
> your point 3 suggests that it is *not* only the URI that identifies
> the resource.
> example:
> you suggest that <... src="valid-xhtml2" type="image/png"> should tell
> the UA to request "valid-xhtml2.png" instead of "valid-xhtml2.gif" or
> "valid-xhtml2.jpg" (both of which are available). without
> type="image/png", it might happen that because of the server
> configuration, "valid-xhtml2.gif" will be served instead.  therefore,
> following the "negotiating @type"-suggestion, the resource is *not*
> identified by the URI and it can *not* be taken for granted that a UA
> will always retrieve the same file, regardless of a @type attribute.
> following the "meta-information @type"-suggestion, both
> <... src="valid-xhtml2" type="image/png"> and <..src="valid-xhtml2">
> will *always* retrieve the same file, as it is specified by the URI
> and only the URI.

But the lack of a guarantee about what is retrieved, which is what
you are worrying about, already occurs with content negotiation even
without metainfo attributes.  Your example, <..src="valid-xhtml2">,
could return the data in any number of formats, and which one the
server returns depends upon the UA's capabilities.  I'll grant that
having type or hreflang influence what the UA asks for, when the
protocol is a negotiating one such as HTTP, adds to this "difficulty",
but it does not introduce it.

There are five questions that need to be addressed about
metainfo attributes such as type:

1) Should a metainfo attribute such as type be used by a UA
to decide whether to retrieve the resource?

Absolutely yes. If metainfo attributes don't affect resource loading,
there is no point in having such metainfo as part of XHTML2.
Rather it should be left to other areas such as RDF to detail
metainfo in a more complete way rather than having XHTML2
address the small amount of metainfo that it does address.
A user agent should follow the principle of assuming until proven
wrong that any metainfo provided is correct.  If, based on metainfo
attributes, the user agent knows either that it cannot handle the
resource or that the user has chosen not to receive that sort of
resource, it should not retrieve it.
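
As a rough illustration of this answer, here is a minimal sketch in
Python.  The name should_retrieve and the two sets (supported,
refused) are hypothetical names of my own, not part of XHTML2 or of
any existing user agent.

```python
from typing import Optional, Set


def should_retrieve(type_attr: Optional[str],
                    supported: Set[str],
                    refused: Set[str]) -> bool:
    """Decide whether to fetch a resource from its type attribute alone.

    supported: media types this (hypothetical) user agent can handle.
    refused:   media types the user has chosen not to receive.
    """
    if type_attr is None:
        # No metainfo given: request the resource as usual.
        return True
    media_type = type_attr.strip().lower()
    if media_type in refused:
        # The user opted out of this sort of resource.
        return False
    # Assume the attribute is correct until proven wrong.
    return media_type in supported
```

Note that the attribute only gates the decision to fetch; it plays no
part in forming the request itself.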

2) When a negotiating protocol such as HTTP is used, should
a metainfo attribute such as type be used to restrict the list
of acceptable choices the UA presents to the server?

Despite what I said above, I share Oskar's concerns that
metainfo could become a part of the "URL" that is not part
of the URL.  However, this concern is largely theoretical.
Also, the answer to the next question interacts with this one,
so I shall defer answering this question for the moment.

3) When a protocol that also returns resource metainfo, such as
HTTP, is used, should a metainfo attribute such as type be used
to reject a returned resource if the server indicates that its metainfo
does not match the attribute?

This is a separate question from the second, as it is possible
that despite what the UA asks for, the server could choose
to return a best approximation in an attempt to fulfill the request.
If the answer to this question is "Yes", then in the interests of
efficiency, the answer to the second question should also
be "Yes" as there is no point in requesting a resource that
will not be used. However, if the answer to this question is
"No", then we are left choosing between a small
potential savings in bandwidth as fewer choices are
presented to the server by the user agent and preventing
concerns that information outside of the URL could
become needed to get a specific resource. Since the
bandwidth savings are small (and indeed could be a
bandwidth cost if a version of the resource that has a
lower bandwidth cost is rejected as a result), I feel
that the URL concern should take priority, and that
as a result, the answer to question 2 should be "No"
if the answer to this question is "No."

Thus, although they are separate questions, the answers
to questions two and three are linked: they should either
be both "Yes" or both "No".

In a negotiating protocol, the metainfo attributes, while helpful,
are not necessary.  The desired metainfo can be learned
via the negotiating protocol.  THEREFORE, it is my
contention that the use of metainfo attributes
with a negotiating protocol should be designed to
duplicate what happens with a non-negotiating protocol.

With this principle in mind, we can answer question 3.
Rejection of a resource by the user agent because of
a discrepancy in the metainfo ascribed to the resource
by the attribute and by the protocol cannot occur with
a non-negotiating protocol.  Hence, in the interest of
consistent results, no such rejection should
occur and both question 2 and question 3 should be
answered "No."
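
To make the "No" and "No" answers concrete, here is a small sketch in
Python.  The names request_headers and keep_response are hypothetical,
and the Accept value is simplified (no q-values); it is only meant to
show that the type attribute takes no part in either step.

```python
from typing import Dict, Optional, Set


def request_headers(supported: Set[str],
                    type_attr: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for a negotiating protocol such as HTTP.

    type_attr is accepted but deliberately unused (question 2: "No"),
    so the same URL yields the same request with or without it.
    """
    return {"Accept": ", ".join(sorted(supported))}


def keep_response(type_attr: Optional[str], content_type: str) -> bool:
    """Question 3: "No".  A mismatch between the attribute and the
    protocol's Content-Type is never grounds for rejection."""
    return True
```

With this shape, stripping the type attribute from an element cannot
change what is asked of the server.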

4) When a negotiating protocol such as HTTP is used,
if there is a discrepancy between the metainfo provided
by the attribute and the metainfo provided by the protocol,
which should be ascribed to the resource by the UA?

The principle followed by HTML 4 is that metainfo provided
by a protocol is always superior to that provided by an
attribute.  Changing this preference should only be done
with good reason.  The only reason I can see to do so
would be to obtain consistency of results regardless of
the protocol used.

There are three possible scenarios that could occur when
attribute metainfo and the protocol metainfo don't match.
They are: the attribute is correct, the protocol is correct,
or both are wrong.  If both are wrong, it doesn't really
matter which one was chosen. If either is correct then
even the most basic of error recovery procedures,
try the rejected alternative, will produce the same result
regardless of which source is given preference.  Only
in the case where no error recovery is attempted will
the choice of which to use produce any effect.  If the
attribute is followed, then the result will be consistent
regardless of protocol.  If the protocol is followed,
then negotiating protocols will produce a different result
whenever the attribute and the protocol disagree.

This could be interpreted in either of two ways.
A) In order to maintain consistency with HTML 4,
XHTML2 while regarding metainfo provided by
a protocol superior to that provided by an attribute,
must attempt processing the resource using
attribute metainfo when protocol metainfo has
been demonstrated to be incorrect.
B) In order to maintain consistency between
protocols, XHTML2 shall recognize attribute
metainfo as being superior.

We have here a choice between mandating
a minimal level of error recovery or mandating
XHTML2 use a method directly opposite to that
of HTML 4.  In this case, I think that the lesser of two
evils is to require a minimal level of error recovery.
For example, in <... type="image/gif" src="image">
if the protocol says the resource is a PNG and it is not,
the user agent must attempt to see if it is a GIF.
If it is also not a GIF, then the answer to this question
says nothing about any other methods of error
recovery, either pro or con.
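
The minimal error recovery described here could be sketched as
follows in Python.  The decoders are hypothetical stand-ins that only
check the well-known PNG and GIF file signatures; a real user agent
would attempt an actual decode.  The names resolve_type and DECODERS
are my own.

```python
from typing import Callable, Dict, Optional


def looks_like_png(data: bytes) -> bool:
    # Stand-in for a real decoder: checks the 8-byte PNG signature.
    return data.startswith(b"\x89PNG\r\n\x1a\n")


def looks_like_gif(data: bytes) -> bool:
    # Stand-in for a real decoder: checks the GIF87a/GIF89a header.
    return data.startswith((b"GIF87a", b"GIF89a"))


DECODERS: Dict[str, Callable[[bytes], bool]] = {
    "image/png": looks_like_png,
    "image/gif": looks_like_gif,
}


def resolve_type(data: bytes,
                 protocol_type: Optional[str],
                 attr_type: Optional[str]) -> Optional[str]:
    """Try the protocol's type first, then the attribute's type.

    Returns the first type whose decoder accepts the data, or None if
    both fail.  What happens after None is outside this rule.
    """
    for candidate in (protocol_type, attr_type):
        check = DECODERS.get(candidate) if candidate else None
        if check is not None and check(data):
            return candidate
    return None
```

For the example above, resolve_type(gif_bytes, "image/png",
"image/gif") falls through the failed PNG check and settles on
"image/gif".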

5) When the UA is able to determine that a returned
resource does not match the metainfo ascribed to it,
MAY the UA attempt to determine the correct metainfo
from the contents of the resource?

This question gets into the realm of error recovery more
than of metainfo.  It is my opinion that if XHTML says
anything about this, it should be something along these
lines:  "If the metainfo provided by both the protocol and
the attributes prove to be incorrect, a user agent MAY 
offer the user the option of attempting to correct this error
by inspecting the resource and determining a reasonable
value for the metainfo.  However, a user agent MUST offer
the user the option to not attempt correcting this error, as
it is impossible for a user agent to be certain that a resource
that does conform to metainfo that it knows of was intended
to be associated with that metainfo."

(Or in other words, a user agent may attempt error correction,
but it must offer the user the choice to not attempt it.)
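
A sketch of that last rule in Python, under my own assumptions: the
names sniff_type and recover_by_inspection are hypothetical, the
sniffer knows only three image signatures, and user_consents stands
for whatever means the user agent has of asking the user.

```python
from typing import Callable, Optional


def sniff_type(data: bytes) -> Optional[str]:
    """Guess a media type from leading bytes (three signatures only)."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    return None


def recover_by_inspection(data: bytes,
                          user_consents: Callable[[], bool]) -> Optional[str]:
    """Called only after both protocol and attribute metainfo failed.

    The user agent MAY inspect the resource, but the user must be
    able to decline, in which case no guess is made.
    """
    if not user_consents():
        return None
    return sniff_type(data)
```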
Received on Monday, 24 November 2003 04:29:27 UTC