Re: viewable vs downloadable attachment links from Jukka Korpela on 1999-01-12 (www-html@w3.org from January 1999)

From: Jukka Korpela <jkorpela@cc.hut.fi>
Date: Tue, 12 Jan 1999 09:42:47 +0200 (EET)
To: w3c html <www-html@w3.org>
Message-ID: <Pine.OSF.3.96.990112084442.28138B-100000@beta.hut.fi>
On Mon, 11 Jan 1999, Inanis Brooke wrote:

> I'd like to reiterate that even though this isn't an HTML issue,

Well, if it isn't an HTML issue, it does not belong on this list.
And if it is not about _future development_ of the HTML language
and its specifications, it doesn't belong here either.

So why am I writing this? Because I see one connection to the topic
of the list.

The connection is that in HTML 4.0, one can use the TYPE attribute
when writing a link (<A HREF=...). I haven't heard of any browser
support for it yet. But for the topic of this list, it is noteworthy
that there seems to be no specification or even a clue for what
a user agent _should_ do with it. The HTML 4.0 specification says:

  type = content-type [CI] 
   When present, this attribute specifies the content type of a piece of
   content, for example, the result of dereferencing a URI. Content types
   are defined in [MIMETYPES]. 
 ( http://www.w3.org/TR/REC-html40/struct/links.html#h-12.2 )
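To make the attribute concrete, here is a minimal Python sketch of how a
program might pick up the TYPE attribute from A and LINK elements. The
names TypedLinkCollector and collect_typed_links are made up for this
illustration; the specification defines nothing of the kind. The only
behaviour taken from the quoted text is that the value is case-insensitive.

```python
from html.parser import HTMLParser


class TypedLinkCollector(HTMLParser):
    """Collect (href, declared_type) pairs from A and LINK elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names already lowercased.
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return  # e.g. <A NAME=...>, which is an anchor, not a link
        declared = attr_map.get("type")
        if declared is not None:
            # Content types are case-insensitive ([CI]), so normalise them.
            declared = declared.strip().lower()
        self.links.append((href, declared))


def collect_typed_links(html_text):
    """Return a list of (href, declared_type_or_None) for the given markup."""
    parser = TypedLinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    sample = '<A HREF="report.doc" TYPE="application/msword">Report</A>'
    print(collect_typed_links(sample))
```

A link without a TYPE attribute yields None as the declared type, which
leaves it to later processing to decide what, if anything, to assume.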

One of the problems is that this might conflict with the content type
announced in HTTP headers. And either of them might conflict with
the actual content of the data (i.e. the data might be in a format
which is not legal according to the specification of the content type).

What I'd like to see the specs say clearly is what user agents are
required or allowed or recommended to do with the TYPE attribute
for A (and LINK) elements. (Analogous considerations apply to
the CHARSET attribute too.)

For example, it could be something like the following, for
interactive user agents, or "browsers":
1. When a link is to be followed, a user agent _should_ check that the
value of the TYPE attribute (if present) and the media type announced in
the Content-Type header match. It _should_ report any mismatch.
However, a user agent _may_ provide a user option for disabling
such checking and reporting. A user agent _must_ regard the Content-Type
header as specifying the media type, unless explicitly requested by
the user to do otherwise.
2. In the absence of a Content-Type header, a user agent _may_
report an error and it _may_ handle the situation as a mismatch
(as in 1.). Unless explicitly requested to do otherwise, a user
agent _must_ in such a case behave as if it had received a Content-Type
header with the value specified in a TYPE attribute. (Note: This would
be rather exceptional - a requirement on error handling.)
3. However, a user agent _may_ check that the actual data conforms
to the specification of the media type (as determined by the Content-Type
header, the TYPE attribute, or user guess) and it _may_ treat
errors as mismatches (as in 1.).
4. When a link is followed, a user agent _must_ determine the media type
as specified above (giving priority to HTTP headers over TYPE attributes).
If the media type cannot be determined (because both the Content-Type
header and the TYPE attribute are missing), the user agent _must_ treat the
resource as if it had been announced as application/octet-stream. A user
agent _must not_ tacitly guess the type on the basis of the URL or the
structure of the actual data, but it _may_ make such guesses, or
alternative guesses, and suggest them to the user.

This would make it somewhat meaningful to use TYPE attributes:
They would provide some checking possibilities (remotely analogous
to type checking in programming languages) without breaking into the
area which needs to be handled at the HTTP level.

Now, as regards the original question, should it be possible
to specify that a resource is to be handled as "downloadable only"?
For example, should it be possible to announce data as, say,
application/msword, yet request that it be handled as
application/octet-stream? Specifically, should the latter be
regarded as compatible with any media type in the checks outlined
above? I'd say no. The media type should be announced honestly,
and in the same way in the TYPE attribute (if given) as in the
HTTP header. It shall remain a user-side decision how the data
is processed. An author may suggest that a document be just downloaded
and saved onto disk, but this is probably done in prose on the page
that provides a link to it. Someone might say that an extra attribute
could be introduced for the purpose of suggesting a method of processing;
perhaps a Boolean attribute, or would there be other meaningful
suggestions than "save onto disk"?

> it is an issue a webmaster has to be familiar with...

Naturally webmasters must know such things. Normal Web authors
need some knowledge about them too.

> putting the .doc file into a
> .zip file, and linking to that .zip file, will have the browser ALWAYS
> download the file (actually, pull up a dialogue - -

Nonsense. A browser processes data the way it has been programmed
and configured to do. The specifications impose some requirements
and restrictions on the processing, but not in this area. And you can
configure your browser to do what you like in this respect. For instance,
a browser could be configured to launch an unzipping program, perhaps a
fully automatic one which after unzipping launches some application(s) to
process the results. This has nothing to do with HTML, but I just had to
correct the misinformation.
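
To illustrate that the handling is a matter of configuration, here is a
small Python sketch of a user-editable table mapping media types to
actions. Everything in it is hypothetical: the action names ("display",
"run-unzip", "ask"), the "major/*" wildcard convention and the function
choose_action are invented for the example and do not describe how any
particular browser stores its settings.

```python
DEFAULT_ACTION = "ask"  # fall back to asking the user what to do


def choose_action(media_type, user_prefs):
    """Look up what the user has configured for a given media type."""
    # Compare on the bare type/subtype, case-insensitively.
    key = media_type.split(";", 1)[0].strip().lower()
    if key in user_prefs:
        return user_prefs[key]
    # Otherwise try a whole-family entry such as "text/*".
    major = key.split("/", 1)[0]
    return user_prefs.get(major + "/*", DEFAULT_ACTION)


if __name__ == "__main__":
    prefs = {
        "application/zip": "run-unzip",  # one user launches an unzipper
        "text/*": "display",
    }
    print(choose_action("application/zip", prefs))     # prints: run-unzip
    print(choose_action("application/msword", prefs))  # prints: ask
```

Another user could just as well map application/zip to a "save" action.
Nothing in the document or in the link decides that.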

-- 
Yucca, http://www.hut.fi/u/jkorpela/ or http://yucca.hut.fi/yucca.html
Received on Tuesday, 12 January 1999 02:43:12 UTC