Re: Questions and comments

Dan Connolly (connolly@pixel.convex.com)
Sun, 29 Nov 92 15:23:23 CST


Message-Id: <9211292123.AA23851@pixel.convex.com>
To: "Thomas A. Fine" <fine@cis.ohio-state.edu>
Cc: www-talk@nxoc01.cern.ch
Subject: Re: Questions and comments 
In-Reply-To: Your message of "Wed, 25 Nov 92 19:43:50 EST."
             <9211260043.AA26082@soccer.cis.ohio-state.edu> 
Date: Sun, 29 Nov 92 15:23:23 CST
From: Dan Connolly <connolly@pixel.convex.com>


>>been raised specifically to the increased real line length is that
>>long lines aren't safe to mail.  This problem can easily be
>>circumvented by using MIME & metamail (plug plug, although I have
>>nothing to do with it -- I'm just interested in seeing MIME used in
>>WWW).
>
>If that's the case then I'd have to say that it's ridiculous to try
>to fix mail (or any other transport mechanism) by mucking up HTML with
>crud.  Anyway, there are already various programs that will let you
>break lines for use with mail.  Http shouldn't care at all about lines.

Amen, brother. Line length certainly should not be part
of the HTML spec.

As for HTTP, the 80 character line length was, is, and always will
be, merely a suggestion. HTTP implementations that assume 80 character
lines are broken. But I'm sure they're out there.

>Yes, MIME would be good.  But would you make WWW pass around MIME
>documents only, with HTML being one of the Content-Types, or would
>you have http handle several different doc types, including both
>MIME and HTML?

My strategy on integrating MIME is this:

Currently, a WWW or gopher client sends "gimme foo" and expects the
server to send foo back. The client _must know_ the data format of foo
in order to convey it to the user.

A gopher client knows foo's format from the first character of the
MenuItem where it got foo in the first place (it assumes gopher format
when foo is the empty string.)

A WWW client assumes foo's format is HTML, until it sees <PLAINTEXT>,
where it switches to plain text format.

Now if we just use the name text/plain for gopher type 0, application/gopher
for gopher type 1, text/html for the WWW data stream, and text/plain
for the stuff after the <PLAINTEXT> tag, we begin to see how MIME
fits into the picture.

I'd like to see the gopher protocol extended to include the actual MIME
content-type in the Menu, like this:

0#About This Gopher#about.txt#some.internet.host#70#text/plain
1#Departmental Publications#publications#some.internet.host#70#application/gopher

Now with text/plain and application/gopher, the MIME content-type
is redundant, and not so important. But consider:

9#My picture#connolly.gif#some.internet.host#70#image/gif
9#Pronunciation of my name#connolly.snd#some.internet.host#70#audio/basic

So instead of registering zillions of special characters for new
gopher types (i, w, M, etc.) we just add a field to the Menu item,
and use type 9 for everything besides menus and text files.
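
As a rough illustration of the extra field, here is a minimal Python
sketch that parses a menu line in the form shown above. It assumes "#"
stands in for the field separator exactly as in these examples. The
MenuItem and parse_menu_line names, and the application/octet-stream
fallback for unknown types, are hypothetical choices made for this
sketch, not part of any gopher specification.

```python
from typing import NamedTuple

# Defaults for items that lack the extra field, using the names suggested
# above: gopher type 0 is text/plain, gopher type 1 is application/gopher.
DEFAULT_TYPES = {"0": "text/plain", "1": "application/gopher"}

# Assumption: fallback for any other type character with no explicit field.
FALLBACK_TYPE = "application/octet-stream"


class MenuItem(NamedTuple):
    gopher_type: str
    title: str
    selector: str
    host: str
    port: int
    content_type: str


def parse_menu_line(line: str, sep: str = "#") -> MenuItem:
    """Parse one menu line with an optional trailing MIME content-type."""
    fields = line.rstrip("\r\n").split(sep)
    if len(fields) not in (5, 6):
        raise ValueError("expected 5 or 6 fields, got %d" % len(fields))
    gopher_type, title, selector, host, port = fields[:5]
    if len(fields) == 6 and fields[5]:
        content_type = fields[5]
    else:
        # Old-style item: infer the type from the leading type character.
        content_type = DEFAULT_TYPES.get(gopher_type, FALLBACK_TYPE)
    return MenuItem(gopher_type, title, selector, host, int(port),
                    content_type.lower())
```

A client written this way keeps working against old servers, since a
five-field line still yields a usable content type for types 0 and 1.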

The same holds for WWW references. They should include the data
type, with text/html as the default. So I should be able to
reference the above picture and sound in WWW:

<A HREF="gopher://some.internet.host:70/9connolly.gif"
	CONTENT-TYPE="image/gif">

<A HREF="gopher://some.internet.host:70/9connolly.snd"
	CONTENT-TYPE="audio/basic">

This is especially important for protocols that have no implicit data
type, like FTP.  I could reference a DVI file on an FTP server, and have my
WWW client launch xdvi a la metamail:

<A HREF="ftp://export.lcs.mit.edu/contrib/foo.dvi"
	content-type="application/x-dvi">
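
To make the client side concrete, here is a minimal Python sketch that
pulls the HREF and the proposed CONTENT-TYPE attribute out of anchors
like the ones above, falling back to text/html when the attribute is
absent. The AnchorTypes class and anchor_types helper are hypothetical
names. It uses the standard html.parser module, which lowercases
attribute names, so CONTENT-TYPE and content-type are treated alike.

```python
from html.parser import HTMLParser


class AnchorTypes(HTMLParser):
    """Collect (href, content_type) pairs from <A> start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # text/html is the default proposed above for untyped references.
        content_type = attributes.get("content-type") or "text/html"
        self.links.append((href, content_type.lower()))


def anchor_types(document):
    """Return every anchor in the document with its declared data type."""
    parser = AnchorTypes()
    parser.feed(document)
    parser.close()
    return parser.links
```

The resulting type could then be handed to a metamail-style dispatcher
that picks a viewer, such as xdvi for application/x-dvi.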

Now the WWW gang has always talked about format negotiation.
This is where the client gives the server several options
for the data format, and the server chooses between them.
So what's the data type of the returned information?
We need a type that's a "union" of all the data types, right?

Now we see the need for the MIME message/rfc822 content type.
Consider the following scenario:

CLIENT: GET foo HTTP-Version-2 Content-Types:
	1000 text/html
	900 application/postscript
	200 text/plain
	400 application/x-dvi
	100 text/x-latex
	.

SERVER: 0200 Message follows:
From: author@his.host
Message-ID: <lasting-name-for-this-document>
Subject: ... fulfills role of HTML <TITLE> element ...
Mime-Version: 1.0
Content-Type: application/postscript
Content-Transfer-Encoding: binary

%!PS-Adobe-2.0
...
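
The server's side of this exchange can be sketched in Python as
follows. This is only an illustration under two assumptions: that the
numbers in the client's list are preference weights with higher meaning
more preferred, and that the server picks the highest-weighted type it
can actually produce. The parse_preferences and choose_format names are
hypothetical.

```python
def parse_preferences(lines):
    """Read 'weight type' lines up to the terminating '.' line."""
    preferences = {}
    for raw in lines:
        line = raw.strip()
        if line == ".":
            break
        if not line:
            continue
        weight, content_type = line.split(None, 1)
        preferences[content_type.strip().lower()] = int(weight)
    return preferences


def choose_format(preferences, available):
    """Return the available type the client weights highest, or None."""
    candidates = [t for t in available if t in preferences]
    if not candidates:
        return None
    return max(candidates, key=lambda t: preferences[t])
```

Under these assumptions, a server that holds foo only as PostScript and
plain text would answer with application/postscript, as in the scenario
above.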


The next possibility to consider is multimedia documents, e.g.
one document that contains plain text, HTML text, gif images,
sounds, etc.

That's where the MIME multipart/mixed content type comes
in. If the client is prepared to receive multipart messages,
this is gravy.
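
For a feel of what handling such a message involves, here is a minimal
sketch using Python's standard email package. It builds a small
multipart/mixed message and splits it back into typed parts. The
build_document and split_parts names and the sample contents are
hypothetical, and a real client would hand each part to a viewer rather
than just listing it.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List, Tuple


def build_document(plain: str, html: str, gif: bytes) -> bytes:
    """Serialize one multipart/mixed message holding three parts."""
    message = EmailMessage()
    message["Subject"] = "Sample multimedia document"
    message.set_content(plain)                    # text/plain part
    message.add_attachment(html, subtype="html")  # text/html part
    message.add_attachment(gif, maintype="image", subtype="gif")
    return message.as_bytes()


def split_parts(raw: bytes) -> List[Tuple[str, bytes]]:
    """Return (content_type, decoded_payload) for every leaf part."""
    message = message_from_bytes(raw, policy=policy.default)
    parts = []
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        parts.append((part.get_content_type(),
                      part.get_payload(decode=True)))
    return parts
```

Each part carries its own Content-Type header, so the client can apply
the same per-type dispatch to the parts that it applies to whole
documents.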

Get it?

Dan