
Re: making old text publicly available on the web

From: Dan Riley <dsr@lns598.lns.cornell.edu>
Date: Fri, 07 Jul 95 11:52:56 -0400
Message-Id: <199507071552.LAA24846@lns598.lns.cornell.edu>
To: www-html@www10.w3.org

murray.altheim@nttc.edu (Murray Altheim) writes:
> Thorvaldur Gunnlaugsson (thg@althingi.is) writes:
> >There is lots of text around which could be made accessible
> >on the web but nobody has the time to mark up.
> >Frequently the only structure this text has is tabs and
> >formfeeds. HTML should support formfeeds in <PLAINTEXT>
> >so what little structure is present in this text
> >does not get lost on the web.
> Unfortunately, PLAINTEXT is deprecated in HTML 2.0 and beyond, so you would
> be forced to put the entire document into a number of PRE elements.

Maybe I'm missing something here--if the documents aren't HTML, why
try to serve them as Content-Type: text/html?  Plain text, it seems
to me, ought to be served as Content-Type: text/plain -- that is what
it is there for, and presumably that is why PLAINTEXT is deprecated.
RFC 1521 is a little wishy-washy on formfeeds and tabs, at least for
character set US-ASCII:

   The complete US-ASCII character set is listed in [US-ASCII].  Note
   that the control characters including DEL (0-31, 127) have no defined
   meaning apart from the combination CRLF (ASCII values 13 and 10)
   indicating a new line.  Two of the characters have de facto meanings
   in wide use: FF (12) often means "start subsequent text on the
   beginning of a new page"; and TAB or HT (9) often (though not always)
   means "move the cursor to the next available column after the current
   position where the column number is a multiple of 8 (counting the
   first column as column 0)."

My reading of this is that a web browser ought to handle FF and TAB in
text/plain in the traditional fashion.  I don't know of any browsers
that will page text/plain on formfeeds, but that is a
quality-of-implementation issue that should be taken up with the
browser authors, not an HTML issue.
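
To make the "traditional fashion" concrete, here is a minimal Python
sketch of how a text/plain viewer might apply the two de facto meanings
from the RFC 1521 passage: FF starts a new page, and TAB advances to the
next column that is a multiple of 8, counting from column 0.  The
function names (next_tab_stop, expand_tabs, render_plain) are invented
for illustration and do not come from any browser or library; treating
a bare LF as a line break in addition to CRLF is also an assumption.

```python
FF = "\f"   # ASCII 12, form feed
TAB = "\t"  # ASCII 9, horizontal tab


def next_tab_stop(column, width=8):
    """Return the first column after `column` that is a multiple of `width`."""
    return (column // width + 1) * width


def expand_tabs(line, width=8):
    """Replace each TAB with spaces up to the next tab stop (columns start at 0)."""
    out = []
    column = 0
    for ch in line:
        if ch == TAB:
            stop = next_tab_stop(column, width)
            out.append(" " * (stop - column))
            column = stop
        else:
            out.append(ch)
            column += 1
    return "".join(out)


def render_plain(text, width=8):
    """Split text into pages on FF, then into tab-expanded lines.

    CRLF is normalized to LF first.  str.split is used rather than
    str.splitlines, because splitlines would also break on FF and
    other control characters.
    """
    pages = []
    for page in text.split(FF):
        lines = page.replace("\r\n", "\n").split("\n")
        pages.append([expand_tabs(line, width) for line in lines])
    return pages


if __name__ == "__main__":
    sample = "name\tvalue\r\nx\t1\fsecond page"
    for number, page in enumerate(render_plain(sample), start=1):
        print("--- page %d ---" % number)
        for line in page:
            print(line)
```

A real viewer would still have to decide what a "page" means on a
scrolling display, which is exactly the sort of choice that belongs to
the browser rather than to HTML.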

Dan Riley                          Internet:  dsr@lns598.lns.cornell.edu
Wilson Lab, Cornell University     HEPNET/SPAN: lns598::dsr (44630::dsr)
	      "Distance means nothing/To me." -Kate Bush
Received on Friday, 7 July 1995 11:52:57 UTC
