Re: making old text publicly available on the web

murray.altheim@nttc.edu (Murray Altheim) writes:
> Thorvaldur Gunnlaugsson (thg@althingi.is) writes:
> >There is lots of text around which could be made accessible
> >on the web but nobody has the time to mark up.
> >Frequently the only structure this text has is tabs and
> >formfeeds. HTML should support formfeeds in <PLAINTEXT>
> >so this little structure there is present in this text
> >does not get lost on the web.
> 
> Unfortunately, PLAINTEXT is deprecated in HTML 2.0 and beyond, so you would
> be forced to put the entire document into a number of PRE elements.

Maybe I'm missing something here--if the documents aren't HTML, why
try to serve them as Content-Type: text/html?  Plain text, it seems
to me, ought to be served as Content-Type: text/plain -- that is what
it is there for, and presumably that is why PLAINTEXT is deprecated.
RFC 1521 is a little wishy-washy on formfeeds and tabs, at least for
character set US-ASCII:

   The complete US-ASCII character set is listed in [US-ASCII].  Note
   that the control characters including DEL (0-31, 127) have no defined
   meaning apart from the combination CRLF (ASCII values 13 and 10)
   indicating a new line.  Two of the characters have de facto meanings
   in wide use: FF (12) often means "start subsequent text on the
   beginning of a new page"; and TAB or HT (9) often (though not always)
   means "move the cursor to the next available column after the current
   position where the column number is a multiple of 8 (counting the
   first column as column 0)."

My reading of this is that a web browser ought to handle FF and TAB in
text/plain in the traditional fashion.  I don't know of any browsers
that will page text/plain on formfeeds, but that is a
quality-of-implementation issue to take up with the browser authors,
not an HTML issue.
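
For what it's worth, the "traditional fashion" is not much code.  Here
is a rough Python sketch of what a text/plain viewer could do with the
two de facto meanings quoted above: split into pages on FF and advance
to the next multiple-of-8 column on TAB.  The names expand_tabs and
paginate are made up for illustration and do not come from any browser
or from the RFC; the handling of a trailing newline is my own
assumption.

```python
FF = "\x0c"      # form feed, ASCII 12
TAB_WIDTH = 8    # de facto tab stop spacing described in the RFC text


def expand_tabs(line, width=TAB_WIDTH):
    """Replace each TAB with spaces up to the next multiple of `width`.

    Columns are counted from 0, as in the quoted RFC text.
    """
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            pad = width - (col % width)   # always 1..width spaces
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)


def paginate(text):
    """Split text into pages on FF; each page is a list of expanded lines."""
    pages = []
    for page in text.replace("\r\n", "\n").split(FF):
        lines = page.split("\n")
        # Assumption: a newline at the end of a page terminates the last
        # line rather than starting an extra empty one.
        if lines and lines[-1] == "":
            lines.pop()
        pages.append([expand_tabs(line) for line in lines])
    return pages
```

Python's built-in str.expandtabs(8) gives the same result as
expand_tabs for a single line; the loop is spelled out only to show the
column rule.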

--
Dan Riley                          Internet:  dsr@lns598.lns.cornell.edu
Wilson Lab, Cornell University     HEPNET/SPAN: lns598::dsr (44630::dsr)
	      "Distance means nothing/To me." -Kate Bush

Received on Friday, 7 July 1995 11:52:57 UTC