Message-Id: <9211292034.AA22237@pixel.convex.com>
To: "Thomas A. Fine" <fine@cis.ohio-state.edu>
Cc: www-talk@nxoc01.cern.ch
Subject: Re: Questions and comments
In-Reply-To: Your message of "Tue, 24 Nov 92 14:57:10 EST." <9211241957.AA15795@soccer.cis.ohio-state.edu>
Date: Sun, 29 Nov 92 14:34:39 CST
From: Dan Connolly <connolly@pixel.convex.com>

>I'm new to this list, so forgive me if I hit things already dealt with.

Actually, your questions are quite timely.

>I'm implementing yet another browser (text mode, written in perl).
>It's actually basically done. I have implemented the following tags:
>
>TITLE, A, NEXTID (currently ignored), ISINDEX (ignored), PLAINTEXT,
>PRE, LISTING, XMP, P, H1-H6, HP1-HP6 (ignored), DL, UL, MENU, and DIR.
>
>also, I have done PRE and OL. But along the way I've seen several other
>things in several different places. For instance the following seem to
>be defined in viola:
>    COMMENT, XMPA, S, ST, VOBJ, XMPA
>I've also seen references to DOCUMENT, KEYWORDS, DOCTYPE, and perhaps others.
>
>Which brings me to my main question:
>Is there a definitive list somewhere of everything that's been proposed?

The current specification is

    http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html

I hope to replace that with a more rigorous specification soon, and
to use the same spec to register text/html with the IANA for MIME
purposes.

>Other stuff:
>I'm not sure what the difference is supposed to be between an OL and
>a UL. Should the browser actually sort the list items for an OL?

An OL was never a sorted list. It's just a numbered list, as opposed
to a bulleted list. It's for stuff where the order of the items in
the list is significant; e.g. step 1: do this. step 2: do that...

>Also, I was under the impression that PRE was like PLAINTEXT, meaning there
>is no ending tag, just end of file. I hope I've misunderstood, if you
>are proposing to replace XMP with PRE. Another problem with this replacement
>is the quoting problem.
>With XMP, you don't need to worry about whether
>or not your arbitrary text contains something which looks like an HTML
>tag. This is an important feature, and one which should be kept IMHO.

Well, you have to throw out SGML conformance if you want the current
PLAINTEXT semantics. Even the XMP semantics are no good. In SGML, the
string "</" is recognized as markup iff it's followed by a name start
character (a letter). The above HTML documentation says "</" is only
markup if it's followed by XMP, i.e. "</XMP>" is the _only_ string that
ends an XMP section. This is not expressible in SGML.

I'm defining HTML in terms of SGML. Period.

I'm punting on PLAINTEXT. The idea is that plaintext data is not part
of the HTML data format. Plaintext is governed by the MIME text/plain
data format. Any HTTP server that returns some HTML followed by
<PLAINTEXT> followed by more data is considered to return two MIME
entities: a text/html entity, terminated by the <PLAINTEXT> tag, and a
text/plain entity.

As for the <PRE> tag, I think I'm going to call it FIXED, and go with
a <p> tag at the end of every line. Details as they develop...

Dan
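
To make the difference between the two end-of-data rules concrete,
here is a minimal Python sketch. It is an illustration only, not code
from any browser or SGML parser. The function names are hypothetical,
matching tags case-insensitively is an assumption, and the SGML rule
is reduced to the one condition stated above ("</" followed by a
letter). The last function sketches the proposed reading of a
response containing <PLAINTEXT> as two entities.

```python
import re

# SGML rule as stated above: "</" counts as markup only when the next
# character is a name start character (a letter).
SGML_END_TAG_OPEN = re.compile(r"</(?=[A-Za-z])")

# Rule in the current HTML documentation: only "</XMP>" ends an XMP
# section. Ignoring case here is an assumption of this sketch.
XMP_END_TAG = re.compile(r"</XMP>", re.IGNORECASE)

# Also an assumption: the PLAINTEXT tag is matched case-insensitively.
PLAINTEXT_TAG = re.compile(r"<PLAINTEXT>", re.IGNORECASE)


def sgml_end_of_data(text):
    """Return the index where the SGML rule first sees markup, or None."""
    match = SGML_END_TAG_OPEN.search(text)
    return match.start() if match else None


def xmp_end_of_data(text):
    """Return the index of the first "</XMP>", or None if there is none."""
    match = XMP_END_TAG.search(text)
    return match.start() if match else None


def split_plaintext(body):
    """Split a response body at the first <PLAINTEXT> tag.

    Returns (html_part, plain_part). The tag itself terminates the
    text/html part and is dropped; plain_part is None when the tag
    does not occur.
    """
    match = PLAINTEXT_TAG.search(body)
    if match is None:
        return body, None
    return body[:match.start()], body[match.end():]


sample = "x </ y </B> z</XMP>"
sgml_data = sample[:sgml_end_of_data(sample)]  # stops before "</B>"
xmp_data = sample[:xmp_end_of_data(sample)]    # stops before "</XMP>"
```

With the sample string, the SGML rule ends the data at "</B>", while
the XMP rule passes "</B>" through as text and stops only at "</XMP>".
That gap is the quoting problem in a nutshell: arbitrary text that
contains something tag-like is safe under the XMP rule but not under
the SGML one.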