Message-Id: <9211241603.AA07624@pixel.convex.com>
To: timbl@nxoc01.cern.ch
Cc: Rik Harris <rik@daneel.rdt.monash.edu.au>, www-talk@nxoc01.cern.ch
Subject: Re: The <PRE> tag
In-Reply-To: Your message of "Tue, 24 Nov 92 12:10:56 +0100." <9211241110.AA03386@www3.cern.ch>
Date: Tue, 24 Nov 92 10:03:21 CST
From: Dan Connolly <connolly@pixel.convex.com>

>> Date: Tue, 24 Nov 92 21:54:37 -1000
>> From: Rik Harris <rik@daneel.rdt.monash.edu.au>
>
>> I think the <PRE> tag is a great idea, too. The problem with not
>> having newlines significant is that it makes it difficult to do
>> indenting, etc. One of the reasons the <PRE> tag is nice is that you
>> can take text (eg, manual entries) and not worry about formatting:
>
>I was suggesting that you should format the above like
>
> OPTIONS<p>
><p>
> -b this option performs the blah command. And if this line is<p>
> reasonably long, I can demonstrate what I'm talking about.<p>
><p>
>
> -f this option performs the foo command. Another annoying prob-<p>
> lem is hyphenation.<p>
>
>That is, you explicitly put in the line end, but all white space is significant
>on the line. It means that lines like
>
> See also csh, cc, blah, fred and junk.
>
>which would have to be a SINGLE LINE
>See also <a name=csh href=csh.html>csh</a>, <a name=cc href=cc.html>cc</a>, <a
>name=blah2 href=http://sdf.adf.uasdf.edu/fred/doc/junk/blah.html>blah</a>, <a
>name=fred href=fred.html>fred</a> and <a name=junk href=junk.html>csh</a>.
>
>instead could out as for example
>
>See also
>
> <a name=csh href=csh.html>csh</a>,
>
> <a name=cc href=cc.html>cc</a>,
>
> <a name=blah2 href=http://sdf.adf.uasdf.edu/fred/doc/junk/blah.html>blah</a>,
> <a name=fred href=fred.html>fred</a> and <a name=junk href=junk.html>csh</a>.
><p>
>
>which is mailable. If you look at the NJIT manual pages HTML, there is a
>mixture of significant line feeds and explicit <p> elements for blank lines:
>
> OPTIONS
><p>
> -b this option performs the blah command. And if this line is
> reasonably long, I can demonstrate what I'm talking about.
><p>
>
> -f this option performs the foo command. Another annoying prob-
> lem is hyphenation.
><p>
>
>I propose we settle for one or the other. I wonder whether there is
>anything in SGML to suggest which.

In fact, there is. Well, not actually in SGML, but in the "application
conventions" that I have used to map SGML onto WWW.

All elements in HTML have either mixed content, RCDATA, or CDATA.
Mixed content is a mixture of <tags>, &entity; references, and #PCDATA.
RCDATA is just &entities; and data. CDATA is just data. [SGML actually
has a couple other content modes: ANY and element content, but I didn't
use those.]

CDATA is only used for the TITLE. RCDATA is used for XMP and LISTING
(entity references _are_ recognized in RCDATA sections, so you can
include the _full_ end tag like this: &lt;/XMP>. But the string </
followed by a letter _ends_ the section, whether the letter starts the
XMP tag or not.)

The convention is that in PCDATA sections, newlines serve only to
delimit words, whereas in RCDATA, newlines are significant.

We can't use RCDATA for the PRE or FIXED tag, cuz the <a> tag won't be
recognized in RCDATA. So I'd suggest you ignore newlines inside the PRE
element, and use <p> to delimit lines. And since we're not using the
exact semantics of PRE, I like the idea of using the name FIXED instead.
In SGML:

<!ELEMENT FIXED - - (#PCDATA|A|P)*>

The fact that the MidasWWW browser can support the semantics of PRE is
due to its non-standard parsing, where it treats illegal tags as data,
rather than ignoring them. SGML says they're not data, whatever they
are, and the HTML doc in the web says to ignore them.

I'm integrating my low-level SGML reading routines into MidasWWW now,
and with the author's consent, the non-standard behaviour will soon go
away. [The MidasWWW 1.0 browser doesn't do &lt; or &amp; either -- that
too will change.]
I've got it running, but there are a couple integration bugs I haven't
yet tracked down.

I've also got something of a validation suite for HTML, so that
implementors can easily see if they've gotten it right. And the suite
goes from easy to hard, so they can see how much of it they got right,
and if they don't want to fix it, they can at least document how much
it's broken.

Dan
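
[Illustrative sketch, not part of the original message.] The two newline
conventions described above can be sketched in Python. This is only a
rough illustration under stated assumptions, not the MidasWWW code or the
SGML reading routines mentioned in the message: the function names
fixed_lines and rcdata_lines are hypothetical, a newline directly after
<p> is assumed to be dropped while any other newline becomes a single
space, tags such as <a> are left unparsed, and html.unescape stands in
for real SGML entity handling.

```python
import html
import re


def fixed_lines(content):
    """Split the content of a FIXED element into display lines.

    Assumption: a newline directly after <p> is dropped, and any other
    newline only delimits words, so it is replaced by a single space.
    Only <p> ends a line. Other tags are left in the text unparsed.
    """
    text = content.replace("\r\n", "\n")
    # Drop the newline that immediately follows a <p>.
    text = re.sub(r"(<p>)\n", r"\1", text, flags=re.IGNORECASE)
    # Remaining newlines are word delimiters only.
    text = text.replace("\n", " ")
    lines = re.split(r"<p>", text, flags=re.IGNORECASE)
    # A trailing <p> ends the last line rather than starting an empty one.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def rcdata_lines(source):
    """Read an RCDATA section (as used for XMP and LISTING).

    The section ends at the first "</" followed by a letter. Entity
    references are recognized and newlines are significant. Returns the
    list of lines and the unread remainder of the input.
    """
    end = re.search(r"</[A-Za-z]", source)
    cut = end.start() if end else len(source)
    # Entities are decoded only after the end of the section is found,
    # so "&lt;/XMP>" is data and does not terminate the section.
    body = html.unescape(source[:cut])
    return body.split("\n"), source[cut:]


if __name__ == "__main__":
    for line in fixed_lines("OPTIONS<p>\n<p>\n -b blah<p>\n"):
        print(repr(line))
    print(rcdata_lines("a &lt;/XMP> b\nc</XMP>rest"))
```

The terminator search runs on the raw text before entities are decoded,
which is what lets an escaped end tag appear inside the section as data.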