
Re: <HR> in <PRE>

From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
Date: Thu, 31 Jan 2002 15:33:03 +1300 (NZDT)
Message-Id: <200201310233.PAA271991@atlas.otago.ac.nz>
To: html-tidy@w3.org

I asked that <HR> be turned into </PRE><HR><PRE> when it occurs inside <PRE>.

Klaus Johannes Rusch <KlausRusch@atmedia.net> replied:
	no disagreement on the intention to create valid HTML and do as little harm to 
	the document as possible.
	
Right.

	Changing tidy will break pages that rely on the current
	processing, which is treating the content of PRE as text and
	escaping tags (not sure if anyone has used it for that but comes
	handy when writing *ML code samples, just write them between
	<PRE> ... </PRE> and tidy formats them with &lt; and &gt; entity
	references) -- might introduce yet another flag to indicate how
	tidy should process <PRE> sections.
	
But there is NO SUCH ANIMAL as "current processing".
Different Web browsers do it DIFFERENTLY, as I explicitly pointed out.
There is no one consistent cross-browser behaviour here.

I don't know what "escaping tags" means here, but tags inside <PRE>
are *not* in any sense "escaped".  So if you want to write *ML code
samples, no: Tidy does NOT convert < and > to &lt; and &gt; for you,
because some tags are perfectly legal, extremely useful, and very often
used correctly inside <PRE>.

If anyone wants to write *ML code samples, XHTML <![CDATA[...]]>
sections are the way to do that.  In HTML, the only way is to stuff
them through your own entity-fying filter.
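
As an illustration of such a filter, here is a minimal sketch in Python.
The names entityfy and as_pre are made up for this example; html.escape
is the standard library call.  It assumes the whole sample is meant to be
shown literally, so every <, > and & is converted.

```python
import html


def entityfy(sample: str) -> str:
    """Escape &, < and > so that markup is displayed rather than parsed."""
    # quote=False leaves " and ' alone; they need no escaping in element content.
    return html.escape(sample, quote=False)


def as_pre(sample: str) -> str:
    """Wrap an escaped code sample in a PRE element (hypothetical helper)."""
    return "<PRE>" + entityfy(sample) + "</PRE>"


if __name__ == "__main__":
    print(as_pre('<P CLASS="note">fish & chips</P>'))
```

Run by hand over a sample before pasting it into a page, this gives the
&lt; and &gt; references without asking Tidy to guess which tags inside
<PRE> were meant as markup.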

Note that converting <HR> to &lt;HR&gt; inside <PRE> sections would
definitely be wrong.

	Another item for discussion, since other block level elements,
	and IMG, OBJECT, BIG, SMALL, SUB and SUP are not allowed in PRE
	context either, should these get moved outside of the
	PREformatted section as well?

You have missed a very important distinction.  The element types you
have mentioned are *inline*, but <HR> is a *block*-level element type.
<PRE> is allowed to contain any inline content (stuff that would be
legal in a <P>) except specifically
    IMG|OBJECT|APPLET|BIG|SMALL|SUB|SUP|FONT|BASEFONT
It's a little odd that <FONT> should be disallowed when <I> and <B> are
allowed, even odder when you realise that you can use CSS to attach
font modification to things that _are_ allowed.  As for <SUB> and <SUP>,
I have never seen any justification for excluding them and have sometimes
wanted them.  But it's not legal HTML, so I don't do it.
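
For anyone who wants to check their own pages against that list, a rough
sketch in Python follows.  PreChecker and excluded_in_pre are names
invented for this example, and the only thing it knows is the exclusion
list quoted above; it does not look for block-level elements such as <HR>.

```python
from html.parser import HTMLParser

# The inline element types excluded from PRE, as listed above.
PRE_EXCLUSION = frozenset(
    "IMG|OBJECT|APPLET|BIG|SMALL|SUB|SUP|FONT|BASEFONT".split("|")
)


class PreChecker(HTMLParser):
    """Collect excluded inline elements that start inside a PRE element."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # greater than 0 while inside <PRE>
        self.offenders = []   # excluded tag names, in document order

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag == "pre":
            self.depth += 1
        elif self.depth and tag.upper() in PRE_EXCLUSION:
            self.offenders.append(tag.upper())

    def handle_endtag(self, tag):
        if tag == "pre" and self.depth:
            self.depth -= 1


def excluded_in_pre(document: str) -> list:
    """Return the excluded element names found inside PRE sections."""
    checker = PreChecker()
    checker.feed(document)
    checker.close()
    return checker.offenders
```

Note that <B> and <I> pass this check while <FONT> does not, which is
exactly the oddity described above.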

People who put <FONT> inside <PRE> aren't playing by the rules,
but they _are_ playing by a plausible over-simplification of the
rules:  "<PRE> is just like <P> except that line breaks are honoured."

<HR> has never been acceptable as inline content and isn't acceptable
now.  It does *not* "work" in any useful sense across browsers, as
explained in a previous message.

	In theory that's correct, however browser implementations vary in their support
	for vertical spacing of block level elements.
	
"Browser implementations vary."  HTML in one pithy phrase.

Frankly, I think there _is_ a good argument for turning
    <PRE> alpha beta gamma </PRE>
into
    <PRE> alpha </PRE> beta <PRE> gamma </PRE>
whenever alpha and gamma are inline and beta is a block-level element. <HR> is just
a special case of this transformation.  Since *no* block-level elements
are allowed inside <PRE>, the only really likely change is vertical
spacing, which could never be relied on in the first place.  That more
general transformation would make more pages work better in more browsers.
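
To make the proposal concrete, here is a rough sketch of that
transformation in Python.  It is not how Tidy works internally; it is a
regular-expression approximation with loud assumptions: the block-level
list is only an illustrative subset, block elements are assumed not to
nest, and split_pre is a name made up for this example.

```python
import re

# Assumption: an illustrative subset of block-level elements.  HR has no end
# tag; the others are matched up to their first matching end tag (no nesting).
BLOCK = re.compile(
    r"<HR\b[^>]*>"
    r"|<(DIV|P|TABLE|UL|OL|DL|BLOCKQUOTE|H[1-6])\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)
PRE = re.compile(r"(<PRE\b[^>]*>)(.*?)(</PRE\s*>)", re.IGNORECASE | re.DOTALL)


def split_pre(document: str) -> str:
    """Rewrite <PRE> alpha beta gamma </PRE> as
    <PRE> alpha </PRE> beta <PRE> gamma </PRE> for block-level beta."""

    def fix(match):
        open_tag, body, close_tag = match.groups()
        out = []
        pos = 0
        for block in BLOCK.finditer(body):
            inline = body[pos:block.start()]
            if inline:
                # Inline content before the block keeps its own PRE wrapper.
                out.append(open_tag + inline + close_tag)
            out.append(block.group(0))  # the block element moves outside PRE
            pos = block.end()
        tail = body[pos:]
        if tail or not out:
            # Trailing inline content, or a PRE with no block elements at all.
            out.append(open_tag + tail + close_tag)
        return "".join(out)

    return PRE.sub(fix, document)


if __name__ == "__main__":
    print(split_pre("<PRE>alpha\n<HR>gamma\n</PRE>"))
```

The empty <PRE></PRE> pairs that would otherwise appear before a leading
or after a trailing block element are simply not emitted.
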
Received on Wednesday, 30 January 2002 21:33:11 GMT
