- From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
- Date: Thu, 31 Jan 2002 15:33:03 +1300 (NZDT)
- To: html-tidy@w3.org
I asked that <HR> be turned into </PRE><HR><PRE> when it occurs inside <PRE>.

Klaus Johannes Rusch <KlausRusch@atmedia.net> replied:

> no disagreement on the intention to create valid HTML and do as little harm to the document as possible.

Right.

> Changing tidy will break pages that rely on the current processing, which is treating the content of PRE as text and escaping tags (not sure if anyone has used it for that but comes handy when writing *ML code samples, just write them between <PRE> ... </PRE> and tidy formats them with &lt; and &gt; entity references) -- might introduce yet another flag to indicate how tidy should process <PRE> sections.

But there is NO SUCH ANIMAL as "current processing". Different Web browsers do it DIFFERENTLY, as I explicitly pointed out. There is no one consistent cross-browser behaviour here.

I don't know what "escaping tags" means here, but tags inside <PRE> are *not* in any sense "escaped". If you want to write *ML code samples, no, Tidy does NOT convert < and > to &lt; and &gt;, because some tags are perfectly legal, extremely useful, and very often used correctly inside <PRE>.

If anyone wants to write *ML code samples, XHTML <![CDATA[...]]> sections are the way to do that. In HTML, the only way is to stuff them through your own entity-fying filter. Note that converting <HR> to &lt;HR&gt; inside <PRE> sections would definitely be wrong.

> Another item for discussion, since other block level elements, and IMG, OBJECT, BIG, SMALL, SUB and SUP are not allowed in PRE context either, should these get moved outside of the PREformatted section as well?

You have missed a very important distinction. The element types you have mentioned are *inline*, but <HR> is a *block*-level element type.

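
On the code-sample point above: an "entity-fying filter" for HTML can be very small. The following is a minimal sketch in Python, not anything Tidy provides; the function names `entify` and `as_pre` are hypothetical and chosen only for illustration.

```python
import html


def entify(sample: str) -> str:
    """Escape &, < and > so that markup is displayed literally."""
    # quote=False leaves " and ' alone; they need no escaping in element content.
    return html.escape(sample, quote=False)


def as_pre(sample: str) -> str:
    """Wrap an already-escaped code sample in a PRE element (hypothetical helper)."""
    return "<PRE>\n" + entify(sample) + "\n</PRE>"


if __name__ == "__main__":
    print(as_pre("<P>Fish &amp; chips<HR>"))
```

The filter is applied by the author to the sample text before it goes into the page, so tags that are meant to be real markup inside <PRE> are never touched.
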
<PRE> is allowed to contain any inline content (stuff that would be legal in a <P>) except specifically

    IMG|OBJECT|APPLET|BIG|SMALL|SUB|SUP|FONT|BASEFONT

It's a little odd that <FONT> should be disallowed when <I> and <B> are allowed, even odder when you realise that you can use CSS to attach font modification to things that _are_ allowed. As for <SUB> and <SUP>, I have never seen any justification for excluding them and have sometimes wanted them. But it's not legal HTML, so I don't do it.

People who put <FONT> inside <PRE> aren't playing by the rules, but they _are_ playing by a plausible over-simplification of the rules: "<PRE> is just like <P> except that line breaks are honoured."

<HR> has never been acceptable as inline content and isn't acceptable now. It does *not* "work" in any useful sense across browsers, as explained in a previous message.

> In theory that's correct, however browser implementations vary in their support for vertical spacing of block level elements.

"Browser implementations vary." HTML in one pithy phrase.

Frankly, I think there _is_ a good argument for turning

    <PRE> alpha beta gamma </PRE>

into

    <PRE> alpha </PRE> beta <PRE> gamma </PRE>

whenever alpha is inline and beta is a block-level element. <HR> is just a special case of this transformation. Since *no* block-level elements are allowed inside <PRE>, the only really likely change is vertical spacing, which could never be relied on in the first place. That more general transformation would make more pages work better in more browsers.

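
The <HR> special case of the transformation above can be sketched in a few lines. This is an illustration in Python using regular expressions, not how Tidy is implemented; the name `split_pre_at_hr` is hypothetical. It assumes <PRE> elements are not nested and that no inline element (such as <B>) is left open across the <HR>, which a real implementation would have to close and reopen.

```python
import re

# Non-nested <PRE ...> ... </PRE> regions, matched case-insensitively.
PRE_RE = re.compile(r"(<pre\b[^>]*>)(.*?)(</pre\s*>)", re.IGNORECASE | re.DOTALL)
HR_RE = re.compile(r"<hr\b[^>]*>", re.IGNORECASE)


def split_pre_at_hr(document: str) -> str:
    """Rewrite <PRE> a <HR> b </PRE> as <PRE> a </PRE><HR><PRE> b </PRE>."""

    def fix(match: re.Match) -> str:
        open_tag, body, close_tag = match.groups()
        pieces = []
        pos = 0
        for hr in HR_RE.finditer(body):
            text = body[pos:hr.start()]
            if text.strip():  # skip segments that would be empty PREs
                pieces.append(open_tag + text + close_tag)
            pieces.append(hr.group(0))  # the HR now sits between PRE blocks
            pos = hr.end()
        tail = body[pos:]
        if tail.strip() or not pieces:  # keep a PRE that had no HR at all
            pieces.append(open_tag + tail + close_tag)
        return "".join(pieces)

    return PRE_RE.sub(fix, document)


if __name__ == "__main__":
    print(split_pre_at_hr("<PRE>alpha<HR>gamma</PRE>"))
```

Whitespace-only segments next to the <HR> are dropped rather than wrapped in an empty <PRE>, so the only visible change is vertical spacing. Extending this to every block-level element would need a real parser, since elements such as <TABLE> have content and end tags of their own.
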
Received on Wednesday, 30 January 2002 21:33:11 UTC