Re: Ignoring empty paragraphs

On Mon, 10 Apr 2000, Ian Graham wrote:

> On Sun, 9 Apr 2000, Braden N. McDaniel wrote:
> > On Sun, 9 Apr 2000, Jan Roland Eriksson wrote:
> > 
> > > On Tue, 4 Apr 2000 03:09:26 -0400 (EDT), "Braden N. McDaniel"
> > > <braden@endoframe.com> wrote:
> > > 
> > > >On Tue, 4 Apr 2000, Jan Roland Eriksson wrote:
> > > >> On Mon, 3 Apr 2000 19:53:18 -0400 (EDT), "L. David Baron"
> > > >> <dbaron@fas.harvard.edu> wrote:
> > > >> > 1) An empty P element should be ignored at the parsing stage, and
> > > >> >    therefore should not appear in the DOM and should not be affected
> > > >> >    by style sheets.
> > > >> 
> > > >> This is the correct interpretation.
> > > 
> > > [...]
> > > 
> > > >> If there's nothing to mark-up, there's no motivation for markup either.
> > > 
> > > >Indeed, but it is *not* the parser's job to fix errant document structure! 
> > > >It is the parser's job to read the markup that's there. And as long as
> > > >it's valid, the DOM tree should have a direct correspondence to the
> > > >plaintext representation.
> > > 
> > > Fair enough. But...
> > > 
> > > What about "styling" of non existing content?
> > > Leave that no-content element dangling in the DOM tree and we need to
> > > move the decision not to style it to the CSS renderer instead.
> > > 
> > > If not, we will not have a way to discourage the use of successive P's
> > > for vertical spacing, and that is what I think David's question was all
> > > about.
> > 
> > Hm. That's a good point. I think the bottom line here is that the rule in
> > the HTML spec is Stupid: if the spec authors wanted to discourage empty P
> > elements, they should have made them altogether illegal.
> > 
> > But I've come around to agree with you on this. The HTML spec appears to
> > make it the job of the parser to fix bad markup. The wording is, "User
> > agents should ignore empty P elements," not, "User agents should hide
> > empty P elements."
> 
> I think the wording in the HTML spec should not be trusted -- it is simply
> too vague.  The intention, if I remember correctly, was that consecutive
> empty <p>'s would, when rendered, collapse to the same vertical spacing as
> a single <p>, or to nothing at all.

If you remember from *what*? What can we consult, if indeed the wording in
the HTML spec is not to be trusted?

> The problem is that the HTML spec
> doesn't say how this could be done, since it is a markup spec, and not a
> formatting/parsing specification.
> 
> For someone writing DOM code that accesses a document, it is unacceptable
> that the parser/processor can arbitrarily decide to modify the data
> structures by removing data from the document it receives.  For example, I
> (or some auto-generation tool pumping out valid HTML) could produce a
> document containing something like:
> 
> <p id="part1"> </p> 
> <p id="para2"> </p> 
> 
> and then later use script code to appropriately fill the <p>'s. Obviously
> the code will fail if the parser/processor has decided to prune these
> empty but needed elements from the tree.
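
The failure described above can be sketched roughly as follows. This is
only an illustration, not any real user agent's parser: the
ParagraphCollector class, the prune_empty flag, and the fill() helper are
hypothetical names, and treating whitespace-only content as "empty" is an
assumption, since the spec wording does not settle that.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects P elements as {id: text}; optionally drops empty ones."""

    def __init__(self, prune_empty):
        super().__init__()
        self.prune_empty = prune_empty
        self.paragraphs = {}      # id -> text content
        self._current_id = None   # id of the P being read, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current_id = dict(attrs).get("id")
            self._buffer = []

    def handle_data(self, data):
        if self._current_id is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._current_id is not None:
            text = "".join(self._buffer)
            # Assumed "ignore empty P" behaviour: whitespace-only counts as empty.
            if not (self.prune_empty and text.strip() == ""):
                self.paragraphs[self._current_id] = text
            self._current_id = None


def parse(source, prune_empty):
    collector = ParagraphCollector(prune_empty)
    collector.feed(source)
    collector.close()
    return collector.paragraphs


def fill(paragraphs, element_id, text):
    """Stand-in for script code filling a P; fails if the element was pruned."""
    if element_id not in paragraphs:
        raise KeyError(f"no element with id {element_id!r}")
    paragraphs[element_id] = text


SOURCE = '<p id="part1"> </p>\n<p id="para2"> </p>\n'

# A parser that keeps empty P elements lets the later fill succeed.
kept = parse(SOURCE, prune_empty=False)
fill(kept, "part1", "Filled in later.")

# A parser that prunes them leaves nothing for the script to find.
pruned = parse(SOURCE, prune_empty=True)
try:
    fill(pruned, "part1", "Filled in later.")
    fill_failed = False
except KeyError:
    fill_failed = True
```

With pruning enabled, the lookup by id has nothing to find, which is the
breakage the quoted paragraph is worried about.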

The HTML spec doesn't know anything about the DOM, really. I only
interpret it as referring to P elements described in static documents.

> Moreover, with XML this would simply be illegal -- an XML parser can
> _never_ modify the incoming data, as Tantek pointed out.

Irrelevant. HTML is not XML.

> All it can do is
> tell the XML application whether or not white space is significant in
> certain contexts. It does not make sense at this point to let HTML
> applications do things that XHTML ones cannot.

They already can.

> I think Jason's idea of an :empty pseudo-class is the most appropriate way
> of handling the rendering issue. Indeed, you then have much finer control
> over the formatting process, and in a way that can apply to other elements
> also. For example, you could have a rule such as:
> 
> p:empty + p:empty  { display: none }
> div:empty + div:empty { display: none }
> 
> to remove consecutive empty paragraphs and consecutive empty divs from the
> rendering process.
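
A rough sketch of what such a rule would select, assuming "consecutive"
means adjacent siblings (in CSS2 terms, the "+" combinator, as in
"p:empty + p:empty") and assuming "empty" means no content at all. The
function name and the (tag, text) representation are hypothetical, chosen
only for illustration.

```python
def hidden_by_adjacent_empty_rule(siblings, tag="p"):
    """Return indices of elements that an 'X:empty + X:empty' rule would hide.

    siblings is a list of (tag, text) pairs in document order, all children
    of the same parent. An element is hidden when both it and the sibling
    immediately before it are empty elements of the given tag.
    """
    hidden = []
    for i in range(1, len(siblings)):
        prev_tag, prev_text = siblings[i - 1]
        cur_tag, cur_text = siblings[i]
        if prev_tag == tag and cur_tag == tag and prev_text == "" and cur_text == "":
            hidden.append(i)
    return hidden


siblings = [
    ("p", ""),       # first empty P in a run: still rendered
    ("p", ""),       # hidden, follows an empty P
    ("p", ""),       # hidden, follows an empty P
    ("div", ""),
    ("p", "text"),
    ("p", ""),       # not hidden, previous P has content
]
hidden = hidden_by_adjacent_empty_rule(siblings)
```

Under these assumptions a run of empty paragraphs collapses to its first
member, and the elements themselves remain in the tree for scripts to use.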

That might be useful. I'm not convinced it reflects the intention of the
spec on this point.

> Regardless, it would seem useful to change the wording of the HTML
> specification (Section 9.3.1) to more carefully say what this "really"
> means. Something like:
> 
>   We discourage authors from using empty P elements. User agents should
>   not render empty P elements. However, style sheet instructions should be
>   able to control whether or not empty P (or other) elements are included 
>   in the rendering process.
> 
> might be better. 

Decisive evidence that such language does, in fact, represent the intended
meaning of the spec on this point would probably settle this.

-- 
Braden N. McDaniel
braden@endoframe.com
<URL:http://www.endoframe.com>

Received on Monday, 10 April 2000 14:37:47 UTC