Re: ISSUE-128 (figure-in-p): Chairs Solicit Proposals from Henri Sivonen on 2010-10-07 (public-html@w3.org from October 2010)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Thu, 7 Oct 2010 04:55:42 -0700 (PDT)
To: HTML WG <public-html@w3.org>
Message-ID: <1734365040.12057.1286452542345.JavaMail.root@cm-mail03.mozilla.org>
Ian Hickson wrote:
> allowing desires regarding the parsing rules to affect the content
> model can lead to nonsensical content model rules, and insofar as the
> rendering languages should evolve to take care of common content
> models.

I think this line of argument is sound under our Separation of Concerns Design Principle. (Yeah, yeah, I know the Design Principles never got beyond first draft on the W3C publication track.) However, it isn't sound under our Priority of Constituencies Design Principle. For the purposes of that principle, your argument counts as theoretical purity while what's actually available in CSS for use is relevant to authors.

> This is the style used, for instance, on [1].

I notice you are using my site as an example. Here's a site whose CMS I contributed to many years ago: http://fimug.fi/mugi-illoista The figure there uses another style. It's the first child of a paragraph. (Aside: When working on that CMS, I seriously considered using <div class=p> instead of <p> to work around the annoying autoclose behavior of <p>.)

I think both kinds of figures are legitimate and authors should be able to use whichever way fits their current design. My Change Proposal makes both options possible. I immediately concede that choosing markup based on what fits the design at hand violates the principle of separation of structure and style, but my Change Proposal isn't trying to enforce such a principle. Instead, it's trying to give authors pragmatic options.

> What people do _not_ do with figures is put them in the middle of
> sentences. That's the kind of thing that elements of phrasing content
> (and embedded content, a subset of phrasing content) are for.

I think it's more important to make it possible for authors to use <figure> as the first child of <p> than to make sure we declare the use of <figure> in the middle of a sentence Wrong. I could write validator code to make the validator whine about <figure> as a non-first child of <p>, but the exercise would remind me of "significant inline content". (http://www.whatwg.org/specs/web-apps/2006-01-01/#significant)
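
A minimal sketch of what such a check might look like, assuming the document is available as well-formed, namespace-free XML so that no HTML parser has already closed the <p>. The function name and the message text are hypothetical; this is not the actual validator code.

```python
import xml.etree.ElementTree as ET

def misplaced_figures(xhtml_source):
    """List complaints about <figure> children of <p> that are not the first child.

    Assumption: "first child" means no element and no non-whitespace
    text precedes the <figure> inside its parent <p>.
    """
    root = ET.fromstring(xhtml_source)
    problems = []
    for p in root.iter("p"):
        # Text that appears in <p> before its first child element, if any.
        leading_text = (p.text or "").strip()
        for index, child in enumerate(p):
            if child.tag != "figure":
                continue
            if index > 0 or leading_text:
                problems.append(
                    "figure is not the first child of p (child index %d)" % index
                )
    return problems

print(misplaced_figures("<div><p><figure><img/></figure>Text.</p></div>"))
print(misplaced_figures("<div><p>Text <figure><img/></figure> more.</p></div>"))
```

The first call finds nothing to report, while the second flags the figure that follows text inside the paragraph.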

> The <p> element is said to have an optional end tag, but if we omitted
> the </p> in this case and yet did not have <figure> imply </p>, we
> would be making the author write non-conforming content:

Maybe we should not say that <p> has an optional end tag and should instead say that the end tag is optional before the HTML 4 legacy elements that autoclose <p>.

If we say </p> is optional before <section>, authors will be confused as long as they pay attention to pre-HTML5 UAs.

> This is just like how <table> and <aside> imply </p>:
...
> <p>The table in Figure 1 shows the tensile strength of materials as
> found in this experiment.
> <table>
> <caption> Table 1. </caption>
> [...]

Questions from validator users suggest that authors do not expect <table> to imply </p> and are surprised at it happening to the point of being unable to work out what happened when a validator complains about a stray </p> tag.
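
To illustrate how the stray </p> arises, here is a toy model in Python. It is a sketch under stated assumptions, not a real HTML parser: the P_CLOSING set is a deliberately short, assumed subset of the start tags that imply </p>, and nesting and scoping rules are ignored.

```python
import re

# Assumed, simplified subset of start tags that imply </p>.
# Real parsers use a longer list and additional scoping rules.
P_CLOSING = {"table", "div", "ul", "ol", "pre", "aside", "section", "p"}

# Matches start and end tags; attributes are skipped, not parsed.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")

def stray_p_end_tags(markup):
    """Count </p> end tags that have no open <p> left to close."""
    p_open = False
    stray = 0
    for match in TAG.finditer(markup):
        is_end = match.group(1) == "/"
        name = match.group(2).lower()
        if is_end:
            if name == "p":
                if p_open:
                    p_open = False
                else:
                    stray += 1  # the <p> was already closed implicitly
        else:
            if p_open and name in P_CLOSING:
                p_open = False  # this start tag implies </p>
            if name == "p":
                p_open = True
    return stray

print(stray_p_end_tags("<p>Results:<table><tr><td>1</td></tr></table></p>"))  # prints 1
```

The author wrote one <p> and one </p>, yet the <table> start tag has already ended the paragraph, so the explicit </p> has nothing left to close.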

> This is not usually something we should consider, but sometimes it is
> worth looking at what rendering effects people want, just to make sure
> things are possible even while we wait for CSS to catch up to the
> latest needs of HTML.

Do you have any indication of the willingness of the CSS WG and, more to the point, CSS implementors to "catch up to the latest needs of HTML" here?

> In the rare case where the paragraph's margin-top is intentionally
> greater than the margin-bottom of the element before it, one runs
> afoul of the margin-collapsing rules for self-collapsing blocks
> implied by floats or positioned content between collapsing margins,
> and a minor adjustment to the markup is needed to make the alignment
> work again:
> 
> <h1>China launches Chang'e 2 lunar probe</h1>
> <div>

Suggesting that authors use a <div> to work around a limitation introduced by HTML5 is rather weak.

> POSITIVE EFFECTS
> ================
> 
> * <p>...<figure> will parse in a manner consistent with <p>...<div>,
> which is the markup used today for the same effect.

See http://fimug.fi/mugi-illoista for an example of a site working around the lack of <figure> in HTML 4 by using <span>s inside a paragraph.

> * Errors such as <em><figure>...</figure></em> will be caught and
> reported by conformance checkers.

If this is of importance, it could be achieved while still allowing <figure> as the first child of <p>.

> * Already implemented in multiple browsers and a conformance checker.

That can be fixed.

> NEGATIVE EFFECTS
> ================
> 
> None.

I obviously disagree.

> RISKS
> =====
> 
> None.

I think this is objectively incorrect. If we go ahead with no edits and we later find your reasoning was wrong, irreversible damage will have been done to the ability of Web authors to use <figure>. If we go ahead with my Change Proposal and we later find my reasoning was wrong, we will be able to revert the change without causing any more damage than the introduction of new </p>-closing elements necessarily has to cause at some point.

Anne van Kesteren wrote:
> I think a point I missed in your description is that having e.g.
> <p><figure><pre> work, but <p><pre> break, as suggested by Henri, is
> highly illogical and confusing. Henri told me he thinks this okay
> because people only look at the nearest ancestor (i.e. parent-child
> relationships), but I do not think that is true. You often change your
> markup around and in this scenario if you removed <figure>, </p> would
> suddenly be implied before the <pre>, which is not really what you
> would expect.

Do you still remember bi-morphic content models? (http://www.whatwg.org/specs/web-apps/2006-01-01/#the-li)

Those didn't resonate well with people. Do you have data supporting the hypothesis that people think about content models in terms of ancestor-descendant instead of parent-child? Note that DTDs have trained people to think in terms of parent-child. When a RELAX NG schema has an ancestor-descendant restriction but the validator doesn't say so and instead just says that you can't use element "foo" as a child of "bar" *in this context*, people get confused.

Also, is there some depth at which people stop thinking about ancestor-descendant? Note that we allow <p> as a descendant of <p> if there's <svg><foreignObject> in between.

Roy T. Fielding wrote:
> I agree. It would be very hard to explain the proper use of figure as
> an element name if it does not have the same characteristics of a
> figure in traditional written works.

I am not in any way suggesting that <figure> couldn't also be used in the way where you have a paragraph, then a figure under it and then a second paragraph under the figure. That pattern is indeed the dominant one in print, though floats with captions and text flowing around the float do exist.

The reasons why floats are less important in print than on the Web are at least three-fold:
 1) In single-column books, it's usually not that critical to put a lot of stuff in a small space, so it's OK to occupy the whole width of the column for a figure.
 2) In print scenarios that try to put a lot of stuff in a small space, multiple columns are typically used, so smaller figures can still occupy the full width of one column.
 3) Multicolumn layouts that are higher than the viewport don't make sense on continuous screen media, because scrolling down and then back up for the next column is annoying. That's why (considerations around CSS multicol layouts not being supported in some browsers aside) multicolumn layouts are used less on the Web, which in turn makes floats more attractive as a means of not sacrificing the whole width of a single column worth of space for a figure when you want to pack text and figures in a small space.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Thursday, 7 October 2010 11:56:17 UTC