Re: Interaction of text-indent, ::first-letter and float

On Tue, Aug 10, 2010 at 11:32:19AM -0700, Tab Atkins Jr. wrote:
> On Tue, Aug 10, 2010 at 10:45 AM, Daniel Schattenkirchner
> <schattenkirchner.daniel@gmx.de> wrote:
> > The interaction of text-indent with floated ::first-letter is currently
> > specified in a way that creates unexpected results.
> >
> > The issues I see can be demonstrated using these lines of code:
> >
> > <div style="text-indent: 5em;">This line is part of the test.</div>
> >
> > div::first-letter {
> >  float: left;
> >  font-size: 2em;
> > }

I haven't looked much at pseudo-element stuff, so I may have made a mistake
here, but my analysis is as follows.  I try to describe in terms of how someone
who hasn't already read the whole spec might determine the answers, in the
hope that this shows up any problems where a person might not come to the
correct answer to a question.  For most or all of the issues I've identified,
I've also tried to suggest changes that alleviate that issue.


A reader might well start at the property description for 'text-indent'.

The accompanying text describes the effect for "the block", but what is
"the block"?  A person already familiar with CSS would know that there are two
blocks in our example, and that we have to find the computed value of
'text-indent' for each of them.  I don't know what to suggest to make it more
likely that readers would come to realize that there are two blocks in our
example, but I think we'd get a long way towards that if each property
description would explicitly mention that the behaviour depends on "the
computed value" of the property, and that the reader be able to find how to
calculate computed values (whether by hyperlink, which would be better for
newcomers to the spec, or whether we rely on the reader looking in the index or
table of contents).  Such a change would certainly help determine the effect of
'text-indent' on the <div>, and with luck the reader might also think to
determine the value of 'text-indent' applicable to the pseudo-element.

The index entry for "computed value" points to section 6.1.2, which refers the
reader to "the Computed Value line in the definition of the property".

Note that the property table itself has "Computed value" with a lowercase v,
which is not ideal.

The Computed Value line in the definition of 'text-indent' says:

  Computed value: the percentage ... or the absolute length

That's enough for the reader to guess the computed value for the
<div> element, and from there hopefully to find the right line box.
For the :first-letter, though, we have to return to section 6.1.2,
since "the absolute length" doesn't tell us which absolute length.

Section 6.1.2 refers us to "the section on inheritance for the definition of
computed values when the specified value is 'inherit'".  To find whether this
sentence is applicable, we need to know what the specified value is for
the :first-letter pseudo-element [or at least whether or not it's 'inherit'].

Unfortunately, by a plain English reading of the phrase "specified value", that
sentence doesn't apply to our situation, since I think most people would say
either that there is no value of 'text-indent' specified for the
pseudo-element, or that the value specified is '5em' and not 'inherit'.

Because the intended meaning differs from the natural reading of that phrase,
I suggest that at least this occurrence of the phrase "specified value" be made
a hyperlink to the relevant definition.  It happens that the relevant definition
is in the very previous section, but because we've arrived here from the index
(or possibly table of contents), we haven't seen that previous section, and it
probably isn't visible on our monitor if we're using a web browser (which,
incidentally, might be considered the authoritative way to read the spec
according to the last sentence of section 1.2; in any case it's certainly the
usual way the spec will be read).

[Incidentally, much the same suggested improvement applies to phrases such as
 "positioned element" or "absolutely positioned element", and probably others.]

Let's suppose that we nevertheless know that "specified value" is a technical
term in CSS, and we look it up in the index, which points us to section 6.1.1.

Section 6.1.1's first rule is

  If the cascade results in a value, use it.

As a minor nit, the linked-to section describes a set of steps, but isn't
clear as to whether or not those steps "result in a value" for a given case.

More importantly, the linked-to text (or rather its subsection 6.4.1) only
gives rules for finding "the value for an element/property combination", which
isn't relevant to our situation: we're trying to find the specified value for a
pseudo-element, and section 5.10 is fairly clear that pseudo-elements aren't
elements (e.g. "no element refers to the first line of a paragraph"), or one
could get the same conclusion by finding the definition of "element" from the
index.  (Rule 3 does mention pseudo-elements explicitly, but that seems
specific to rule 3, and isn't enough to establish that this set of rules yields
the specified value for a pseudo-element.) So I suggest that section 6.4.1
needs to be changed as a technical soundness issue; a minimal change would be
to add text such as "In this section, the word "element" should be read as
including pseudo-elements.".

Anyway, suppose we apply those rules and determine that they don't yield a
value.  So we proceed to rule 2 of section 6.1.1:

  # Otherwise, if the property is inherited and the element is not the root of
  # the document tree, use the computed value of the parent element.

The word "inherited" is a hyperlink to some text that allows us to determine
that the 'text-indent' property is inherited.  We still have the now-familiar
problem of "what is 'the element'", but let's proceed anyway.
Now we have to determine what "the parent element" is.
Again, suppose we work that out, e.g. from the text in the section on ":first-line"
(if we for some reason work out that determining the behaviour of :first-letter
depends on text in the section on :first-line).

We've now determined that the specified value for the <div:first-letter>
is 50 absolute units or whatever it may be, and we're trying to determine
what the computed value is.
We're referred to the aforementioned Computed Value line,
which says that it's "the absolute length", and it's up to the reader to guess
that the absolute length in question is the specified value.
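
To illustrate the inheritance step with concrete numbers (a sketch only: the
16px font size is my assumption, not something in Daniel's test case, and I'm
assuming that the pseudo-element's parent is the <div>):

```css
div {
  font-size: 16px;    /* assumed for illustration */
  text-indent: 5em;   /* computed value: 80px (5 x 16px) */
}

div::first-letter {
  float: left;
  font-size: 2em;     /* computed value: 32px */
  /* No 'text-indent' declaration here, so rule 2 of section 6.1.1 takes
     the parent's computed value: 80px, not 5 x 32px = 160px. */
}
```

Writing "text-indent: inherit" explicitly in the second rule should give the
same computed value.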

The 'text-indent' property table also says:

  Applies to: block-level elements, ...

Yet again this boring old question of "is a pseudo-element an element".
We've encountered this question so many times that by now we know the
answer is going to be "yes, that includes pseudo-elements."

But we'd be wrong :-( .

It turns out that in this case, "element" mustn't be read to include
"pseudo-elements" (as we'll see below).

That's why it's important for each section of chapter 6 to be clear
that the usual meaning of element excludes pseudo-elements.

However, even if the reader doesn't immediately assume that "element" includes
pseudo-elements, it's still quite likely that the reader will conclude that it
includes pseudo-elements in this case: none of the properties explicitly
mention pseudo-elements in their Applies-to line, and clearly at least some
properties are intended to apply to pseudo-elements (such as 'text-transform'
in the very first example in section 5.12.1, where 'text-transform's Applies-to
value is "all elements"), so a reader would suppose that the fact that it's a
pseudo-element rather than an element isn't enough to exclude it from the
Applies-to class.
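
The rule in question is along these lines (reproduced from memory of section
5.12.1, so the exact selector may differ):

```css
/* 'text-transform' has "Applies to: all elements", yet the spec's own
   example uses it on a pseudo-element. */
p:first-line { text-transform: uppercase }
```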

A person with complete knowledge of the spec would know that the section
describing :first-letter specifies the set of properties that apply to
the <div:first-letter>, and that user agents may or may not apply
'text-indent' to it ("UAs may apply other properties as well").

Although there's a fair chance that a reader would in fact read the section
describing :first-letter, I also think that the reader assumption described
above is a reasonable one: that the Applies-to value is a complete description
of what this property applies to, and that it isn't necessary to look at any
other text (beyond the definition of block-level element) to decide whether or
not the property applies.  A person who looked for the definition of what the
Applies-to field means would find section 1.4.2.3, but that still doesn't give
any indication that pseudo-elements are special.  Neither does section 5.10,
the section that introduces pseudo-elements.  Towards the end of this message
I give some suggestions on how to change the text to make it more likely that
a reader would find this out.


It's unclear whether a user agent's permission to apply other properties
depends on the value of 'display' for the pseudo-element, but for the current
case if we follow much the same steps as above then we eventually determine
that the pseudo-element has computed values float:left and display:block
(thanks to the Computed Value line for 'display' correctly pointing us to
section 9.7).
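
To make the section 9.7 step concrete, here is the rule from the test case
again, with my reading of the values written as comments (a sketch, not
normative text):

```css
div::first-letter {
  float: left;        /* specified in the test case */
  font-size: 2em;
  /* 'display' is not declared, so its specified value is the initial
     value 'inline'.  Because 'float' is not 'none', the table in
     section 9.7 gives a computed value of 'block'. */
}
```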

We'll assume that this is HTML and thus that the <div> is also display:block
and that it's a block-level element.

So far we've concluded that the computed value of 'text-indent' is the same
absolute length for both the <div> and the <div:first-letter>, but that the
'text-indent' may or may not actually apply to (i.e. actually have an effect
on) the <div:first-letter>, depending on the user agent.

If the reader is an author and did see the text in 5.12.2 that says that the
rendering can depend on the user agent, then this should be a clue that it
would be good to change the styling such that the <div:first-letter> has a
computed 'text-indent' value of 0, for example by explicitly adding a
"text-indent: 0" declaration to the :first-letter selector.

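As a sketch of that workaround, reusing the declarations from Daniel's test
case (I haven't checked which rendering each browser then produces):

```css
div {
  text-indent: 5em;
}

div::first-letter {
  float: left;
  font-size: 2em;
  text-indent: 0;     /* added: removes the dependence on whether the UA
                         applies 'text-indent' to the pseudo-element */
}
```
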
But let's suppose that the reader nevertheless wants to know what the correct
renderings are, for example to know whether or not to file a bug report or
(as an implementor) to know whether their implementation's rendering is
conforming.

Section 9.5.1 is clear that the float is placed such that its left outer edge
touches the left edge of its containing block, so the distance from the left
edge to the T is the float's left margin+border+padding + text-indent;
let's assume that left margin+border+padding equals zero (the default computed
values), so it's just the text-indent.

The 'text-indent' property description says that it

  # specifies the indentation of the first box that flows into the block's
  # first line box.  The box is indented with respect to the left (or right,
  # for right-to-left layout) edge of the line box.  User agents should render
  # this indentation as blank space.

This text is a bit hard to interpret for someone looking at the spec for the
first time, simply because it requires finding out what line boxes are and
where they're placed.  Following the index link
for "line box", we get to section 9.4.2.  We're trying to find where the
left edge of the first line box is for each of the two blocks.
The fourth paragraph of 9.4.2 says

  # In general, the left edge of a line box touches the left edge of its
  # containing block and [similarly for right edge].  However, floating boxes
  # may come between the containing block edge and the line box edge.  Thus,
  # although line boxes in the same inline formatting context generally have
  # the same width (that of the containing block), they may vary in width if
  # available horizontal space is reduced due to floats.

What's the containing block of a line box?  The index gives three references
for "containing block", the first being section 10.1.  Whether this section
answers what the containing block of a line box is depends on whether we
read "the element's box(es)" as including line boxes, and more specifically
whether there is a unique element whose boxes include line boxes.  This is
unclear: for example, a reader might think that we don't want "the element's
box(es)" to include all boxes generated by descendant elements or all
descendant boxes of the boxes that the element generates, so might conclude
that similarly "the element's box(es)" don't include line boxes, and that the
phrase is likely intended to be limited to the element's principal box and its
marker box (if any, in each case).

So let's go to the other two references in the index.  It turns out that
both of these reference essentially the same place (one being a link to
section 9.1.2, and the other a link to the <dfn> for containing block that
appears in that section).  Here we're told that a containing block is "a
rectangular box" [which in context seems to mean "a rectangle" rather than a
CSS box, e.g. otherwise the box would have multiple edges, and see also rule 1
in section 10.1 (giving the containing block in which the root element lives),
which is quite clear that the root's containing block is "a rectangle"];
and we learn that

  # The phrase "a box's containing block" means "the
  # containing block in which the box lives," not the one it generates.

Despite the promising repetition of the phrase "in which X lives" between
section 9.1.2 and the first rule in section 10.1, there are no other
occurrences of "lives" in normative text in CSS2.1.

As a reader I'd still not be confident of having the right answer to
the question of what the containing block is for a line box,
so we should think about whether we can make this clearer (a single new
sentence in section 10.1 would suffice and would be worthwhile adding).

I won't continue the analysis in detail.  If I guess correctly what the section
on floats will tell us about where the float goes and where the left edges of
the relevant line boxes are, then my conclusions are that both the rendering
labelled "IE6 to IE8, Presto and WebKit" and that labelled "Gecko" below are
conforming renderings, and that the IE9 preview #3 one is not conformant to the
current CSS2.1 text.

So it's unfortunate that the sole non-conforming rendering (according to my analysis)
is the one that the author says he expected.

[I previously wrote privately to Daniel Schattenkirchner concluding that
 only the IE6..8/Presto/WebKit rendering was correct, as I hadn't then noticed
 the text saying that only certain properties apply to :first-letter.]

> > I'm an author and when I wrote these lines, I expected this rendering:
> >
> >     --- his line
> >      |  is part
> > of the test.
> >
> > Note: This is what IE9 (starting with Preview #3) renders.
> >
> > Issue #1:
> >
> > I think this is related to the issue discussed not long ago in [1].
> >
> > IE6 to IE8, Presto and WebKit render this:
> >
> >     ---      his line
> >      |  is part of the
> > test.
> >
> > While Gecko renders:
> >
> > ---      his line
> >  |  is part of the
> > test.
> >
> > Both look incorrect to me. The latter seems to be what the spec intended.
> > [...]

I won't comment on Daniel Schattenkirchner's suggestions for normative changes
to what rendering the spec should specify for this example.

I'll only make suggestions on how to change the text so that one can safely
determine what the conforming behaviour(s) are for a given example.  These
suggestions address the communication of the behaviour of pseudo-elements
generally, so their applicability shouldn't be affected by any normative
changes that might be made in response to Daniel's concerns.

I suggest the following changes to make it more likely that a reader would come
to the right conclusion as to whether or not a given property applies to a
given pseudo-element:

  - In each property table, make "Applies to" a hyperlink to section 1.4.2.3.
    (Easy to do with sed or perl, and probably not too hard even in a
    GUI-based editor.)

  - Have section 1.4.2.3 refer to the text described below for the
    specification of which pseudo-elements a property applies to.

  - I suggest that somewhere there should be some text that describes
    pseudo-elements generally, whether it's in section 5.10 or at the beginning
    of section 5.12 before the description of individual pseudo-elements.

    Wherever this text is, I suggest that it include text such as
    "However, note that not all properties are applicable to pseudo-elements:
    see the descriptions of the individual pseudo-elements for what
    properties apply to them."  This wording assumes that the descriptions
    for :first-letter and :first-line are self-contained (i.e. don't need
    to be interpreted in conjunction with the Applies-to field),
    and that for any other pseudo-elements (just :before and :after in CSS2.1)
    it's practical to state explicitly that the properties that apply are those
    given in the relevant Applies-to field but treating the pseudo-element as
    if it were a real element, such that "block-level elements" includes any
    :before and :after pseudo-elements whose computed value of 'display' is
    'block' or any of the others that define block-level elements.

    As for where such a description of pseudo-elements generally should go:
    I suggest changing section 5.12 such that it starts with something like
    "To make the exposition concrete, we use the example of the :first-line
    pseudo element, described in more detail in section 5.12.1." (with that
    last "5.12.1" being a hyperlink).

    (Whatever wording is used, please try to make it such that the normative
    behaviour doesn't depend on the example, that the example helps cognition
    but that the description is correct and complete without the example:
    otherwise it might not be clear how to adapt the case for that example to
    other cases.)

    Then move most of the content of 5.12.1 to that first part of 5.12, where
    5.12.1 retains its existing content up to before "However" (while 5.12 can
    either duplicate that portion or slightly simplify it), while the text from
    "The :first-line pseudo-element can only be attached" to the end of that
    section would be in 5.12.1 alone.
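
As an illustration of the :before/:after case described above (a hypothetical
rule, not taken from the spec; the class name is made up):

```css
p.note:before {
  content: "Note: ";
  display: block;     /* computed 'display' is 'block' */
  text-indent: 2em;   /* under the suggested wording, this applies because
                         the pseudo-element is treated as a block-level
                         element for the Applies-to line */
}
```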

As noted above, I think it's also worthwhile being a bit clearer about what's meant
by the containing block of a line box.  I suggested that one way of doing this would
be to add a sentence to section 10.1.  It looks like the place it would fit best
would be a new sentence before "The containing block of an element is ...".
In particular, I'd be inclined not to add the new sentence as an item in that list:
not just because the existing text says that the list is only relevant to elements
[I believe this text should be s/element/box/ anyway], but also because we don't
want to give the impression that line boxes are at all similar to the usual CSS
boxes described in box.html.

pjrm.

Received on Wednesday, 11 August 2010 07:23:44 UTC