Re: [css-gcpm] String-set issues

On Sat, Nov 22, 2014 at 1:18 PM, Brad Kemper <brad.kemper@gmail.com> wrote:
> I've been giving some attention to named strings in the latest GCPM draft [1].
> I have some issues and comments. I'll start with the things I think are
> problems, and then I'll propose how I'd like to see it improve.
>
> Problems:
>
>  1.  'string-set' is a weird and confusing property name. It sounds like it
> is for a set of something, but I think really 'set' is meant to mean that
> you are assigning a (string) value to a name. We don't do that in other
> places where we have named things. For instance, we have '@counter-style',
> not  '@counter-style-set'; '--foo', not 'foo-set', the 'font-family'
> descriptor of @font-face, not 'font-family-set', etc.

I agree it's an odd name, but it has been around for at least eleven
years. One of the challenges of GCPM is that most of it has been
interoperably implemented by Prince and AntennaHouse, but not by
browsers. At this point we've typeset thousands of books over the last
five years using this property for the running heads.

>
>  2.  Pseudo-elements are excluded from being able to use 'string-set', so
> the 'content()' function has its own way of accessing them. That seems
> unnecessary. Just let pseudo-elements assign their contents to a named
> string, if you need that. It is simpler to learn, and less complicated,
> because you just use selectors like you normally do with pseudos. And
> really, most of the time, you won't even need that, because the actual
> element has access to the counter too, and that can be assigned to a name
> there. This would also eliminate the need for parentheses after the keyword
> 'content'.

Makes sense. I'm curious about why it was originally done this way.

>
>  3.  'string-set' only gets the text of the element. I would think there are
> times when it would be useful to get all the content nodes, including links,
> bolds, small pictures, spans with class names, etc.

Absolutely, this is a huge limitation. More on this topic below.

>
>  4.  When 'string()' is used to access a named string, it ignores what was
> assigned to that name for elements not on the page. But in
> CSS3-content [2], the examples include 'META[author] { string-set: author
> attr(author); }', which selects something completely outside the page and
> uses it. Is this because the two drafts are out-of-synch, or are all these
> examples supposed to work somehow?

Out-of-synch is an understatement; the WD of CSS3-content is from
2003. I'm in the midst of cleaning that up, and moving this section of
GCPM into that spec.

The default value (first) does use the first assignment on the page,
if there is an assignment on that page. Otherwise it uses whatever
value happens to be in effect, which could have been set a thousand
pages earlier. We do this all the time, to copy book titles and author
names from the book title page to everywhere else in the book:

div.title-block-book-rw h1 { string-set: book-title content(); }
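
For illustration, a minimal sketch of how that named string might then
be pulled into a running head. The selector is the one from the rule
above; the choice of the @top-left margin box on left-hand pages is an
assumption made for the example, not taken from a production
stylesheet.

```css
/* Assign the title once, on the book's title page. */
div.title-block-book-rw h1 { string-set: book-title content(); }

/* Assumed placement: left-hand pages carry the book title. */
@page :left {
  @top-left {
    /* 'first' is the default keyword; it is written out for clarity.
       On pages with no new assignment, the value already in effect
       is used. */
    content: string(book-title, first);
  }
}
```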

>
>  5.  Normally when we assign a value to some sort of name in CSS, it is one
> value that is globally available. But with named strings in this draft, every
> name has multiple values (one for each selector-matched element within each
> page, multiplied by the number of pages, since the name only has page-wide
> scope in all the examples of this draft), so that the string() function can
> access the right element, based on its position on the page. For instance, with
> 'string(theName, first)', the "value of the first assignment on the page is
> used". Meaning, I think, that if the element is the first occurrence of
> those on the page that match the selector, then it uses that element for the
> value assigned to that name. I think it would be better to just use
> selectors and pseudo-classes to select the right element, rather than to
> have that take place within the function inside the property value.
>
>  6.  In the 'string()' function, the 'start' keyword is not well defined. It
> says "If the element is the first element on the page". Is that by document
> order? What if it has a parent? Technically, doesn't the parent come first?
> Also, I didn't really get what 'first-except' was for.

All of these keywords were designed around some common use cases in
books, especially things like dictionaries. The running head will
often show the first entry on the left page, and the last entry on the
right page. But say an entry starts at the bottom of the previous
page, and continues on the current page. Sometimes you want whatever
entry is being discussed at the top of the current page ('first'), but
sometimes you want the first entry that actually begins on that page
('start').
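
A rough sketch of the dictionary case, assuming hypothetical markup in
which every headword is a <dt class="entry">. The class name, the
string name, and the margin boxes chosen are illustrative only.

```css
/* Hypothetical markup: each headword is <dt class="entry">. */
dt.entry { string-set: entry content(); }

/* Left-hand running head: value selected by 'first'. */
@page :left {
  @top-left { content: string(entry, first); }
}

/* Right-hand running head: value in effect at the end of the page. */
@page :right {
  @top-right { content: string(entry, last); }
}

/* Swapping 'first' for 'start' above changes the result only on
   pages that open partway through an entry continued from the
   previous page. */
```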

I'll work on the definition of 'start'. We're not so much interested
in document order as in what element's content box starts the page.

first-except is very handy for things like chapter titles, where you
don't want running head content on the first page of that chapter. I'm
guessing that many of the spec values are designed to solve the most
common use cases for dictionaries and things like that.
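
As a sketch of the chapter-title case, assuming hypothetical markup in
which each chapter opens with an <h1 class="chapter-title"> (the class
and string names are made up for the example):

```css
/* Hypothetical markup: chapter titles are <h1 class="chapter-title">. */
h1.chapter-title { string-set: chapter-title content(); }

@page {
  @top-center {
    /* Empty on the page where the string is assigned (the chapter
       opener); the chapter title on the pages that follow. */
    content: string(chapter-title, first-except);
  }
}
```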

>
>  7.  In the 'string-set', <content-list> can be used to construct a string
> from the text contents of the element (using 'content()'), as well as from
> literal text, counters, and attributes, and assign it to a name. But then,
> when you want to use a string like that, there is the 'string()' function.
> With that, you get that named string and construct a string from it and from
> literal text, counters, and attributes, and assign it to the 'content'
> attribute. This seems unnecessarily redundant, which can make it confusing.
> Since 'content' can already string together text from multiple sources, we
> don't really need to do it before assigning it to a name too, do we? If we
> need multiple things from the original element (its text and one of its
> attributes and its counter, for instance), they can just be each assigned to
> individual names, which can then be pulled into 'content' to be strung
> together. I think it is more author-friendly to just give us one place to
> concatenate strings together for the content, and let other properties and
> functions do their own things. Separation of concerns.

Agreed.

>
> Regarding the first point above, I think in some ways, 'string-set' is
> similar to 'flow-into', especially if we axe the concatenation stuff of #7
> above. They both use the similar 'content' or 'contents' keywords to create
> a variable-like name to hold the original contents of the element. I think
> we can play off that, using that as mental equity for changing the name and
> syntax of 'string-set'. So, instead of 'string-set', I propose the
> following:
>
> Name:   copy-into
> Value:  none |  [ [ <custom-ident>  <content-level>] [,  <custom-ident>
> <content-level>]*  ]?
> Initial:        none
> Applies to:     All elements, but not ::first-line or ::first-letter.
> Inherited:      no
>
> The 'copy-into' property contains one or more pairs, each consisting of a
> custom identifier (the name of the named string) followed by a content-level
> keyword describing how to construct the value of the named string.
>
> <custom-ident> = The element or its contents, or its text, or the value of a
> specified attribute or counter is copied and placed into a non-rendered
> content fragment with the name '<custom-ident>'. The values none, inherit,
> default, auto and initial are invalid content fragment names.
>
> <content-level> expands to one of the following values:
> element|contents|text|attr(<identifier>)|<counters>
>
> element
> the entire element is copied into the named content fragment (I'm using
> 'named content fragment' to mean the same thing as a named flow, but not
> intended to be flowed through multiple elements).
>
> contents
> only the element’s contents are copied into the named content fragment. This
> is the default if <content-level> is not specified.
>
> text
> only the element’s text (including normally collapsed white space) is copied
> into the named content fragment.
>
> attr(<identifier>)
> the string value of the attribute <identifier> is copied into the named
> content fragment
>
> <counters>
> the value of a counter() function, as described in [CSS21] is copied into
> the named content fragment.
> -----------------------------
>
> So basically, it is the same as "flow-into", except that it does not remove
> anything, just a copy, and it has some other choices besides just "content"
> and "element" (I would also change "content" to "contents" in Regions).
> Plus, it can list several different names to copy stuff into, e.g. like
> this: 'copy-into: myContents contents, chapNum counter(chapter)'. And it has
> 'contents' as the default if one of the other levels is not specified
> instead.
>
> By default, the content fragment name would be global, as the named flow is
> with 'flow-into'. But if one of the following pseudo-classes is used on the
> subject of the selector, then the name is locally scoped to just the page
> the element is on.
>
> :nth-of-page(n)    The element is the nth matched element on the page.
> :first-of-page       Same as :nth-of-page(n), but where n = 1 (it is the
> first matched element on the page).
> :last-of-page       The element is the last matched element on the page.
> :start-of-page      The element is the first matched element on the page,
> and neither it nor its ancestors have any previous siblings that appear on
> the page.
>
> The content property would be able to accept the named content fragment as
> one of its value parts, just by using the identifier. It would not be part
> of a region chain, unless the whole element containing the named content
> fragment had "flow-into" something else.
>
> So, for instance, here are Examples 1-3 of GCPM, re-written with this
> syntax:
> --------
> HTML:
> <h1>Loomings on the <b>Horizon</b></h1>
>
>
> CSS:
> h1::before { content: 'Chapter ' counter(chapter); }
> h1:first-of-page { copy-into: headerP1 counter(chapter),
>                                               headerP2; }
> h1::after { content: '.'; copy-into: headerP3; }
> @top-center {
>   content: headerP1 ": " headerP2 headerP3;
> }
>
> The value of the named string "headerP1" will be "Chapter 1", and the value
> of the named string "headerP2" will be "Loomings on the <b>Horizon</b>";
> it includes the bold tags around "Horizon" because the <content-level>
> defaults to 'contents', not 'text'. The value of the named string
> "headerP3" will be ".". The top-center content will be "Chapter 1: Loomings
> on the <b>Horizon</b>."
>
> ---------
> HTML:
> <section title="Loomings">
>
> CSS:
> section:first-of-page { copy-into: header attr(title) }
>
> The value of the “header” string will be “Loomings”, assuming that section
> intersected with the page.
>
> -----------
> CSS:
>
> @page {
>   size: 15cm 10cm;
>   margin: 1.5cm;
>
>   @top-left {
>      content: "first: " heading1;
>   }
>   @top-center {
>      content: "start: " heading2;
>   }
>   @top-right {
>      content: "last: " heading3;
>   }
>
>   @bottom-center {
>      content: "start: " author;
>   }
> }
>
> h2:first-of-page { copy-into: heading1 }
> h2:start-of-page { copy-into: heading2 }
> h2:last-of-page { copy-into: heading3 }
> META[author] { copy-into: author attr(author); }
>
> The rendered examples would be the same as in the spec, except that the
> author's name would appear at the bottom center of each page too.

We've started thinking along those lines in a very rough sketch of GCPM4 [1].

>
>
> Brad Kemper
>
>
> 1) http://www.w3.org/TR/css3-gcpm/
> 2) http://www.w3.org/TR/css3-content/#strings

Thanks very much for the comments! I will continue to study this, and
will respond in more detail to the proposal.

Regards,

Dave

[1] http://dev.w3.org/csswg/css-gcpm-4/#flow-policy-heading

Received on Monday, 24 November 2014 03:13:01 UTC