Re: Cross-References in GCPM (was: CSS Pages and Pagination) from Sanders Kleinfeld on 2015-08-10 (public-digipub@w3.org from August 2015)

From: Sanders Kleinfeld <sanders@oreilly.com>
Date: Mon, 10 Aug 2015 13:03:23 -0400
To: Brady Duga <duga@google.com>
Cc: Johannes Wilm <johanneswilm@vivliostyle.com>, Håkon Wium Lie <howcome@opera.com>, Daniel Glazman <daniel.glazman@disruptive-innovations.com>, public-digipub <public-digipub@w3.org>
Message-ID: <CAD1Cp0vT5ZRh8naJSctw+deq_bR5x7suwLLY-Y7Yyr+sdp=UYg@mail.gmail.com>

Thanks so much for the great feedback, Brady and Johannes. Really
appreciate hearing your points of view.

The Paged Media spec already supports cases where "text from a target
is used as the text for a reference", via the target-text function
(http://www.w3.org/TR/css-gcpm-3/#target-text). What I was proposing
was just an additional flavor of this, which seemed like it might be
in scope based on precedent.
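
For reference, here is a minimal sketch of how target-text can be used
today, following the pattern shown in the GCPM draft. The a.xref class
name and the wording of the generated string are assumptions chosen for
illustration, and support varies between Paged Media renderers:

```css
/* Append the text content of the link target (for example a heading)
   to the cross-reference. "a.xref" is a hypothetical class name. */
a.xref::after {
  content: ", in the section titled " target-text(attr(href url));
}
```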

Brady, I'd love to hear more on why you see it as a bad idea. You're
correct that from a pragmatic perspective, I already have a custom
solution outside CSS that involves postprocessing XHTML source content
with XSLT before styling with Paged Media CSS, but it's somewhat
inelegant, and is the exact sort of thing we were trying to get away
from when we made the decision to move from XSL-FO to CSS Paged Media
as the basis for our print publishing program several years ago. IMO,
more advanced cross-reference capabilities in Paged Media would
continue to improve the spec such that more publishers might consider
it a viable alternative to XSL-FO.

Brady, also to respond to this point from earlier:

> Moving content into styles like this makes other automated processing of the content harder (or at least more expensive).
> For instance, a search for "figure 1.1" across all the books in my library will now require loading all chapters of all books into a UA, instead of just using an xml or html parser to find the text.

I don't consider text like "Figure 1.1" to be content; I consider it
to be a label. As such, I'm just as opposed to hardcoding it in HTML
(which is why we currently do it in a postprocessing step to keep the
source clean) as I am to hardcoding numerals in an ordered list, e.g.:

<ol>
  <li>1. First Step</li>
  <li>2. Second Step</li>
  <li>3. Third Step</li>
</ol>

Sounds like you'd agree about the "1.1" part, at least. I agree
there's a fuzzy area about using HTML vs. CSS for generated text
content, but again, this precedent already exists outside of Paged
Media via ::before and ::after pseudo-elements used in conjunction with
the "content" property. Pragmatically, it cuts down on repeating label
text, and also opens the door to things like easy translations of
content to other languages, e.g. swapping out:

a.xref::before {
  content: "Figure";
}

for:

a.xref::before {
  content: "Figura";
}

when applying a stylesheet for a Spanish translation.
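
As a rough sketch of where this could go, the label swap could also live
in a single stylesheet keyed on the document language, with the page
number pulled in via target-counter from the same GCPM draft. This
assumes the source carries lang attributes and that the renderer
supports target-counter; the "(p. N)" format is illustrative only:

```css
/* Default (English) label for cross-references. */
a.xref::before {
  content: "Figure ";
}

/* Override the label where the content is marked up as Spanish. */
a.xref:lang(es)::before {
  content: "Figura ";
}

/* Append the page number of the link target (paged renderers only). */
a.xref::after {
  content: " (p. " target-counter(attr(href url), page) ")";
}
```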

Thanks,
Sanders

On Mon, Aug 10, 2015 at 12:35 PM, Brady Duga <duga@google.com> wrote:
> Yes, counters are a different case, for which there are some solutions
> today. But I was specifically addressing the use case where text from a
> target is used as the text for a reference. Even if technically feasible I
> see it (in general) as a bad idea.
>
> On Mon, Aug 10, 2015 at 9:18 AM, Johannes Wilm
> <johanneswilm@vivliostyle.com> wrote:
>>
>>
>>
>> On Mon, Aug 10, 2015 at 4:57 PM, Brady Duga <duga@google.com> wrote:
>>>
>>> Looking at this specific use case, why is it better to do this at render
>>> time, instead of during content processing/creation? That is, it seems like
>>> you have an existing working solution, what problem are you trying to solve
>>> by moving this into CSS? I ask because this really seems to cross the
>>> styling/content barrier, moving what seems to be entirely content into
>>> stylesheets, and doesn't seem much like an edge case (I can't see a
>>> plausible argument for this being stylistic). Moving content into styles
>>> like this makes other automated processing of the content harder (or at
>>> least more expensive). For instance, a search for "figure 1.1" across all
>>> the books in my library will now require loading all chapters of all books
>>> into a UA, instead of just using an xml or html parser to find the text.
>>
>>
>>
>> If the cross reference includes page numbers, one cannot really know what
>> these will be before having laid out the text. This can potentially also
>> mean that certain parts need to be rerendered several times. For example:
>>
>> Say that on page 90 of a book with a lot of graphs and figures, you
>> want to refer to a graph that is on page 99. First the renderer lays out
>> everything entirely without adding page numbers. It then determines the page
>> number of the graph and adds this to the page. The reference text could for
>> example be something like "figure 23: 'Linguistic dialects in pre-Columbian
>> Mesoamerica', p.99", but when that has been inserted, pages 90 onward have to
>> be redrawn and it turns out that now the figure has been moved on to page 100,
>> so the original text is being updated to "figure 23: 'Linguistic dialects in
>> pre-Columbian Mesoamerica', p.100", but unfortunately that extra digit in
>> the page number means that now it is being pushed on to p. 101, etc.
>>
>> Eventually the page number will likely stabilize (with the exception of
>> certain edge cases), but before that it may potentially involve quite a few
>> redraws of large parts of the content, and I wonder if the browser vendors
>> would be interested in putting this into their engines or whether they see
>> it as a pure book/scientific journal feature that they don't feel they
>> need/want to support.
>>
>> It may still make sense to describe this in terms of CSS, but then have
>> JavaScript interpret that CSS to do the actual layout of that part. I say
>> "may", because it may also be overstretching the purpose of CSS. For
>> citations on the web, for example, the main project I am aware of is the CSL
>> (Citation Style Language)[1], which is based on XML rather than CSS. It would
>> probably make sense to have a field test trying the CSS approach based on a
>> JavaScript polyfill before committing to any CSS or XML-based spec on this.
>>
>> [1] http://citationstyles.org/
>>
>
Received on Monday, 10 August 2015 17:03:54 UTC