Re: [gcpm] coalescing sequences of numbers for cross-references, back-of-the-book index etc from Tab Atkins Jr. on 2015-12-17 (www-style@w3.org from December 2015)

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Thu, 17 Dec 2015 14:32:19 -0800
To: "Liam R. E. Quin" <liam@w3.org>
Cc: www-style list <www-style@w3.org>
Message-ID: <CAAWBYDANNszRifu1+g4-Or1-cUAyt8cfk29Ga7m98RCeZeZGEQ@mail.gmail.com>

On Wed, Dec 16, 2015 at 9:20 PM, Liam R. E. Quin <liam@w3.org> wrote:
> On Wed, 2015-12-16 at 15:01 -0800, Tab Atkins Jr. wrote:
>>
>> Right, but you haven't proposed a mechanism for *finding* the page
>> numbers yet.
>
> We already have such a mechanism in CSS, in gcpm:
>   content: target-counter(attr(href, url), page)
>
> I omitted that since I don't need to propose it. A full solution would
> be using something like
>
> a::before {
>   content: target-counter(attr(href, url), page)
> }

Yeah, using the actual CSS feature would have made the post a little
less confusing to me. ^_^  Also would have been nice to see the
properties you propose in actual use.

> There's a minor detail to resolve here relating to pages whose
> formatted numbers are not arabic numerals (e.g. prelims with i, ii, iii
> etc, or an appendix B-1, B-2) and the formatter needs to work with the
> value of the global page counter rather than the formatted string.

This is why working with counters works so well - they have numeric
values and then are just *printed* into some string representation,
controllable by the page author.  That lets us avoid all of the issues
with recognizing and parsing numbers.
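
To make that concrete, here is a rough Python sketch (not CSS, and not
part of any spec) of keeping the numeric value and only formatting at
print time. The names to_lower_roman and format_counter are made up for
illustration, and only two counter styles are covered.

```python
def to_lower_roman(n: int) -> str:
    """Format a positive integer as lower-case Roman numerals."""
    if n <= 0:
        raise ValueError("Roman numerals need a positive integer")
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    out = []
    for value, symbol in numerals:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


# Hypothetical table of counter styles: each maps a number to a string.
FORMATTERS = {"decimal": str, "lower-roman": to_lower_roman}


def format_counter(value: int, style: str = "decimal") -> str:
    """Print a numeric counter value in the requested style."""
    return FORMATTERS[style](value)


# Sort on the numeric values first, format afterwards.
prelim_pages = [9, 4, 5]
print([format_counter(p, "lower-roman") for p in sorted(prelim_pages)])
# prints ['iv', 'v', 'ix']
```

Sorting the formatted strings instead would give 'iv', 'ix', 'v', which
is the kind of problem that sticking to the counter value avoids.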

The big issue to me is still figuring out how to handle merged
elements.  As proposed, multiple DOM elements still get merged in
display, and it's unclear how it's supposed to determine which element
the text "ends up in".  The first in DOM order, maybe?  Or perhaps
this should operate *purely* in CSS, e.g. as a variant of the counter()
function that, when used on an element (or pseudo-element), looks at the
children of the element and synthesizes a combined string for it.

Something like this, maybe:

<dl class="index">
  <dt>Forehead, skin of</dt>
  <dd>
    <a href="#book1-chap-xv-para5">71</a>
    <a href="#book1-chap-xlii-para5">204</a>
  </dd>
  <dt>Fortunes (astrological)</dt>
  <dd>
    <a href="#book2-chap-vi-para-16">250</a>
    <a href="#book2-chap-l-para-2" class="range-start">402</a>
    <a href="#book2-chap-l-para-3">403</a>
    <a href="#book2-chap-l-para-7" class="range-end">403</a>
    <a href="#book2-chap-l-note-5" class="to-note">404</a>
  </dd>
</dl>

.index a {
  display: none;
  index: page attr(href, url) page;
  /* Stores the number 205 or whatever in the element's "index"
     quality, with the name "page".
     Grammar is <index-name> <url> <target-counter-name> */
}
.index dd::before {
  content: indexes(page, '-', ', ', decimal);
  /* Grabs all the "page" indexes from the children, sorts them by value,
     collapses identical numbers or ranges, and formats them according
     to the arguments.
     This produces "71, 204" and "250, 402-404" for the two <dd>s. */
}
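
The sort/collapse/format step could work roughly like the Python sketch
below. This is only one guess at the behavior: the function name
coalesce_pages and its parameters are hypothetical, it treats any run
of consecutive page numbers (even just two) as a range, and it ignores
the range-start/range-end classes in the markup.

```python
def coalesce_pages(pages, range_sep="-", item_sep=", ", fmt=str):
    """Sort page numbers, drop duplicates, and collapse consecutive runs."""
    runs = []  # each run is [first_page, last_page]
    for page in sorted(set(pages)):
        if runs and page == runs[-1][1] + 1:
            runs[-1][1] = page          # extend the current run
        else:
            runs.append([page, page])   # start a new run
    parts = []
    for start, end in runs:
        if start == end:
            parts.append(fmt(start))
        else:
            parts.append(fmt(start) + range_sep + fmt(end))
    return item_sep.join(parts)


print(coalesce_pages([71, 204]))                  # prints 71, 204
print(coalesce_pages([250, 402, 403, 403, 404]))  # prints 250, 402-404
```

The fmt argument stands in for the counter style, so the numbers are
only turned into strings after sorting and merging.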

Hiding the <a> elements with display: none kills the hyperlinks, of
course, but your solution definitely kills *some* of the hyperlinks,
at minimum.

Really tho, the confusing part here is that there's an intermediary
between "the thing" and "the collected index" - those manually-created
<a> elements pointing at the things.  If we cut that out this gets
simpler:

<p>And on Tuesday my
<span data-index="Fortunes (astrological)">horoscope</span>
said I shouldn't go outside.
...

<dl class="index">
  <dt>Fortunes (astrological)</dt>
  <dd data-index-gen="Fortunes (astrological)"></dd>
</dl>
<style>
[data-index] {
  index: attr(data-index) page decimal;
  /* 'index' grammar is <index-name-string> <counter-name> <counter-style> */
}
[data-index-gen]::before {
  content: index(attr(data-index-gen), '-', ', ');
  /* index() grammar is <index-name-string>, <range-string>, <separator-string> */
}
</style>
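
In other words, the marked-up spans feed the index directly. A rough
Python model of that data flow, assuming the formatter already knows
which page each [data-index] element landed on (the build_index name
and the (term, page) input shape are invented for this sketch; range
collapsing is left out to keep it short):

```python
from collections import defaultdict


def build_index(marks, item_sep=", "):
    """Group (term, page) pairs by term and list each term's pages once."""
    pages_by_term = defaultdict(set)
    for term, page in marks:
        pages_by_term[term].add(page)
    return {
        term: item_sep.join(str(p) for p in sorted(pages_by_term[term]))
        for term in sorted(pages_by_term)
    }


# Hypothetical input: one entry per [data-index] element in the document.
marks = [
    ("Fortunes (astrological)", 250),
    ("Forehead, skin of", 71),
    ("Fortunes (astrological)", 402),
    ("Forehead, skin of", 204),
    ("Fortunes (astrological)", 250),
]
print(build_index(marks))
# prints {'Forehead, skin of': '71, 204', 'Fortunes (astrological)': '250, 402'}
```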

~TJ

Received on Thursday, 17 December 2015 22:33:07 UTC