
RE: formatting a back-of-the-book index with CSS (no JavaScript)

From: Jan Tosovsky <j.tosovsky@email.cz>
Date: Sat, 23 Aug 2014 23:40:49 +0200
To: <www-style@w3.org>
Message-ID: <036201cfbf1a$e8c56820$ba503860$@tosovsky@email.cz>

On 2014-08-23 Liam R E Quin wrote:
> 
> An index will typically have entries like
>    Boats, canal 17, 19-25, 38
> You need the formatter to collapse ranges like 19-25 based on when
> items appeared on consecutive pages, and also for the formatter to
> remove duplicate numbers if two items appear on the same page after
> formatting.

I hope this collapsing will be optional. In my case I encode the ranges (via
IDs) directly in the source (DocBook). I expect all other cases to be
listed separately.

I distinguish (but who else cares?):
primary 9-10  (the term is discussed thoroughly within this range)
primary 9, 10 (there are individual occurrences on every particular page)

I also expect directly specified ranges that end up as 10-10 to be merged
into just 10.
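
To make the expected behaviour concrete, here is a minimal Python sketch. The
function name, its parameters and the rule that a single page already covered
by an explicit range is dropped are my own assumptions for illustration, not
part of any existing formatter.

```python
def _runs(sorted_pages):
    """Group sorted page numbers into (start, end) runs of consecutive pages."""
    runs = []
    for page in sorted_pages:
        if runs and page == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], page)
        else:
            runs.append((page, page))
    return runs


def format_locators(pages, explicit_ranges=(), collapse=False):
    """Build a locator string such as '9-10, 17'.

    pages           -- individual page numbers where the term occurs
    explicit_ranges -- (start, end) pairs encoded by the author in the source
    collapse        -- if True, also merge runs of consecutive individual pages
    """
    # Author-specified ranges are kept; a degenerate range (10-10) becomes a page.
    ranges = sorted({(s, e) for s, e in explicit_ranges if s < e})
    singles = set(pages) | {s for s, e in explicit_ranges if s == e}
    # Assumption: a single page inside an explicit range is a duplicate.
    singles = sorted(p for p in singles
                     if not any(s <= p <= e for s, e in ranges))
    if collapse:
        items = ranges + _runs(singles)
    else:
        items = ranges + [(p, p) for p in singles]
    items.sort()
    return ", ".join(str(s) if s == e else f"{s}-{e}" for s, e in items)
```

With collapsing off (the default here), format_locators([9, 10]) gives
"9, 10", while format_locators([], [(9, 10)]) gives "9-10", so the two cases
above stay distinct. format_locators([], [(10, 10)]) gives "10".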

> I wrote a short blog entry on this:
> http://barefootliam.blogspot.ca/2014/08/back-of-book-indexes-and-css.html

A nice overview. 

I'd like to add a note on column balancing. When it is employed, there is
one more constraint to cope with in order to meet all typographic rules:
(1) Page register (the last line should sit at the same position on
every page)
(2) Orphans/widows (no single line should be stranded at the bottom/top of
a page)
(3) Balanced columns

For example, when the first column ends with a letter heading followed by
just the first (primary) entry:
- Should this block be moved to the next column? Should the next column
be balanced? If so, this page won't meet the page register.

Another example. Let's imagine the index:

P
primary
~ secondary01
~ secondary02

Q

Can this structure be broken after the primary?
Can this structure be broken after the first secondary?
If there is a very long page reference list that wraps over several lines,
can the column break fall in the middle of that list?

I don't think all this can be fully automated, and there should be some way
to intervene in the formatting process (processing instructions?).


A few last points. These are not about style...

I am not so optimistic about automatic sorting. E.g. Czech rules define a
different sort order for the first and the second letter. In specific
combinations up to 4 passes have to be performed to be sure that the order
is correct. The default 'Czech' sorting in common software is not suitable
for professional use. ... But sadly, it is often used anyway :-(
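
As a rough illustration of what "several passes" means, here is a generic
multi-level sort key in Python. It is only a sketch of the mechanism: the
three levels (base letters, then diacritics, then case) are my assumption and
are definitely NOT the Czech rules, which treat some accented letters and
digraphs as letters of their own.

```python
import unicodedata


def strip_marks(text):
    """Remove combining marks, e.g. 'résumé' -> 'resume'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def multilevel_key(term):
    """Sort key whose tuple elements act as successive comparison passes.

    Pass 1: base letters only, case-insensitive.
    Pass 2: case-insensitive text including diacritics (ties from pass 1).
    Pass 3: the exact string (ties from pass 2, i.e. case differences).
    """
    return (strip_marks(term).casefold(), term.casefold(), term)


def sort_terms(terms):
    """Return the terms ordered by the multi-level key."""
    return sorted(terms, key=multilevel_key)
```

A later level is consulted only when all earlier levels compare equal, which
is the same effect as running the passes one after another. A real
implementation would need language-specific tables at each level.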

Speaking of sorting, we should also support aliases (= sort as) for
index entries.
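
A small Python sketch of the idea, assuming a hypothetical IndexEntry record
with an optional sort_as field (the names are mine). The casefold() call is
only a stand-in for a proper collation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexEntry:
    term: str                      # text printed in the index
    sort_as: Optional[str] = None  # hypothetical alias used only for ordering

    def sort_key(self):
        # Fall back to the printed term when no alias is given.
        text = self.sort_as if self.sort_as is not None else self.term
        # The printed term breaks ties between entries sharing an alias.
        return (text.casefold(), self.term)


def sort_entries(entries):
    """Order entries by alias (or term), leaving the printed text untouched."""
    return sorted(entries, key=IndexEntry.sort_key)
```

For example, an entry printed as "3D printing" with sort_as set to
"three-d printing" is filed between "apple" and "Zebra" rather than before
both of them.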

Best Regards, 

Jan
Received on Saturday, 23 August 2014 21:41:15 UTC
