FO: Requirements from Recent Customers/Prospects

ISOGEN has started doing a good bit of XSL FO development as well as
working with existing and potential customers on evaluating the
applicability of XSL FO to their existing documents. Out of this work
we've started to gather a list of requirements that cannot currently be
met with XSL 1.0. I reviewed the xsl-editors list back to Jan 2002 and
didn't see any of these mentioned (except for the indexing support), so
I thought I would list them here. 

We are currently working with two main types of documents: technical
manuals (e.g., user guides, reference manuals, aircraft maintenance
manuals, etc.) and serials (STM journals, designed magazines, etc.). We
are also starting to do some work with traditional book publishers, but
so far haven't found any requirements there that we can't meet (although
I'm sure there are some).

In addition to the composition-related requirements outlined below, we
also have a general requirement for getting feedback from the pagination
stage back to the initial FO generation stage--if there were a standard
way to do this we could develop, for example, multi-pass processes that
could be conditioned on page layout results in a way that FO processes
cannot be today except through proprietary extensions. It would also
enable the development of, for example, loose-leaf publishing solutions.
It might be as simple as standardizing the serialized representation of
area trees or it might require the definition of some sort of
input-structure-to-result-page mapping format.

-----------------
Technical Manuals

For technical manuals, the unmet requirements I've identified are:

Hard Requirements:

- Collapsing of sequences of page number citations to unique numbers
(same requirement submitted by David Pawson). Both XSL Formatter and XEP
provide extensions to do this now and the lack of it is the only barrier
to automatically producing back-of-the-book indexes with FO. Note that
this has to work even if each page-number-citation is contained within a
basic-link (the normal practice for generating linked PDFs).

- Page position-sensitive headers/footers for tables. For example, I
need to be able to generate headers like "Table 1 (cont.)" on the 2nd and
subsequent pages of a multi-page table. This appears to be impossible in
XSL 1.0. I think this requirement could be met by allowing marker
references in table headers and footers, for example.

- "Revision bars". Must be able to generate marks that are positioned
relative to the position of inline areas and that have the same
block-progression- or inline-progression-direction extent as the inline
area. I think it would be sufficient for these to be generated within
the same region as the inline area, but the best solution would allow
the marks to be in one of the non-body regions. I believe all the FO
implementation vendors are developing or have developed proprietary
solutions to this requirement.

Nice-To-Haves:

- Direct support for generating PDF links and annotations. I need to be
able to generate PDF bookmarks, in particular, but possibly other types
of annotations. While basic-link naturally translates to PDF link
annotations, there is no obvious way to generate bookmarks or other
annotations in a standard way. All the FO implementations provide
extensions to generate bookmarks, for example. There ought to be a way
to abstract and generalize the notion that PDF annotations represent.
For example, it might make sense to have a special region or flow type
that is used for these types of online features or there might be a way
to define the interpretation of existing online structures; I don't
know. But the lack of standardization in this area is a significant
barrier to practical interoperation because I think pretty much all
users of FO will use it, in part, to generate PDFs intended for online
delivery. It also seems unlikely that another page-based online
delivery format will be developed as an alternative to PDF any time soon
(it seems more likely that PDF will be fully standardized, although I'm
not holding my breath for that either).

- Flowing of text around floated or absolutely-positioned areas. For
technical manuals this is a nice to have--nobody using an SGML-based
composition system can really do this today (except maybe with
Frame+SGML), so most don't do it, but you do run into some more heavily
designed manuals that are not currently SGML or XML based.  But this has
not been an issue for our technical manual clients or prospects so far.

------------------
Designed Documents

Obviously, designed magazines (magazines with arbitrary page layouts
with lots of design elements) are the biggest challenge--they require
page layout functions that XSL 1.0 explicitly chose not to address (and
good thing too or we might never have seen a spec :-). However, we are
seeing two classes of designed magazines: those that require arbitrary
flows across disconnected areas and those that do not. For our current
serial-producing customer, a fairly typical STM
(scientific/technical/medical) publisher, their magazines fall into the
second category: each article or department is rendered as a single
contiguous page flow that could be rendered by FO "if only". This
suggests that there is a middle ground between the current FO area model
and a completely arbitrary flow-to-area model that would satisfy the
requirements of a large number of magazines without being too complex to
either define or implement. [This enterprise, a non-profit STM
publisher, already uses XML for its non-designed journals and wants to
use XML for its glossy magazines as well in order to, for example, lower
the cost of serving the magazine content in HTML and to resolve on a
single technology base for all its publications. They currently use DTP
tools to do glossy magazine layout and production.]

For these middle-ground serials, the unmet requirements that I've
identified so far are:

- Ability to flow text around rectangular floated or absolutely placed
areas. For example, I might have a two-column layout where the first
page has a graphic positioned in the middle of the page so that it
intrudes vertically into the two columns:

        .-----.
        |     |
 .----. | img | .----.
 |    | '-----' |    |
 |    '---. .---'    |
 |        | |        |
-----------------------

- More fine control of column filling and balancing. For example, I need
to be able to balance two columns of a three-column layout but then do
something else in the other (I don't have an example to hand). My
understanding is that currently you can only balance across all the
columns of a region.

Given these two features, I think I could otherwise replicate serials of
these types, as long as all graphic elements are rectangular (that is,
no irregularly-shaped design elements around which text would be
flowed).

For fully-designed magazines I would of course need the following
features (which I don't really expect XSL to step up to in the next
revision but I think it's worth capturing the requirements, because they
might end up being easier to do than they appear):

Hard Requirements:

- Ability to apply flows to arbitrarily-placed areas (e.g., articles
continued from page 4 to page 55).

- Ability to flow text around arbitrarily shaped areas.

Nice-To-Have:

- Automation of layout and page tuning that is currently done largely by
hand. This is mostly copy fitting to make X amount of content fit within
a specific amount of space. For example, it would be ideal if one could
provide hints like "max-page-length='10'" on a block or block-container.
That, coupled with lower-level constraint hints, might make it possible
for a composition engine to do most or all of the layout tuning work.
Then, given an interactive FO-based composition tool (imagine Quark or
InDesign using an FO area tree under the covers) one could do the final
design tweaking, as one does today with these unstructured design tools.
I have no idea how practical or realistic such a vision is, but it seems
to me like it ought to be doable.

Cheers,

Eliot
-- 
W. Eliot Kimber, eliot@isogen.com
Consultant, ISOGEN International

1016 La Posada Dr., Suite 240
Austin, TX  78752 Phone: 512.656.4139

Received on Sunday, 22 September 2002 10:59:15 UTC