FO->Area as XSLT extension function [Was: Customer requirement, a critique]

On Fri, March 8, 2013 1:34 am, Arved Sandstrom wrote:
> On 03/07/2013 09:33 AM, Tony Graham wrote:
...
>> All we need now is someone to whip up a proof-of-concept extension
>> function for a XSLT processor that takes a tree of FO nodes and returns
>> a tree of area tree nodes and demonstrate it being called more than
>> once in one transform.
>>
>> The low-hanging fruit would probably be FOP and either Saxon or Xerces.
...
> I can take a look at FOP in this regard, dunno if anyone currently
> associated with FOP is here and/or has the cycles to take this on. Can't
> say I've looked at it for years, but I deal with Java most every day,
> and I did work on FOP way back when.
>
> Can't say I am entirely clear on what's being described here. Maybe it's
> just that I've been waking up at 4 AM every day this week. :-)

Ouch.

> I have a mental model of how things usually work now: (1) run XSLT on
> XML and get FO, (2) run formatter on FO and get rendered PDF or
> whatever. Internals of formatting, including area tree, are black-box
> [sort of].

That is the 'classical' view of XSL-FO processing [1].

> I said sort of, because FOP (presumably among other implementations) can
> output an XML representation of the area tree (or one could implement a
> new Renderer, which has access to the area tree). Do we want to start
> here, with access to FOP's area tree after a complete initial pass
> through the FO? If so, I understand from the language above that this
> entire first formatting pass is to be encapsulated in an XSLT extension
> function? Yes?

Almost.  You mightn't necessarily process the entire FO through the
extension function: it might be everything, or it could be just the
parts for which you want the rendered size so that the XSLT can then
decide what to put in the final FO.  Jirka's "actually this is quite
common workflow" email [2] is about processing the area tree of a complete
document, whereas customer requirement #10 [3] (currently near and dear to
my heart) contemplates processing just the tables in the source, each in
multiple ways, to decide their size and orientation in the final FO.
But, again, and as you restated earlier [6], we're looking for a tool able
to solve problems, not specific solutions to problems a, b, and c.
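
To make the "process each table multiple ways" idea concrete, here is a
rough sketch of the decision step only.  Everything in it is an
assumption for illustration: 'LayoutChooser', 'firstThatFits' and the
'measureWidth' callback are hypothetical names, and the callback stands
in for whatever would actually run the formatter on one candidate and
read a width out of the resulting area tree.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.ToDoubleFunction;

// Hypothetical helper: not part of FOP, Saxon, or any XSLT processor.
final class LayoutChooser {

    // Tries the candidates in order of preference and returns the first
    // one whose measured width fits in the available width.
    // 'measureWidth' is an assumed callback that would format one
    // candidate and report its rendered width.
    static <T> Optional<T> firstThatFits(List<T> candidates,
                                         ToDoubleFunction<T> measureWidth,
                                         double availableWidth) {
        for (T candidate : candidates) {
            if (measureWidth.applyAsDouble(candidate) <= availableWidth) {
                return Optional.of(candidate);
            }
        }
        // Nothing fits: the stylesheet would need some fallback.
        return Optional.empty();
    }
}
```

The point is only that the formatter gets called once per candidate, which
is why the extension function has to be callable more than once in one
transform.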

What's returned from the function should be immediately useful in the XSLT
stylesheet for making decisions about what to put in the final FO:

 - Returning the text of the XML file for the area tree
   as a string would require some sort of 'evaluate()'
   function to turn it into nodes, so it wouldn't be
   immediately useful

 - Returning the area tree XML as a document node would
   allow using XPaths on the area tree to find things and,
   hopefully, would allow use of key() for quick lookup

 - Returning the URL for the area tree XML on disk, which
   could be read in with the 'document()' function, would,
   it seems to me, be sufficient for a proof-of-concept

 - It's fine if a PoC returns the processor-specific area tree,
   though it would be better if there eventually were an open,
   neutral XML format for area trees [4]; that also requires
   a common understanding of what goes in an area tree [5]
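
As a rough illustration of the string-versus-document-node difference, the
sketch below uses only the JDK's JAXP classes.  It is not FOP's or Saxon's
API: 'formatToAreaTreeXml' is a hypothetical stand-in for the formatter
call, and the 'areaTree'/'block' elements and the 'bpd' attribute are a
made-up area tree format, since no neutral one exists yet.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class AreaTreeSketch {

    // HYPOTHETICAL: a real PoC would run the formatter on the FO here.
    // The returned XML is an invented format, not FOP's area tree XML.
    static String formatToAreaTreeXml(String foFragment) {
        return "<areaTree><page>"
             + "<block id='t1' bpd='72000' ipd='360000'/>"
             + "</page></areaTree>";
    }

    // Turns the serialized area tree into a document node so that the
    // caller can navigate it instead of handling a string.
    static Document areaTreeAsDocument(String foFragment) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            String xml = formatToAreaTreeXml(foFragment);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse area tree", e);
        }
    }

    // Evaluates an XPath expression against the area tree as a number.
    static double evaluateNumber(Document areaTree, String expression) {
        try {
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, areaTree, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }
}
```

With a document node in hand, something like
'evaluateNumber(doc, "number(//block[@id=\'t1\']/@bpd)")' reads a size
directly, which is the kind of lookup the stylesheet would do in XPath.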

And there's no need to modify the XSL 1.1 spec to be able to do it.  A PoC would
give us an idea whether it's able to solve problems or whether we still
(or as well) need the expanded FO expression language contemplated by the
XSL FO 2.0 requirements [7] or an API for providing feedback a different
way.

Regards,


Tony.

[1] http://www.w3.org/TR/xsl11/#d0e147
[2] http://lists.w3.org/Archives/Public/public-ppl/2013Feb/0088.html
[3] http://www.w3.org/community/ppl/wiki/CustomerRequirements
[4] http://lists.w3.org/Archives/Public/public-ppl/2013Feb/0090.html
[5] http://lists.w3.org/Archives/Public/public-ppl/2013Feb/0045.html
[6] http://lists.w3.org/Archives/Public/public-ppl/2013Mar/0005.html
[7] http://www.w3.org/TR/xslfo20-req/#N66708

Received on Friday, 8 March 2013 14:23:49 UTC