Re: FO->Area as XSLT extension function [Was: Customer requirement, a critique]

On 03/08/13, Tony Graham  <tgraham@mentea.net> wrote:
> On Fri, March 8, 2013 1:34 am, Arved Sandstrom wrote:
> > On 03/07/2013 09:33 AM, Tony Graham wrote:
> ...
> >> All we need now is someone to whip up a proof-of-concept extension
> >> function for a XSLT processor that takes a tree of FO nodes and returns
> >> a tree of area tree nodes and demonstrate it being called more than once
> >> in one transform.
> >>
> >> The low-hanging fruit would probably be FOP and either Saxon or Xerces.
> ...
> > I can take a look at FOP in this regard, dunno if anyone currently
> > associated with FOP is here and/or has the cycles to take this on. Can't
> > say I've looked at it for years, but I deal with Java most every day,
> > and I did work on FOP way back when.
> >
> > Can't say I am entirely clear on what's being described here. Maybe it's
> > just that I've been waking up at 4 AM every day this week. :-)
> 
> Ouch.
> 
> > I have a mental model of how things usually work now: (1) run XSLT on
> > XML and get FO, (2) run formatter on FO and get rendered PDF or
> > whatever. Internals of formatting, including area tree, are black-box
> > [sort of].
> 
> That is the 'classical' view of XSL-FO processing [1].
> 
> > I said sort of, because FOP (presumably among other implementations) can
> > output an XML representation of the area tree (or one could implement a
> > new Renderer, which has access to the area tree). Do we want to start
> > here, with access to FOP's area tree after a complete initial pass
> > through the FO? If so, I understand from the language above that this
> > entire first formatting pass is to be encapsulated in an XSLT extension
> > function? Yes?
> 
> Almost. You mightn't necessarily process the entire FO through the
> extension function: it might be everything or it could just be
> something(s) for which you want to get the rendered size so the XSLT can
> then decide what to put in the final FO. Jirka's "actually this is quite
> common workflow" email [2] is about processing the area tree of a complete
> document, but customer requirement #10 (currently near and dear to my
> heart) contemplates processing just the tables in the source, but
> processing each multiple ways, just to decide their size and orientation
> in the final FO. But, again and as you restated earlier [6], we're
> looking for a tool able to solve problems, not specific solutions to
> problems a, b, and c.
> 
> What's returned from the function should be immediately useful in the XSLT
> stylesheet for making decisions about what to put in the final FO:
> 
>  - Returning the text of the XML file for the area tree
>    as a string would require some sort of 'evaluate()'
>    function to turn it into nodes so wouldn't be immediately
>    useful
> 
>  - Returning the area tree XML as a document node would
>    allow using XPaths on the area tree to find things and,
>    hopefully, would allow use of key() for quick lookup
> 
>  - Returning the URL for the area tree XML on disk that
>    could be read in with the 'document()' function would,
>    it seems to me, be sufficient for a proof-of-concept
> 
>  - It's fine if a PoC returns the processor-specific area tree
>    but it would be better if there eventually was an open,
>    neutral XML format for area trees [4], but that also
>    requires a common understanding of what goes in an area
>    tree [5]
> 
> And no need to modify the XSL 1.1 spec to be able to do it. A PoC would
> give us an idea whether it's able to solve problems or whether we still
> (or as well) need the expanded FO expression language contemplated by the
> XSL FO 2.0 requirements [7] or an API for providing feedback a different
> way.
> 
> Regards,
> 
> 
> Tony.
> 
> 
> 
OK, this is helpful, thanks.


No reason we can't return the complete document area tree *and* focus on your problem of choice. One of the test documents can do almost nothing except place your table on pages of several different sizes.
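
A rough Java sketch of what such a test could exercise: format the same table once per candidate page setup and keep the first setup it fits on. Everything here is hypothetical (the `TableFitSketch` and `PageSetup` names, the idea of a single "measured table width", the units in points); the measuring function is only a stand-in for "run the formatter and read the width out of the area tree", not a FOP call.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.ToDoubleFunction;

// Illustrative only: every name here is hypothetical, not FOP or XSLT API.
final class TableFitSketch {

    /** One candidate page setup to try the table on; width in points. */
    static final class PageSetup {
        final String name;
        final double bodyWidthPt;

        PageSetup(String name, double bodyWidthPt) {
            this.name = name;
            this.bodyWidthPt = bodyWidthPt;
        }
    }

    /**
     * Measures the table once per candidate and returns the first candidate
     * whose body is at least as wide as the measured table, or empty if none is.
     */
    static Optional<PageSetup> firstFit(List<PageSetup> candidates,
                                        ToDoubleFunction<PageSetup> measureTableWidthPt) {
        for (PageSetup setup : candidates) {
            // Stand-in for one formatting pass plus a lookup in its area tree.
            double tableWidthPt = measureTableWidthPt.applyAsDouble(setup);
            if (tableWidthPt <= setup.bodyWidthPt) {
                return Optional.of(setup);
            }
        }
        return Optional.empty();
    }
}
```

The order of the candidate list encodes the preference (say, portrait before landscape), so the measuring function is called more than once only when the earlier candidates don't fit.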


Agreed on having a common area tree format; I'll keep that in mind as I code stuff up. I should also be able to return the area tree as a document node and/or return the URL for the XML.
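
For the document-node option, a minimal sketch using only the JDK's DOM and XPath APIs shows the kind of lookup that becomes possible once the area tree is a tree of nodes and not a string. The `block` element and `width` attribute are made up for illustration and are not claimed to match FOP's actual area tree XML; the class and method names are likewise hypothetical. This is plain Java, not an extension function registered with any XSLT processor.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: the area tree vocabulary queried here is hypothetical.
final class AreaTreeQuerySketch {

    /** Turns area tree XML text into a DOM document node. */
    static Document parse(String areaTreeXml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(areaTreeXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse area tree XML", e);
        }
    }

    /**
     * Returns the largest width attribute found on any block element,
     * in whatever unit the area tree uses, or 0.0 if there are no blocks.
     */
    static double maxBlockWidth(Document areaTree) {
        try {
            NodeList widths = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//block/@width", areaTree, XPathConstants.NODESET);
            double max = 0.0;
            // XPath 1.0 has no max() function, so take the maximum in Java.
            for (int i = 0; i < widths.getLength(); i++) {
                max = Math.max(max, Double.parseDouble(widths.item(i).getNodeValue()));
            }
            return max;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad XPath expression", e);
        }
    }
}
```

Returning a URL would only add one step in front of this: read the file at that URL, then query it the same way.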


This is a good start: it's not overly ambitious, and it helps work out requirements.


Arved




Received on Friday, 8 March 2013 16:55:35 UTC