Re: New Year. Renewed activity? New Chair? from Patrick Gundlach on 2013-02-07 (public-ppl@w3.org from February 2013)

From: Patrick Gundlach <gundlach@speedata.de>
Date: Thu, 7 Feb 2013 21:09:53 +0100
To: public-ppl@w3.org
Message-Id: <998C7F17-3079-4C69-9A75-8B3BC5205AEA@speedata.de>

Hello Tony and all others,

I am a member of this list and thus part of the inactivity. I don't think _I_ need a new Chair; I think I need some more time for discussion.

My business is in the "XML to PDF on paged media" area, so I joined this list a few months ago. I have written an open source program and I'd like to share my approach with this list to get comments.

I have waited until now because all of the documentation was still in German, and so probably of no help to anyone except a few people here. Now the English translation is getting better and better (still 90% missing) and I think I can start collecting comments as the documentation gets updated.

The documents I deal with are product catalogs and other applications that need a lot of optimization during the layout process. For example:

* Can we fit another product onto the page by rearranging the products that are already on the page?
* If we reduce the font size, we will be able to fit the text on one page.
* The tables and images should be positioned so that the text in all given languages easily fits in the space above the table/image.

Many applications require dynamic layout optimization.

The problem I see with XSL-FO is mostly that the layout is more or less fixed before rendering. It is hard to get information back from the renderer to the place where you decide on the actual layout of the page. Why is this a problem? If you have a chunk of text and need to be sure that it does not exceed some given dimensions, you can only find out the size of the text by actually typesetting it. My approach is to interpret the layout instructions _inside_ the renderer, so you can always ask the renderer to typeset something on a virtual page and find out its exact dimensions. To do this, I have separated the layout instructions (XML) from the data XML file.
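To illustrate this feedback loop, here is a minimal sketch in Python. It is not the actual implementation (which is written in Lua on top of LuaTeX). The `measure` function is a hypothetical stand-in for "typeset on a virtual page": it assumes a crude fixed-width character model, whereas a real renderer would run its line breaker on real glyphs. All names and numbers are assumptions for illustration.

```python
import textwrap


def measure(text, font_size, box_width):
    """Hypothetical stand-in for typesetting on a virtual page.

    Assumes every character is 0.5 * font_size wide and every line is
    1.2 * font_size high. Returns the height of the text in points.
    """
    chars_per_line = max(1, int(box_width / (0.5 * font_size)))
    lines = textwrap.wrap(text, width=chars_per_line) or [""]
    return len(lines) * 1.2 * font_size


def largest_fitting_size(text, box_width, box_height, sizes=(12, 11, 10, 9, 8)):
    """Try font sizes from large to small; return the first that fits, else None."""
    for size in sizes:
        # Ask the "renderer" for the real size before committing to a layout.
        if measure(text, size, box_width) <= box_height:
            return size
    return None


text = "A product description that has to fit into a fixed box. " * 4
size = largest_fitting_size(text, box_width=200, box_height=80)
```

The point of the sketch is only the control flow: the layout decision (which font size to use) is made after measuring, not before.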

The layout instruction file is a mixture of different kinds of "standards" and some programming constructs. The table model stems from HTML, the access to the XML data is very similar to XPath (2.0), the node walking has a few similarities with XSL, we optionally use CSS for some elements (more will follow) and we have variables, loops and if-then-else conditionals.
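The separation of data and layout logic can be sketched roughly as follows. This is Python, not the layout language itself, and the element names, attribute names and output strings are all invented for illustration. The data is walked with the limited XPath subset of Python's `xml.etree.ElementTree`, and a loop plus a conditional decide what gets emitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical data file: pure content, no layout information.
DATA = """
<catalog>
  <product name="Lamp" price="19.90"/>
  <product name="Desk" price="149.00"/>
  <product name="Chair"/>
</catalog>
"""


def render_rows(data_xml):
    """Walk the data and build one table row (as text) per product."""
    root = ET.fromstring(data_xml)
    rows = []
    for product in root.findall("./product"):  # XPath-like node selection
        name = product.get("name")
        price = product.get("price")
        if price is not None:  # if-then-else on the data
            rows.append(f"{name}: {price} EUR")
        else:
            rows.append(f"{name}: price on request")
    return rows


rows = render_rows(DATA)
# rows == ["Lamp: 19.90 EUR", "Desk: 149.00 EUR", "Chair: price on request"]
```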

I have lots of ideas on how to overcome the current limitations of my implementation, but lack of time prevents me from implementing all of the features above completely. For example, the XPath support is so far limited to a few functions and a few operators; more will be included when I have time (and money) for that.

Even with these limitations, my software is used in production for various clients, so I can promise that it is very usable despite the imperfections mentioned.

I would like to hear any opinions on this matter. I'll be off to XML Prague from Friday to Monday, and because of the big holes in the documentation it might be too early to start this discussion now. But anyway, I just feel it's time for this email.

Some background: I've implemented this in Lua and the backend is LuaTeX, a modern variant of TeX, the typesetting system. My software uses a special mode of LuaTeX in which I can write the PDF directly without generating any TeX code, but still take full advantage of the things TeX offers: a great line-breaking algorithm, font and image reading and other layout helpers (glue, boxes, ...).

Now some links:

The project page: http://speedata.github.com/publisher/
Manual: http://speedata.github.com/publisher/manual/index.html
Github page: https://github.com/speedata/publisher
Installation instructions: https://github.com/speedata/publisher/wiki
Sample layout file: http://speedata.github.com/publisher/manual/examples/en/planets/planets-layout.xml
The 'data file' for the layout above: http://speedata.github.com/publisher/manual/examples/en/planets/planets-data.xml

Patrick

speedata UG (haftungsbeschränkt)
-------------------------------------
Phone 030/57705055
Mobile 0178/1967142
Mail gundlach@speedata.de
Web www.speedata.de

Eisenacher Straße 101
10781 Berlin
-------------------------------------
Charlottenburg District Court (Amtsgericht Charlottenburg), HRB 135360 B
Managing director: Patrick Gundlach
VAT ID (USt-IdNr): DE278023065

Received on Thursday, 7 February 2013 20:28:40 UTC