- From: Kaveh Bazargan <kaveh@rivervalleytechnologies.com>
- Date: Thu, 6 Aug 2015 16:40:01 +0100
- To: "Siegman, Tzviya - Hoboken" <tsiegman@wiley.com>
- Cc: Bill Kasdorf <bkasdorf@apexcovantage.com>, Johannes Wilm <johanneswilm@vivliostyle.com>, Dave Cramer <dauwhe@gmail.com>, Richard Ishida <ishida@w3.org>, W3C Digital Publishing Discussion list <public-digipub@w3.org>
- Message-ID: <CAJ2R9pj2omU3MBDBzMeqizsNBiMV1VpChRBrLWQHWVJhX0gP_w@mail.gmail.com>
Thanks for teaching me a new word: proselytize :-)

I don't think you have quite grasped what I am saying, but I get the message. ;-)

On 6 August 2015 at 16:25, Siegman, Tzviya - Hoboken <tsiegman@wiley.com> wrote:

> While it is great to see so much discussion on our list, I feel that we
> are having a religious argument, not a technical one.
>
> Many people are happy to use TeX, and that is fine. Many people would like
> to shift their workflows to HTML and CSS, and that is what we are focusing
> on in the DPUB IG and CSS WG.
>
> Let’s try not to proselytize.
>
> *Tzviya Siegman*
> Digital Book Standards & Capabilities Lead
> Wiley
> 201-748-6884
> tsiegman@wiley.com
>
> *From:* Kaveh Bazargan [mailto:kaveh@rivervalleytechnologies.com]
> *Sent:* Thursday, August 06, 2015 11:16 AM
> *To:* Bill Kasdorf
> *Cc:* Johannes Wilm; Dave Cramer; Richard Ishida; W3C Digital Publishing
> Discussion list
> *Subject:* Re: Prioritisation
>
> Hi Bill
>
> As far as I know TeX is the only *open and free* engine that can handle
> sophisticated pagination of complex text (e.g. floating elements, footnotes
> that take more space than body text, complex math, highest level of
> typography) with full automation.
>
> And I know I am missing something but I fail to see why we want to go
> through several years of development to replicate the above
> functionalities inside the browser. (A back end TeX system converting XML
> to PDF needs no maintenance or learning).
>
> On 6 August 2015 at 15:53, Bill Kasdorf <bkasdorf@apexcovantage.com>
> wrote:
>
> Just pointing out that there are many sophisticated pagination engines out
> there; TeX is just one example. Very sophisticated, automated, complex page
> makeup has been done on proprietary systems since the 1990s. Several such
> systems are still currently in wide use.
> Many of the issues that we're
> addressing in the context of the Open Web Platform today were considered
> solved problems decades ago in those systems. That doesn't mean we don't
> want to be able to provide that kind of sophisticated page makeup natively
> on the Web and in Web-based technologies, *without* requiring separate
> software systems (with their attendant specializations, learning curves,
> system implementations, maintenance, etc.).
> Bill Kasdorf
>
> *From:* Kaveh Bazargan [mailto:kaveh@rivervalleytechnologies.com]
> *Sent:* Thursday, August 06, 2015 7:06 AM
> *To:* Johannes Wilm
> *Cc:* Dave Cramer; Richard Ishida; W3C Digital Publishing Discussion list
> *Subject:* Re: Prioritisation
>
> Hi Johannes
>
> I am flattered by your comprehensive reply. My comments regarding TeX are
> below, but I might not have explained myself well...
>
> I am not suggesting anyone should use TeX code, or even be aware that
> TeX/LaTeX is involved. The point is that it is a back end automated page
> make up engine. So XML/HTML can be converted to PDF very fast and at very
> high quality with the TeX engine invisibly doing the work.
>
> Here are my points, distilled:
>
> - I like the idea of HTML/CSS/JavaScript creating fixed pages to be
>   read on screen with all kinds of interactivity
> - I still question whether footnotes, floating figures and tables, and
>   typographic niceties which have primarily evolved for print on paper
>   should be done in the browser. To me, floating items only apply to
>   print, so interactivity is not needed. Why not pass the info to an
>   engine that knows how to do it well?
> - The problems of floating items, complex math, large footnotes that
>   need to break across pages, and many other complex pagination problems have
>   already been solved in TeX. These are not trivial problems and I worry
>   about this working group reinventing the wheel, by starting to specify the
>   basics of pagination from scratch.
>   In my opinion, in the end the only way
>   to solve the problem is to rewrite TeX in JavaScript!
> - Another problem I have is holding all our information in HTML as
>   opposed to XML. I worry about how clean and semantic the content will be.
>   After all, HTML was designed to be forgiving, so even bad content will look
>   good. We are all excited about the amazing gizmos in HTML and how the
>   browser is the new publishing model, but what about 10, 50 or 100 years'
>   time? Will these HTML files still make sense? What happens when the browser
>   is superseded? I am all for HTML tools and interactivity, but I suggest the
>   definitive content should be XML, not HTML.
>
> On 5 August 2015 at 23:34, Johannes Wilm <johanneswilm@vivliostyle.com>
> wrote:
>
> Kaveh's email reached me just now, so I have only seen other parts of the
> discussion so far.
>
> On Tue, Aug 4, 2015 at 5:55 PM, Kaveh Bazargan
> <kaveh@rivervalleytechnologies.com> wrote:
>
> Forgive me for a very basic question, but it is a devil's advocate type of
> question. And if this is not the place to ask this perhaps you can direct
> me to any relevant discussions.
>
> My very basic question is, why do we need to "paginate" in the browser in
> the first place? Why not keep the browser for reflowing and interactive
> text, which is what it is good at, and use a standard mark-up pagination
> system (TeX/LaTeX would be my choice) to do what that is good at? If
> another system has already solved problems like footnotes and floating
> figures, what exactly is the drive to reinvent that in the browser?
>
> I am myself a LaTeX person and for a lot of things I would agree with you.
>
> However, there are some good reasons to do everything in browsers:
>
> A) You can have one source file for everything and don't need to do
> conversion
>
> B) EPUB is already tied to HTML, so using LaTeX as the universal format
> will likely not work in the long run
>
> C) Most people have a browser installed already, so you don't need to have
> them install anything else on their machine
>
> D) Browsers running extra layout JavaScript can be made to render more or
> less complex layout of the same sources. So for example you may say that
> you just want to show the text and put the footnotes at the bottom in a
> single pass. The layout will not be perfect, but on a mobile device that
> will give you a quick result. But on a server that is to produce a PDF out
> of the same source document, you can have it use a 7-pass process and add
> kerning, microtypography, etc.
>
> E) LaTeX document editing is not exactly easy. Many of the LaTeX documents
> I wrote 10-15 years ago I cannot simply compile using my current laptop with
> the latest TeX Live installed. And most of those are just 5-10 page long
> midterm papers for History, Literature or English language (so no advanced
> formulas, just citations and plain text). For my books I tried to add a few
> minor extras (such as a small flag icon that would be added before and
> after the chapter titles), and when I need to rerender them after not
> having rendered them for a year or two, I generally have to spend about a
> day on various online discussion forums to try to figure out what has
> changed in the latest versions of the renderers and how I can get around
> those issues. I am not entirely sure, but I imagine that this would have
> been easier had the sources been in HTML, as the renderer would at least
> render everything that it did understand instead of the all-or-nothing
> approach of LaTeX.
>
> Actually TeX is the fastest page renderer.
> Standard TeX files create pages
> at over 100 pages a second on a normal laptop, including complex math and
> footnotes. And I am surprised you had problems running old files. You must
> have been using style files which had not been maintained. The TeX engine
> has been frozen for 30 years!
>
> But for this discussion most of that is irrelevant I think.
>
> I wonder if point D is entirely clear to everyone. When CSS features are
> discussed, one of the most important points is of course whether browsers
> will implement them. Features that are so complex that the rendering of the
> contents of a page will take as long as it takes for a LaTeX renderer to
> create a PDF will likely not make it, because speed is more important than
> high feature level for browsers for which page-based features are just a
> side project. But some will need such complexity for rendering really great
> looking output (for example for print output).
>
> From browsers probably the best one can ever expect is that they will
> provide fast and simple page layout. But if one has the needed primitives
> to allow for more complex solutions in browsers using JavaScript, then one
> can still create those sites that spend 5 minutes on rendering the final
> output.
>
> On Tue, Aug 4, 2015 at 8:03 PM, Kaveh Bazargan
> <kaveh@rivervalleytechnologies.com> wrote:
>
> On 4 August 2015 at 18:50, Bill Kasdorf <bkasdorf@apexcovantage.com>
> wrote:
>
> A quick clarification. I am quite sure that in her e-mail Deborah is using
> the term "pagination" to mean "maintaining a record in the digital file of
> where the page breaks occur in the paginated version of record." That's
> essential to accessibility and other useful things as well (citations,
> cross references, indexes, etc. in a world in which print is still
> considered the version of record and references to its page breaks are
> common.)
> That's not the same as making the *rendered pages* in the
> digital file replicate those in the print.
> Bill K
>
> [...]
>
> But Bill, how do we make the page breaks in the electronic version
> the same as those of the print pages unless we have the same elements and
> layout? For instance if a floating figure is missing from an electronic
> page, do we just make a short page and break where the paper copy breaks?
> That would lead to very ugly results.
>
> The end device should be able to both figure out what page numbers would
> be in the normal sized output AND what they are on the actual device. All
> without having to add extra metadata about where non-explicit page breaks
> occur.
>
> So basically it renders the pages twice:
>
> A) Once in the original size. This can be done in a way so the end user
> doesn't actually have to see it. The page numbers are retrieved from this
> version. A could be made to be exactly equal to the print version (or the
> other way round: in order to create the print version, one simply prints
> out A).
>
> B) A second time for the user to see it in the size appropriate for the
> zoom level and screen size.
>
> There are various ways this could be presented to the user in the User
> Interface. For example the "Jump to page number" function could be using
> the page numbers retrieved from A but then jump to the correct location in
> B. And the page numbers shown in the corner of the pages could also be the
> ones retrieved from A (that would mean several pages in a row could be
> displayed with the same page number and one B page could have two page
> numbers if it happens to span over the break between two A pages).

--
Kaveh Bazargan
Director
River Valley Technologies
@kaveh1000
+44 7771 824 111
www.rivervalleytechnologies.com
www.bazargan.org
Received on Thursday, 6 August 2015 15:40:50 UTC
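
The two-render idea quoted in the thread (render once at print size, "A", to get the reference page numbers, and once at device size, "B", for display) can be illustrated with a small sketch. This is only a toy model under stated assumptions: pages are approximated by a fixed number of characters, which real layout does not do, and every function name below is hypothetical rather than part of any browser or CSS API.

```python
from bisect import bisect_right


def page_starts(total_chars, chars_per_page):
    """Start offsets of each page, assuming a fixed capacity per page.

    A toy stand-in for a real layout pass (hypothetical helper).
    """
    return list(range(0, total_chars, chars_per_page))


def page_of(starts, offset):
    """1-based number of the page that contains the given character offset."""
    return bisect_right(starts, offset)


def print_pages_on(b_page, a_starts, b_starts, total_chars):
    """Print (A) page numbers covered by device (B) page number b_page."""
    start = b_starts[b_page - 1]
    # The B page ends where the next B page starts, or at the end of the text.
    end = b_starts[b_page] if b_page < len(b_starts) else total_chars
    first = page_of(a_starts, start)
    last = page_of(a_starts, end - 1)
    return list(range(first, last + 1))


def jump_to_print_page(a_page, a_starts, b_starts):
    """Device (B) page on which print (A) page number a_page begins."""
    return page_of(b_starts, a_starts[a_page - 1])


if __name__ == "__main__":
    total = 1000
    a = page_starts(total, 300)  # render A: print-sized pages
    b = page_starts(total, 200)  # render B: smaller device pages
    for n in range(1, len(b) + 1):
        print("B page", n, "shows print page(s)", print_pages_on(n, a, b, total))
    print("Jump to print page 2 lands on B page", jump_to_print_page(2, a, b))
```

In this model a B page that straddles an A page break reports two print page numbers, and several consecutive B pages can report the same one, which matches the behaviour described for the page numbers shown in the corner of each page.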