- From: Johannes Wilm <johanneswilm@vivliostyle.com>
- Date: Thu, 6 Aug 2015 15:58:54 +0200
- To: Kaveh Bazargan <kaveh@rivervalleytechnologies.com>
- Cc: Dave Cramer <dauwhe@gmail.com>, Richard Ishida <ishida@w3.org>, W3C Digital Publishing Discussion list <public-digipub@w3.org>
- Message-ID: <CABkgm-Rok6ZNSf1_FhasviS6-Hvory1jkNv2Wqs6ncXVsqGP=g@mail.gmail.com>
On Thu, Aug 6, 2015 at 1:05 PM, Kaveh Bazargan <kaveh@rivervalleytechnologies.com> wrote:

> Hi Johannes
>
> I am flattered by your comprehensive reply. My comments regarding TeX are
> below, but I might not have explained myself well...
>
> I am not suggesting anyone should use TeX code, or even be aware that
> TeX/LaTeX is involved. The point is that it is a back end automated page
> make up engine. So XML/HTML can be converted to PDF very fast and at very
> high quality with the TeX engine invisibly doing the work.

Ok, so you are proposing converting XML to LaTeX and Epub/HTML on a backend system? The main problem with that is that the conversion mechanisms just about always need human intervention, and that it is hard, if not impossible, to get XML input files from authors.

> Here are my points, distilled:
>
> - I like the idea of HTML/CSS/Javascript creating fixed pages to be
>   read on screen with all kinds of interactivity
> - I still question trying to create footnotes, floating figures and
>   tables, and typographic niceties which have primarily evolved for print on
>   paper, being done in the browser. To me, floating items only apply to
>   print, so no interactivity is not needed. Why not pass the info to an
>   engine that knows how to do it well?

Is there not also a point in having footnotes and floating figures in ebooks (and having those still work when the user changes the font size)?

> - The problem of floating items, complex math, large footnotes that
>   need to break across pages, and many other complex pagination problems have
>   already been solved in TeX. These are not trivial problems and I worry
>   about this working group reinventing the wheel, by starting to specify the
>   basics of pagination from scratch. In my opinion, in the end the only way
>   to solve the problem is to rewrite TeX in JavaScript!

I have also been thinking of LaTeX in Javascript. But as far as I can tell, in 2015 that would still be too slow.
TeXLive is a few GB large, and if the user has to wait for a few GB to download before the page is rendered, that likely won't work. In a few years, when a few GB is nothing and processors are faster, this may be a viable alternative. As for these other items: I agree, they are quite complex. But it should be possible to do a "simple version" of most of those features that renders quickly, and then a more complex one that creates even smoother designs for situations when one has time.

> - Another problem I have is holding all our information in HTML as
>   opposed to XML. I worry about how clean and semantic the content will be.
>   after all HTML was designed to be forgiving, so even bad content will look
>   good. We are all excited about the amazing gizmos in html and how the
>   browser is the new publishing model, but what about 10, 50 or 100 years
>   time? Will these html files still make sense? What happens when the browser
>   is superseded? I am all for html tools and interactivity, but I suggest the
>   definitive content should be XML, not HTML.

Good question. I would guess that it depends on which features. H1-H6 and P elements will likely still be readable for a long time. XML files will also be readable, but who can turn them into something visually attractive? Already now there is a lack of a perfect WYSIWYG XML editor, and as we can see with XSL-FO, the future of turning XML into PDFs via a common standard is not secure either. And we are just a handful of years after "peak XML standardization" and likely still haven't reached "peak XML usage". So how about in 200 years? The situation with LaTeX is somewhat similar (see below). No one can quite know what will happen and which standard will survive, but with HTML we at least know that the number of users is extremely high, so there is a good chance that those files will survive for a good while. That being said, the safest is probably to store files in several formats for long-term storage.
In Fidus Writer we therefore used both simple HTML and simple LaTeX as storage formats for user content.

> Actually TeX is the fastest page renderer. Standard TeX files create pages
> at over 100 pages a second on a normal laptop, including complex math and
> footnotes. And I am surprised you had problem running old files. You must
> have been using style files which had not been maintained. The TeX engine
> has been frozen for 30 years!

Yes, and that's why I thought LaTeX was a great idea some 15 years ago. But then in 2003 I suddenly had to open some files created by someone else in 1996, and I spent about a week figuring out how to rewrite the macros and going through the files by hand, fixing small things that had changed. Shortly thereafter I received a file that was just a few months old but had been written on a Mac (I was running Linux), and the line endings were different, so I had to figure out how to convert them. Then along came direct support for other characters without using shorthands, by using XeTeX, and support for ttf fonts in LuaTeX, just in slightly different ways. And again lots of stuff needed to be changed by hand or by a script which I would spend up to a few days developing, etc. Then suddenly the maintainer of the main bibliography package I had been using, biblatex, disappeared into thin air. Others eventually tracked him down and took over package maintainership. And even a few months ago, when I acquired a new laptop with a new version of Linux Mint, I couldn't just use my CV compiler[1] as I used to, because the version of TeXLive that is available in the package manager has some bugs that have been fixed upstream but not yet in the packaged version, so I needed to add some extra lines of code I found on some random website.
Unlike help for HTML, which can easily be found in books and online documents, LaTeX help can mostly be found in obscure places, as the information about how to do one particular thing correctly at times exists only in the heads of 2-3 developers worldwide. Of course it has always been possible to get at the content, and with some time spent in forums on the internet, it's always fixable in the end. For a big organization that can afford a development team of 5-6 people who can spend all their time on this, paying for such conversions is likely no big deal. But the question is of course whether it will continue to be developed, or whether at some stage too many people will say "well, LaTeX is really good looking, but HTML can do just about all of it, and it's a lot easier for me to understand how to modify it, so I'll stick with that". At least for now it looks like HTML has the advantage in numbers.
Received on Thursday, 6 August 2015 13:59:33 UTC