- From: James Graham <jg307@cam.ac.uk>
- Date: Tue, 06 Jun 2006 15:49:26 +0100
juanrgonzaleza at canonicalscience.com wrote:
> James Graham wrote:
>> I could go on but
>> at least in academic fields, LaTeX is either the only format accepted
>> for publication or the preferred format.
>
> In mathematics, and theoretical physics sure, in rest of science? I doubt.
> In chemistry, LaTeX is not preferred for example.

Not just in theoretical physics, but in all varieties of physics that I have ever encountered. Nor, as far as I can tell, is the widespread use of LaTeX limited to the mathematics and physics communities. It is also, for example, one of four accepted submission formats of the Royal Society of Chemistry (Word, WordPerfect, RTF, LaTeX), the only format accepted by Electronic Notes in Theoretical Computer Science and the only acceptable format for IEEE Transactions On Wireless Communication.

In general, Googling for these examples, I was unable to find a single print journal which accepted electronic submissions but did not accept LaTeX as a format. Indeed, it is the _only_ hand-authored format accepted by the journals I encountered on my brief search, except for one online-only robotics journal which published in HTML and accepted submissions in HTML. Even in that case, the submissions page is quick to suggest a LaTeX to HTML workflow, implying that engineers are another group who often work with LaTeX, a speculation lent credence by http://www.eng.cam.ac.uk/help/tpl/textprocessing/ which contains an extensive set of notes for engineers on using LaTeX and begins "TeX is a powerful text processing language and is the required format for some periodicals now".

Of course using Google to turn up a few journals hardly makes for a good sample and you can no doubt provide counter-examples, but it is extremely disingenuous to suggest that only pure mathematicians and a small subset of physicists commonly use LaTeX - it is clearly in very widespread use wherever mathematical communication is required.
> The key is that you learn any new tool when it is useful and solves
> problems. TeX-LaTeX solves a minimum subset of problems of real life and
> reason is not popular except in some academic communities. The only really
> good point of TeX-LaTeX systems is on mathematical typesetting; textual,
> graphical, diagrams, and others items are best done with different systems
> and approaches.

Ah. That would be called "doing one thing and doing it well". I've heard that it's commonly believed to be a good design principle. In this case, the problem I would like to solve is "how do we typeset mathematics on the internet so that people actually use the technology rather than ignoring it into oblivion?" We've already determined that LaTeX solves the same problem offline, so it seems like a reasonable place to start when addressing the question for online publishing.

>> You may think I
>> am overstating this but I disagree - bear in mind that a significant
>> fraction of astronomical (chosen merely because it is the field I know
>> best) software is written in Fortran 77. For many of these people
>> almost 30 years of language design has never happened.
>
> If Fortran 77 fulfills the needs they have no reason for the change but if
> it does not fulfill then they will adopt Fortran 90, or C++, or Java, or
> Maple, or anything else.

Technically, all the languages you've suggested are clearly better than Fortran 77. They don't have irritating limitations like fixed column numbers. They have _very_ useful features like dynamically allocatable data structures. It would make many people's lives better to migrate away from Fortran 77. But they don't - because they are in the business of doing research, not learning new technology - so they are always in a metastable state which perhaps doesn't provide the most long-term benefit but does work well at any given moment. (Of course some people embrace new technology, particularly if it is relatively easy to use.
But don't be fooled into thinking that people will use new technologies just because they are in some global sense "better" than the ones they are familiar with, particularly if there is no easy path from here to there).

> There are old academicians still using ordinary mail for communicating
> with colleagues. Is this an argument against e-mail or when designing a
> new communication model would we think in a subset of guys loving ordinary
> mail?

Well, email maps pretty well to ordinary mail. For example, an email address like jg307 at cam.ac.uk corresponds to the addressing format commonly used in ordinary mail (starts with the name, becomes more general toward the end). But more importantly, there are a number of immediately obvious and tangible benefits to email, in particular the fact that it is near instant. I don't see anything in your proposals that offers anything like the same level of obvious and tangible benefits.

> I always am perplexed of double measurement scale of TeX-people. They
> rudely critique mathematical typesetting of programs such as MSWord.

I'm not a "TeX-person", merely a LaTeX user and, in the context of this discussion, my "pro-LaTeX" stance is merely a practical one; I have come at it by considering the needs of the audience, not through a desire to advocate one particular technology. Nor have I mentioned MSWord, except as an accepted format for submission to some academic journals. Indeed, if anything I am quite an anti-LaTeX person - I would never consider using it for a poster or slide presentation, for example. I have, however, used LaTeX to create the equations for a poster and embedded the resulting postscript into another package. That is closer to the level of interaction I am advocating.

> However, most of web pages generated from TeX-LaTeX systems are really
> unprofessional even at that small subset of static and boring academic
> webpages.

Indeed.
But there are two main reasons for that:

1) latex2html sucks
2) Academics have no interest in learning any language other than LaTeX (did I say that already?).

They have to use LaTeX to prepare documents for publication, it is the only language they know for typesetting mathematics and, in general, the web is not their major target medium. LaTeX-generated websites tend to be html representations of lecture notes or papers that are primarily designed for consumption in paper or PDF formats. So the html version only exists at all because it is relatively little effort to produce it in addition to the main publication format. When that is not the case, there will simply be no html version provided.

> People abandoned TeX-LaTeX in favor of best approaches in many places.

Where? In no journal I could find. If you mean publishers, for archival, that is irrelevant because, on the web, most content is created by individuals who are not publishers by profession. The tools suitable for the two groups are quite different.

> Some weeks ago I received a draft of manuscript prepared by a
> mathematician and will probably be published in MSOR journal in brief. He
> is not using TeX or LateX because limitations and write:
>
> <blockquote>
> Mathematicians have been served well by TeX and LaTeX for their
> mathematical typesetting. Too well, perhaps. At least, if an dedicated
> TeXnician of the last
> ten years has a chance to \relax and look about himself he will see that
> the rest
> of the world has moved on in several incompatible ways to the cosy world of
> TeX.
> </blockquote>

So one person contacted you and made a comment, which has no substantial content I can discern? What's your point? Who are the rest of the world and where are they? Why should I listen to this person? For comparison, I did a straw poll of two people who I work with, asking "will astronomers ever be prepared to learn languages other than LaTeX for typesetting mathematics?" They both answered "no".
But I don't think it's really meaningful enough to talk about.

>> This is why the web is liberally
>> sprinkled with the ugly gif output of latex2html. If we want this
>> situation to change, the _only_ solution is to allow LaTeX as a
>> document creation format.
>
> For creation of unprofessional webpages or electronic documents? Okay.
> Somewhat as anyone can create low quality webpages using "save as" in
> MSWord, but if you want professional webpages then MSWord is not the
> correct tool. Similar thoughts apply to TeX-LaTeX.

That doesn't follow at all. For example Google are successful in making excellent HTML+js applications starting from Java. If I write a program in C it's likely to be much better than an equivalent one I write in assembler. Writing a document in Docbook and converting to postscript is much easier than writing in postscript directly. Computing is full of examples of people writing in one language and transforming to something else for consumption. I am merely stating that for any meaningful adoption of our chosen output format, it must be compatible with the chosen high-level format of the majority of research scientists - LaTeX.

> As an exercise let me comment ITeX output in one of your pages. I will not
> review your web page "I'll go and play with words and pictures", and I
> will say nothing on the quality of the rest of web design not in its
> typesetting.

(Nice job on the subtle implication that because my webpages won't win any awards for beauty I have no business in a technical discussion, by the way. In return I won't mention any of the dubious content on your webpages :) )

> You begin from an IteX source (a dialect of LaTeX) and next present the
> MathML output generated. Then you claim
>
> <blockquote>
> It's pretty clear which version is easier to enter, read and maintain.
> </blockquote>
>
> Well. It is clear that IteX is easier to enter and read than MathML.
> But if use this as an argument in favor of IteX then let me say that ASCIIMath
> is still easier to enter and read. Therefore if easiner reallt matter one
> would discard IteX and other Tex-LaTeX approaches.

But is ASCIIMath so expressive? It certainly isn't so widely known. Therefore it won't be so widely adopted. I seem to have to keep repeating this point: compatibility with existing technology is important. The existing technology in the field of mathematical authoring is LaTeX.

> However, IteX is not easier to maintain. If you are looking for basic
> unprofessional encoding of mathematical formulae, then IteX is okay, but
> if you are looking for professional encoding of formulae, IteX is not good
> enough and this will obligate to you to learn CSS, XSL-FO, and p-MathML
> for fine-tuning and maybe DOM, Javascript, or c-MathML (or even OpenMath)
> if you want add interactivity and semantics to your encoding.

No professional I know wants to do that though. They just want to present mathematical equations in a sane way. GIFs are not a particularly sane way - they are ugly and do not scale with the text on the page but, despite this, the evidence of the web suggests that they are the best we currently have. MathML is not sane - it is too hard to author. ITeX, though far from perfect, is much, much better.

[lots of irrelevant junk about the itex2mathml output]

> Another point of disappointment is in the encoding of the differential.
> The differential is encoded as a simple variable d. There exist special
> entities defined in MathML DTD and also special Unicode fonts and the true
> is those special character were designed with accessibility in mind.
> Still, if by some reason the author wan not use the special differential
> character, one can easily see that differential is not and variable or
> identifier but a operator. Therefore, <mo>d</mo> is more accurate. The
> same error appears in the other integral.

First of all, that's a false argument.
I could just as easily have gone through my character map, found the differential-d and used that in the formula when writing the ITeX as I could when writing the MathML. The problem is, in absolute terms it's much harder than just writing "d". This is a big problem I have with any solution that requires the extensive use of characters not found on a standard US keyboard. The single best idea I've seen in this entire discussion is the text-transform:math* properties for CSS.

Now consider: what extra information would we gain by going through our character map application until we find the right codepoint to express the d operator? Presumably in each case a visual UA will display almost exactly the same thing. An aural UA will probably read out "d x" (or whatever) in either case, in the same way a human would. I guess a hypothetical computer algebra package that can accept input from the web might get confused, but that seems like such a marginal case that it's hardly worth optimizing for at the price of damaging language adoption.

> The code is how we can see very deficient even ignoring accessibility
> issues. Note that vectorial quantities are rendered in italic bold font.
> Many authors and some journals prefer roman font for vectors. Imagine you
> have 5 electronic documents containing 10 equations each one. Either you
> learn MathML (and then you are obligated to study three or four language
> even for simplest tasks) and modify by hand the 50 equations or either you
> modify the IteX source. Since the IteX source is presentational, you would
> change each \mathbf in the 50 equations (even using a macro or an
> automated search and replace the task wastes time).

Of course, as an author, one can improve this in TeX by writing a single \vec macro that changes the formatting to a vector style. Then it is a simple matter to change vector formatting everywhere with a single simple change to the macro definition.
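
As a minimal sketch of that macro approach, assuming a document that marks every vector with \vec rather than calling \mathbf directly (the sample equation and the choice of the bm package for the alternative style are illustrative assumptions, not taken from the pages under discussion):

```latex
\documentclass{article}

% Single point of control for vector styling: bold upright by default.
\renewcommand{\vec}[1]{\mathbf{#1}}

% To switch every vector in the document to bold italic instead,
% load the bm package and change only the definition above:
%   \usepackage{bm}
%   \renewcommand{\vec}[1]{\bm{#1}}

\begin{document}
\begin{equation}
  \vec{F} = m \vec{a}
\end{equation}
\end{document}
```

With this in place, restyling all 50 equations means editing one line in the preamble rather than each occurrence of \mathbf.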
So, if one wanted to make life easy for LaTeX authors who envisioned targeting the web, one could provide a package that would add some mapping onto the more semantic constructs of the target language. But the majority of authors will have legacy content that does not use these features and it must be possible to convert that legacy content to the new output format if you want it to gain any traction, even if the content produced is not so suitable for wholesale style changes at a later date (which is a feature that authors have lived without for years).

> How do encode this example in HTML-Math? Well, that may be debated here
> but a workling possibility could be (I use MathML entities by commodity,
> they could be substituted by Unicode)

[lots of musings about language design]

Note that designing a markup language that can represent maths is trivial by comparison to the task of making people use that language. My point throughout is that if you want people to use the language then backwards compatibility is key.

> And what if I send a document? Would I send the source? The
> final HTML? Both?

It depends who you're sending it to, and for what purpose, obviously. If you were sending it to a coworker for editing, you would send the original source. If you've done something fancy manipulating the DOM of the final output, you would have to send that. It's no different to LaTeX-postscript (or any other conversion process) - in 99% of cases where the postscript can be regenerated from the original source, you edit the LaTeX, in the 1% of cases where you manually edited the postscript file, you'll have to work with that from now on.

> I see no reason for limiting capabilities of a web markup by
> satisfying a subset of academicians who want not waste their time on
> learning best markup languages.

I see no point in wasting time designing a document markup language that will be roundly ignored by ~100% of the people creating content.
> Somewhat as HTML was not designed with
> LaTeX as a "document creation format" in mind but was derived from solid
> and sophisticated SGML

But HTML succeeded for 2 reasons:

1) It was simple (consider the relative semantics on offer with Docbook, for example, and the relative popularity of each)
2) It wasn't SGML. At least not for long.

Browsers brought an unprecedented ease of authoring to HTML. Sure, it has come back to bite us now, but the fact that you could send almost any garbage to a browser and get something rendered on the screen made HTML accessible to people who wouldn't otherwise have been authors.

>> I should say that, as far as I can tell, using LaTeX as the input
>> language isn't the accessibility disaster that you make out.
>
> you? Have you noted that LaTeX was ignored by Maple, Mathematica, ISO
> 12083, EuroMath, MathML, OpenMath...

Yeah and look at how many authors are using those to create content (note: the primary function of Maple and Mathematica is computer algebra, not document creation). They may be used by big publishers, I don't know, but that's utterly irrelevant. The web is primarily a self-published medium and so things have to be easy for individual authors. Big publishers also use Docbook but that doesn't mean we should be trying to use it on the web. Those creating mathematical content largely use LaTeX. If our publishing solution is not designed so that LaTeX -> foo converters produce good-looking output then the exercise will be as futile as XHTML2 is looking to be.
Received on Tuesday, 6 June 2006 07:49:26 UTC