Re: interesting developments -- Was: Enhanced support ...

From: Bruce Miller <bruce.miller@nist.gov>
Date: Fri, 21 Sep 2007 09:21:13 -0400
To: www-math@w3.org
Message-id: <46F3C549.6090207@nist.gov>

William F Hammond wrote:
> Thanks for the reply.
> 
> Michael Kohlhase <m.kohlhase@jacobs-university.de> writes:
> 
>>>    arXMLiv:  http://kwarc.info/projects/arXMLiv/
>>>
>>> I can see statistics, but I'm not able to see the translated
>>> results.  Are they available?
>>>
>> Bill, this is a pet project of Bruce's and mine, the statistics and
>> the transformation currently only apply to the first (and critical)
>> part of the transformation from LaTeX to a LaTeX-near XML format of
>> Bruce's. We want to go for the next step when we hit 50% success rate
>> in the first step (this should be relatively soon).
> 
> Translating LaTeX is not my recommendation for new documents.

Yes, perhaps; it kinda depends on author behavior
and at what stage you enforce good behavior.
The earlier you enforce it, the more successful
it is, provided they don't give up and use something
easier.

> Nonetheless it is important for legacy documents.  My guess
> (only a guess) is that one will never get much beyond 60% because
> of author glitches that LaTeX tolerates (unless authors get
> better).  I cited a well-prepared (and interesting) document that
> contains the equivalent of $(x$) at one point.

That's pretty typical; one could imagine some heuristics
to patch them up in some cases...
But "success" is more of a sliding scale. If you're
more after content (which, of course, most of us
ultimately are), I'd suspect 60% is too high an estimate.
For presentation, I'd say it's too low.  That's for legacy
documents; if you've got any leverage over authors, it
can all go better.
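
To make the "heuristics to patch them up" remark concrete, here is a minimal
sketch of one such heuristic for the $(x$) case. It is purely illustrative and
is not something the arXMLiv conversion is known to do; the names
patch_stray_paren and paren_balance are hypothetical, and the sketch assumes
simple $...$ inline math with no escaped dollar signs or display math.

```python
import re

# Assumption: inline math is a plain $...$ span with no nested or escaped '$'.
INLINE_MATH = re.compile(r"\$([^$]+)\$")


def paren_balance(s):
    """Return the number of '(' minus the number of ')' in s."""
    return s.count("(") - s.count(")")


def patch_stray_paren(text):
    """Move a ')' that directly follows an inline math span into that span,
    but only when the span contains exactly one unmatched '('."""
    out = []
    pos = 0
    for m in INLINE_MATH.finditer(text):
        body = m.group(1)
        end = m.end()
        if paren_balance(body) == 1 and text[end:end + 1] == ")":
            out.append(text[pos:m.start()])   # text before the math span
            out.append("$" + body + ")$")     # span with ')' pulled inside
            pos = end + 1                     # skip the stray ')'
    out.append(text[pos:])                    # remainder, untouched
    return "".join(out)


if __name__ == "__main__":
    print(patch_stray_paren("the value $(x$) is positive"))
```

A patch like this only covers one narrow pattern, and it can guess wrong (the
author may have meant a different grouping), which is why it would only help
"in some cases".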

> I do agree that one will get best possible results by translating
> to a document type that models LaTeX.
> 
>                                     -- Bill
> 
> 


-- 
bruce.miller@nist.gov
http://math.nist.gov/~BMiller/
Received on Friday, 21 September 2007 13:20:48 UTC