[whatwg] Mathematics in HTML5 from juanrgonzaleza@canonicalscience.com on 2006-06-04 (public-whatwg-archive@w3.org from June 2006)

From: <juanrgonzaleza@canonicalscience.com>
Date: Sun, 4 Jun 2006 06:33:28 -0700 (PDT)
Message-ID: <3093.217.124.69.253.1149428008.squirrel@webmail.canonicalscience.com>
James Graham wrote:
>
> No. I propose that the [X|HT]ML syntax follows the LaTeX model as
> closely as  possible within the constraints imposed by the XML data
> model. This should make  it easy for people to write converters which is
> the _only_ thing that matters  for high adoption.

Do you claim that the high adoption of HTML for textual content and of SVG
(and recently canvas) for graphics is due _only_ to the existence of LaTeX
converters? Do you also suggest that the increasing importance of MS Word
for submitting scientific documents (even Physical Review accepts MS Word
submissions today) is due _only_ to the existence of LaTeX converters? Or
are there other points to be evaluated?

I know many people who write equations and have never (NEVER) used a
TeX-LaTeX system. Those people would welcome a simple and cheap way of
publishing mathematical equations on the web in HTML, without complicated
technology updates and without the exhausting business of dealing with
extensions, MIME types, namespaces, special tools, plugins, fonts, or
specialized browsers. For those millions of people, the existence of a
good LaTeX converter does not matter.

Note, still, that I said in a previous message that conversion from LaTeX
to HTML-Math would be cheap compared to conversion to MathML.

> Using border only may be more reliable but it is unacceptably poor. This
> issue  must be solved. If it cannot be solved within the CSS model, CSS
> must be  improved or the enterprise must be dropped. There is no point
> in wasting time on  something that produces embarrassingly
> unprofessional looking output.

The same directive then applies to the generation of HTML pages or MathML
fragments from TeX-LaTeX-IteX sources. Many such pages look completely
unprofessional to a web designer. For instance, since HERMES confuses
differential operators with d variables and simulates prescripts via empty
<mrow> elements, and since Distler's IteX splits decimal numbers in a
stupid way and encodes (ds)^2 as 2s ds, both approaches should then be
dropped. True?


"Alexey Feldgendler" wrote:
>
> Here is what I can add:

While I agree on the following points, it would be interesting to compare
them with what MathML 2.0 can offer us in a browser with native support,
such as Firefox.

> * perfectly kerned x/y style fractions (often poorly simulated in HTML
> as   <sup>x</sup>/<sub>y</sub>)

Approved?

Current CSS can offer interesting results:

17 <span class="frac"><sup>13</sup><b>&#8260;</b><sub>32</sub></span>

span.frac sup, span.frac sub
        {font-size: 68%; vertical-align: baseline; position: relative;}
span.frac sup {top: -0.5em; left: 0.1em;}

> * correct continuation of long fractions on the next line

No

> * stacking of multiple over/underscripts

Yes

> * stacking of multiple signs like tildes, arrows etc above variables

Yes?

> * stretching of tildes etc over complex expressions

No

> * stretching of brackets and integrals around complex expressions

No

> * matrices with cells of uniform size (as to accommodate for the largest
>  expression found)

Yes?

> * nice embedding of inline formulae in paragraphs of text (without
> unnecessarily increasing line spacing)

And reducing font size?

>>> So far people mentioned radicals and glyph shaping/kerning.
>
>> Another obvious issue is stretchy characters like integral signs and
>> brackets. Is the CSS model powerful enough to allow for this? If not,
>> the model needs to improve.
>
> TeX doesn't scale glyphs. It selects glyphs of different sizes, and for
> those that are larger than the largest glyphs available, it uses a pair
> of glyphs for the ends and fills the space between them with the third
> glyph (a line segment). But this approach is not possible in today's
> CSS, either.

Not in MathML.
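
As a rough sketch of the selection scheme described in the quote above
(not TeX's actual code), the logic could look like the following Python.
The function name, the parameter names, and all sizes are hypothetical.
Real TeX fonts carry this information in their metrics and also allow an
optional middle piece, which is left out here.

```python
import math


def build_delimiter(target_height, variant_heights, end_height, extender_height):
    """Pick or assemble a delimiter at least `target_height` tall.

    All arguments are hypothetical sizes in the same integer unit:
    variant_heights  -- heights of the ready-made glyphs of increasing size
    end_height       -- height of each of the two end pieces
    extender_height  -- height of the repeatable filler segment
    """
    # Step 1: prefer the smallest ready-made glyph that is tall enough.
    for height in sorted(variant_heights):
        if height >= target_height:
            return {"kind": "single", "height": height}

    # Step 2: nothing is tall enough, so use two end pieces and fill the
    # gap between them with as many extender segments as needed.
    remaining = target_height - 2 * end_height
    repeats = max(0, math.ceil(remaining / extender_height))
    return {
        "kind": "assembled",
        "extenders": repeats,
        "height": 2 * end_height + repeats * extender_height,
    }


# Hypothetical bracket with three ready-made sizes.
print(build_delimiter(15, [10, 18, 24], end_height=6, extender_height=5))
print(build_delimiter(40, [10, 18, 24], end_height=6, extender_height=5))
```

The first call is satisfied by a ready-made glyph; the second falls back
to the assembled form because no variant reaches the requested height.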


Michel Fortin wrote:
>
> I'm pretty sure that with SVG and CSS 3 border-image[1] it wouldn't be
> too hard to have professional looking scalable radicals,
> integrals, and brackets. Matrices would be taken care of by inline
> tables, fractions by inline blocks.

I do not know about CSS 3. There are already SVG formatters for math with
TeX quality (though they need TeX fonts, of course).

> What could prove a little harder is positioning of integral
> endpoints, as well as lower and upper bounds of summation and product
> symbols, without resorting to awkward markup.

Have you seen the XML-MAIDEN DTD? Its markup for under- and overscripts is
nicer than MathML's.

> But it'd certainly be a lot easier for browser implementors to add
> some math-specific CSS properties for the missing parts than to
> create a full MathML implementation.

Of course! In only months, George has been able to display math in
almost any current browser (the limitations come from missing support for
some CSS properties). After 10 years, several specifications, and a lot of
hype, MathML is partially (less than half) supported in Mozilla Firefox
and its "brothers". MSIE needs a third-party plugin, and other browsers do
not plan to support MathML.

Moreover, MathML cannot be easily improved; e.g. nobody has been able to
develop a CSS math module for MathML, due to MathML's awful (incompatible)
design.

Michel Fortin wrote:
>
> On 2 June 2006 at 5:08, White Lynx wrote:
>
>> 1) Which markup do you think fits better in the scope of HTML5?
>> 	a)
>> 		<div>
>> 		(X)HTML document may contain math formulae, like
>> 		<formula>
>> 		ax<sup>2</sup> + bx + c = 0
>> 		</formula>
>> 		</div>
>
> While this may be better than the MathML counterpart, I'd prefer this
> markup:
>
>      <p>
>      (X)HTML document may contain math formulae, like
>      <formula>
>      <var>a</var><var>x</var><sup>2</sup> +
>      <var>b</var><var>x</var> + <var>c</var> = 0
>      </formula>
>      </p>
>
> It's more verbose than what you suggested, but still way simpler than
> MathML.
>
> The advantage of this notation is that a software tool could deduce
> the semantics using the following rules:
>
> *   Each <var> element represents a variable (permitting words to be
>      used as variable when appropriate).

Good point. Moreover, this fits my previous point that one can reuse HTML
elements instead of inventing new ones. In MathML you would use <mi>,
which duplicates the already available <var>.

> *   <sup> contains the exponent of the preceding element or number.

And if one wants to group elements, one could reuse <span>.

> By understanding "+" and "=" as operators, "0" as a number and by
> applying the usual operator precedence, a tool could convert that to
> something understandable by other math software.

Indeed, there is no reason to encode every time what an operator or a
number is. If on some occasion the meaning differs from the default, one
could use <var>, <span>, or something else. For example, 1 is usually a
number, but it could be a matrix; then you could write
<var class="matrix">1</var> or something similar.
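
To make the idea concrete, here is a minimal Python sketch of such a tool,
built on the standard html.parser module. It assumes well-formed input
that uses only <var>, <sup>, and bare text; the class name, the function
names, and the infix output format are hypothetical choices, not part of
any proposal in this thread.

```python
import re
from html.parser import HTMLParser

NUMBER = r"\d+(?:\.\d+)?"
OPERATORS = {"+", "-", "*", "/", "="}


class FormulaTokenizer(HTMLParser):
    """Hypothetical tokenizer applying the rules quoted above."""

    def __init__(self):
        super().__init__()
        self.tokens = []  # list of (kind, text) pairs
        self.stack = []   # names of the currently open elements

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        context = self.stack[-1] if self.stack else None
        if context == "var":
            # Each <var> is one variable, even a whole word or a "1"
            # that has been marked up to override the default meaning.
            if data.strip():
                self.tokens.append(("var", data.strip()))
            return
        for piece in re.findall(NUMBER + r"|[^\s\w]|\w+", data):
            if context == "sup":
                self.tokens.append(("exponent", piece))
            elif re.fullmatch(NUMBER, piece):
                self.tokens.append(("number", piece))  # default: a number
            elif piece in OPERATORS:
                self.tokens.append(("operator", piece))
            else:
                self.tokens.append(("text", piece))


def tokenize(markup):
    parser = FormulaTokenizer()
    parser.feed(markup)
    parser.close()
    return parser.tokens


def to_infix(tokens):
    """Join tokens into a string, making juxtaposition an explicit '*'."""
    out, prev = [], None
    for kind, text in tokens:
        if kind == "exponent" and out:
            out[-1] += "^" + text  # exponent binds to the preceding item
        elif kind in ("var", "number") and prev in ("var", "number", "exponent"):
            out[-1] += "*" + text
        else:
            out.append(text)
        prev = kind
    return " ".join(out)


sample = """<formula>
<var>a</var><var>x</var><sup>2</sup> +
<var>b</var><var>x</var> + <var>c</var> = 0
</formula>"""
print(to_infix(tokenize(sample)))  # a*x^2 + b*x + c = 0
```

Note that nothing marks "+", "=", or "0" in the input: the defaults are
enough, and <var> is needed only where the default reading would be wrong.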


> Of course, people could still write equations in a non-verbose/non-
> semantical way, without <var>, but nothing is going to prevent that
> anyway. What's interesting is that if you forget some <var> tags, you
> notice it immediately from the browser rendering as the variables
> aren't italic. There is no tag with "invisible" effect.

Also, there is ";" in text, but people often do not use punctuation when
writing. That is a question of education, not a sign that semicolons are
not useful.

> The other point I'd like to make is that a formula element shouldn't
> be required for all mathematical expressions. If I want to talk about
> variable x in the middle of a paragraph, I shouldn't need to surround it
> like this: <formula><var>x</var></formula>. Using <var>x</var> ought
> to be sufficient. The same applies if I want to include x^2 in the
> text, <var>x</var><sup>2</sup> should be enough.

In fact, this is usual in the mathematical DTDs of academic publishers.
The simplest formulae are not encoded with full mathematical markup. E.g.
Elsevier encodes the simplest equations or single variables as plain text
with sub or sup tags.


Juan R.

Center for CANONICAL |SCIENCE)
Received on Sunday, 4 June 2006 06:33:28 UTC