[whatwg] Mathematics in HTML5

On 9 Jun 2006, at 11:0AM, juanrgonzaleza at canonicalscience.com wrote:

>Øistein E. Andersen wrote:

>>2) Fight verbosity

>><m>, [...] <frac>2<den>3</frac> and <root>3<of>125</root> [are] clearly
>>better suited than <formula>, <fraction>2<denominator>3</fraction> and
>><radical>3<radicand>125</radical>.

>However <frac>2<den>3</frac> is a shorthand for the full markup,
>because structures of the kind {2 \over 3} are to be avoided even in TeX.

><root>3<of>125</root> was already proposed in HTML Math of 1994 and
>rejected because of technical issues. Also rejected in ISO 12083 math of 1995.

What I meant was to use <root>3<of>125</root> as a shorthand notation for something like <root><order>3</order><of>125</of></root>, in which case only the actual element names differ from the current proposal.
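
As a rough sketch of the expansion meant here, the shorthand could be rewritten mechanically into the full form. Python is used purely for illustration; the function name expand_root and the regular expression are assumptions, not part of any proposal, and only flat (non-nested) roots are handled.

```python
import re

# Hypothetical helper, for illustration only.
# Matches the flat shorthand <root>N<of>X</root>; nested roots would need
# a real parser rather than a regular expression.
_SHORTHAND = re.compile(r"<root>([^<]*)<of>([^<]*)</root>")

def expand_root(markup: str) -> str:
    # Rewrite the shorthand into <root><order>N</order><of>X</of></root>.
    return _SHORTHAND.sub(r"<root><order>\1</order><of>\2</of></root>", markup)

print(expand_root("<root>3<of>125</root>"))
```

Markup that is already in the full form does not match the pattern and is left unchanged, so the two notations could coexist in one document.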

>>3) Assure compatibility with a reasonable subset of TeX

>The absence of a model for prescripts is one of the most important flaws in TeX,
>so do not expect that TeX input can be magically transformed into HTML 5.

Obviously, it will not be possible to transform any TeX code into HTML 5.

Something like ${}^aB$ could be transformed into an HTML 5 prescript given the correct rules, but then something like ${}^{342}_4X$ would of course look different in TeX (probably incorrect) and HTML 5.

>>4) Make font selection simple and natural

>There are many options. In HTML, roman font is the default and one just marks
>variables, as when one uses <i> for italic font. In the Elsevier DTD for math,
>italic was the default and roman was marked via a tag.

Very well, but then a choice must be made.

>I do not think that automatic mixing of roman and italic would be a good
>idea on the browser side if one seeks a rapid, cheap implementation fully
>compatible with current standards.

That is probably quite right.

>However, this would not be a problem for authors, because one could
>implement a small js in a week that authors could use on their computers
>to assist them in authoring math.

Such a script would certainly not fit everyone's needs and desires. It could be potentially useful to many, but the language should be such that hand-authoring be practical -- otherwise, the perfect integration with HTML will be lost.

>>How are non-italic variables supposed to be handled? Using attributes,
>>like <var class="italic">, <var class="bold">, <var
>>class="blackletter">, <var class="roman">, etc. may be part of the
>>solution, even though it would be quite verbose.

>HTML is more verbose than TeX but is less erratic.

That is a fair point.

>I think that people can perfectly well use
><var class="vector">F</var>
>instead of
>\mathbf{F}
>If you dislike the class attribute, then try something like
><var><b>F</b></var>

A few issues still remain to be solved, though:

Boldface does not necessarily mean vector, and vectors are not always printed in bold type. Presumably, you mean that classes like `vector' need not be defined in the specification, that the choice is up to the author, and that a custom CSS style-sheet can be used to define the font. (This would require CSS font-families for Fraktur and double-struck/blackboard bold.)

This approach would entail introducing semantic or quasi-semantic mark-up to encode an important part of a formula's visual appearance. Obviously, LaTeX commands like \mathcal and \mathbb indicate no semantics, so the only sensible solution would be to transform them into something like <var class="cal"> and <var class="bb">. If this is going to happen, the classes should probably be defined in the specification.
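
To make the transformation concrete, here is a minimal sketch in Python of mapping LaTeX font commands onto <var class="...">. Only the class names "cal" and "bb" come from the paragraph above; "frak" and "bf", the name tex_fonts_to_var, and the restriction to single-letter arguments are assumptions made for the example.

```python
import re

# Assumed mapping from LaTeX math-font commands to class names.
# "cal" and "bb" are taken from the discussion; "frak" and "bf" are
# illustrative additions, not defined anywhere.
FONT_CLASSES = {
    "mathcal": "cal",
    "mathbb": "bb",
    "mathfrak": "frak",
    "mathbf": "bf",
}

# Matches a font command applied to a single letter, e.g. \mathbb{R}.
_FONT_CMD = re.compile(r"\\(mathcal|mathbb|mathfrak|mathbf)\{([A-Za-z])\}")

def tex_fonts_to_var(tex: str) -> str:
    # Replace each matched command with <var class="...">letter</var>.
    def repl(m: re.Match) -> str:
        return f'<var class="{FONT_CLASSES[m.group(1)]}">{m.group(2)}</var>'
    return _FONT_CMD.sub(repl, tex)

print(tex_fonts_to_var(r"\mathbb{R}"))
```

Such a table only works if the class names are fixed somewhere, which is the argument for defining them in the specification rather than leaving them to each author.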

The re-use of <b> and <i> makes sense in a way, but these two tags do not suffice to access all the different fonts. (Unicode report [1] lists 14 alphabets: roman, italic, script, Fraktur, sans-serif upright, sans-serif slanted; bold versions of all these; double-struck and monospace.)
    [1] http://www.unicode.org/reports/tr25/
It is not clear that <b><i> (or <i><b>) would be a good choice for bold italic variable names, as it would lead to encoding e.g. <i>a <b>b</b></i> (could be a vector b scaled by a factor a), whereas something more like <i>a</i> <bi>b</bi> would be wanted. Another possibility would be to use something like <var i>a</var> <var b i>b</var>.

Each approach has its problems. Anyway, the specification should probably not try to avoid the issue of font selection.

-- 
Andersen

Received on Friday, 9 June 2006 16:27:29 UTC