Re: Exploring new vocabularies for HTML

Ian --

You write:

> I would expect that we would allow the xmlns="" attribute on <math> to 
> have the MathML namespace, in the same way as we allow xmlns="" on <html> 
> to contain the XHTML namespace. It wouldn't have any effect, though.

Similarly, I think I hear people saying that the non-presentational part
of <semantics> should be allowed in source, clippable in defined
circumstances, and visible in serializations, but inert for rendering.

>> (2) may have omitted end tags,
> Well, so can the XML syntax, the difference would be that it wouldn't 
> cause a fatal error. Whether it is a syntax error or not is up for 
> debate, though I can certainly see strong reasons to make omitting 
> closing tags optional in MathML-in-text/html.

If these author services in <math> are going to be included (not that
I've seen assurance that it will ever actually happen), then I think
there's substantial possibility for omitting both openers and closers,
but less possibility for omitting only closers.

>> (6) and may in the extreme case, even omit tags for token elements 
>> (<mo>, <mi>, <mn>).
> Possibly.

I don't concede that there should be author services.  After all,
LaTeX authors don't commonly hack .dtx files.  What's the problem
with the idea of translating from a separate authoring language?
Wiki-style "markdown" is a well-known example.

But if there are going to be author services, they should not
be half-hearted:

So, the handling does something like this:
1.  White space in math cdata is insignificant and ignored.
2.  Recognize strings that are numbers (which in the
    western world may contain periods and commas); wrap in <mn>
3.  The two-character string "+-" indicates &PlusMinus; and "-+"
    indicates &MinusPlus;.
4.  Follow the TeX convention that loose individual word characters
    (say, in the sense of Unicode-enabled Perl) are symbols, i.e.,
    wrap in <mi>.
5.  Non-word characters are by default operators, i.e., wrap in <mo>.
6.  So, for example, one needs to be explicit about an indicator that
    is a string of length two or more, e.g., <mi>Hom</mi>
7.  Don't gratuitously insert &InvisibleTimes; or &ApplyFunction;.
    (In presentation markup it is reasonable to assume that there is
    always a default meaning for juxtaposition.)
8.  Braces {,} are TeX-like, i.e. invisible; they spawn <mrow>.
9.  Convention for superscripts and subscripts: Superscripts, indicated
    with ^, and subscripts, indicated with _, need bracing except when
    the script is a single (Unicode) character.
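
A minimal sketch of rules 1 through 5 in Python, to make the idea
concrete.  The names TOKEN, tokenize and to_mathml are made up for this
illustration and belong to no existing tool.  Braces, ^ and _ are simply
passed through as <mo> here, to be dealt with in a later pass.

```python
import re
from html import escape

# Hypothetical token pattern for rules 1-5; alternatives are tried in order.
TOKEN = re.compile(
    r"""
    (?P<mn>\d+(?:[.,]\d+)*)   # rule 2: digits with interior periods or commas
  | (?P<pm>\+-|-\+)           # rule 3: the two-character strings +- and -+
  | (?P<mi>[^\W\d_])          # rule 4: one Unicode letter
  | (?P<mo>\S)                # rule 5: any other non-space character
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Return a list of (element_name, content) pairs for a math segment."""
    tokens = []
    # finditer skips whatever does not match, which here is only white space
    # (rule 1), since every non-space character matches the <mo> alternative.
    for match in TOKEN.finditer(text):
        kind, content = match.lastgroup, match.group()
        if kind == "pm":
            # U+00B1 is PLUS-MINUS SIGN, U+2213 is MINUS-OR-PLUS SIGN.
            tokens.append(("mo", "\u00b1" if content == "+-" else "\u2213"))
        else:
            tokens.append((kind, content))
    return tokens


def to_mathml(text):
    """Wrap each token in its element; no grouping or scripts are handled."""
    return "".join(
        f"<{kind}>{escape(content, quote=False)}</{kind}>"
        for kind, content in tokenize(text)
    )
```

With this sketch, to_mathml("a b") and to_mathml("ab") both give
<mi>a</mi><mi>b</mi>, and "146,382" comes out as a single <mn>.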

[ Yes, it is expensive to convert


  to <msup><mrow>x</mrow><mrow>y</mrow></msup>.  You need to build
  the DOM and then look at what surrounds <mo>^</mo> or <mo>_</mo>.
  Authoring services are intrinsically expensive.                    ]
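
A rough sketch of that second pass, assuming the input has already been
cut into (element, content) pairs with {, }, ^ and _ arriving as <mo>
tokens.  The function names are hypothetical, and it works on a flat
token list rather than a real DOM.  Stacked scripts are folded left to
right, which is a simplification.

```python
from html import escape

SCRIPTS = {"^": "msup", "_": "msub"}


def parse_atom(tokens, i):
    """Parse one token or one braced group; return (markup, next_index)."""
    if i >= len(tokens):
        raise ValueError("missing operand")
    kind, text = tokens[i]
    if kind == "mo" and text == "{":
        # Rule 8: braces are invisible and spawn an <mrow>.
        nodes, i = parse_seq(tokens, i + 1, in_brace=True)
        return "<mrow>" + "".join(nodes) + "</mrow>", i
    if kind == "mo" and text in ("}", "^", "_"):
        raise ValueError(f"unexpected {text!r}")
    return f"<{kind}>{escape(text, quote=False)}</{kind}>", i + 1


def parse_seq(tokens, i=0, in_brace=False):
    """Parse a run of atoms, folding ^ and _ around their neighbours."""
    nodes = []
    while i < len(tokens):
        kind, text = tokens[i]
        if kind == "mo" and text == "}":
            if not in_brace:
                raise ValueError("unbalanced '}'")
            return nodes, i + 1
        if kind == "mo" and text in SCRIPTS:
            if not nodes:
                raise ValueError("script with no base")
            # Rule 9: the script is a single token or a braced group.
            script, i = parse_atom(tokens, i + 1)
            base = nodes.pop()
            tag = SCRIPTS[text]
            nodes.append(f"<{tag}>{base}{script}</{tag}>")
            continue
        node, i = parse_atom(tokens, i)
        nodes.append(node)
    if in_brace:
        raise ValueError("unbalanced '{'")
    return nodes, i
```

Note that unbalanced braces are reported as errors here rather than
silently repaired; what a browser should do in that case is a separate
question.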

Beyond that:

    Consider allowing braces { and } to have their LaTeX meaning,
    i.e., serve as shortrefs for <mrow> and </mrow>.  Then if one
    wants a brace as a symbol, say as a fence operator, one writes
    <mo>{</mo> or even perhaps \{.  Caution:  In translation from
    LaTeX, unbalanced parentheses and other (visible) groupers (which
    will survive a latex run) are a major obstruction to successful

>> The rules for inferring elements are going to get very complicated very 
>> fast. For instance, does
>>      146,382

K-12 rules here; it's <mn>.

But consider providing an author-friendly way to input
vectors (more precisely, I mean ordered lists).  "mfenced"
is a very useful thing with a lousy name.

>> What about
>>      a b

This means <mi>a</mi><mi>b</mi>.
In LaTeX the other case would be marked up
with something like \mbox{ab}.

Now we come to the following that is really not author-friendly
at all:

> <math>
>  <mi>x <mo>=
>  <mfrac>
>   <mrow>
>    <mo>- <mi>b <mo>&PlusMinus;
>    <msqrt>
>     <msup> <mi>b <mn>2
>     <mo>- <mn>4 <mo>&InvisibleTimes; <mi>a <mo>&InvisibleTimes; <mi>c
>    </msqrt>
>   </mrow>
>   <mrow>
>    <mn>2 <mo>&InvisibleTimes; <mi>a
> </math>

With straightforward math segment authoring services (see things like
Robert Miner's webeq, Peter Jipsen's asciimath, David Harvey's blahtex,
Davide Cervone's jsMath, ...), but using notational conventions outlined
above, this could be:

x = <mfrac>{-b +- <msqrt>{b^2 - 4ac}}{2a}</mfrac>

(It would be sane here to allow omission of the mfrac closing tag.)

                                    -- Bill

Received on Tuesday, 1 April 2008 15:11:55 UTC