
Re: Exploring new vocabularies for HTML

From: David Carlisle <davidc@nag.co.uk>
Date: Sat, 29 Mar 2008 22:05:05 GMT
Message-Id: <200803292205.m2TM55hC024994@edinburgh.nag.co.uk>
To: ian@hixie.ch
Cc: public-html@w3.org, www-math@w3.org


> That has been proposed, as have other options such as LaTeX, eqn/neqn, the 
> native formats of tools such as Mathematica, Maple, and OpenOffice.org 
> Math, standards such as ISO 12083, and not doing anything at all. I am 
> considering all options.

Hmm, none of those can realistically be considered for a
declarative browser environment; they are all intricately linked to
specific engines. One could define a new declarative language adopting
the flavour of one of those, but why? Experience shows it wouldn't be
popular with users of those systems (a LaTeX-like language is no easier,
and often harder, to translate to real LaTeX than a language that doesn't
look like LaTeX at all; the same goes for the other systems).

> Interesting stuff. Do these packages also natively support MathML import? 
> (To clarify, you are talking about native support, right? Not support 
> after installing third-party plugins.)

Yes, I mentioned those "mainstream" packages rather than the smaller
specialist MathML editing systems because it shows how math systems are
converging on MathML as a common format for mathematics, and HTML5
converging on something else will put us all back a decade.

Mathematica and Maple import and export MathML as part of their standard
functionality. MS Word 2007 ships with stylesheets (XSLT) going from its
internal mathematical form (OOXML) to and from MathML, and it
automatically applies those stylesheets on cut and paste, so you can cut
expressions from a browser and paste them into Word with all the
clipboard data being in MathML (and the user seeing no markup at any
stage). OpenOffice.org (and presumably all other ODF-supporting suites
such as KOffice or the IBM suite) stores mathematics in MathML, as that
is what ODF specifies.

> What problems would this introduce?

It makes it much harder to style (or at least understand the styling of)
the mathematics. One reason why MathML fully tags every token is that it
makes each token individually available for styling, something that is
far more likely to be required in math than in natural language text,
where you are less likely to want to style individual words or letters.
If the "implied" elements are available to CSS selectors then presumably
the thing is still stylable, but it is rather obscure and people have to
understand the exact nature of the implied elements in order to use
CSS. The other problem is that if editors start generating "MathML" with
htmlised math containing unmarked-up text runs such as this, then they
break the entire existing mathematical tool chain, which either has to
support this new language you are proposing, or has to explain to end
users why MathML in HTML is different from MathML as specified.
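
To illustrate the point about individually addressable tokens, here is a
minimal sketch in Python (standard library only). The expression "2x+y"
and the "untagged" form are made up for illustration; the latter is only
a guess at what an unmarked-up text run might look like, not any actual
proposal. An existing tool looking for identifiers finds them in the
fully tagged form and finds nothing in the text run.

```python
import xml.etree.ElementTree as ET

NS = "http://www.w3.org/1998/Math/MathML"

# Fully tagged presentation MathML: every token has its own element.
tagged = ET.fromstring(
    f'<math xmlns="{NS}"><mrow>'
    "<mn>2</mn><mi>x</mi><mo>+</mo><mi>y</mi>"
    "</mrow></math>"
)

# Hypothetical unmarked-up text run (an assumption, for contrast only).
untagged = ET.fromstring(f'<math xmlns="{NS}">2x+y</math>')


def identifiers(root):
    """Return the text of every <mi> element, in document order."""
    return [el.text for el in root.iter(f"{{{NS}}}mi")]


tagged_ids = identifiers(tagged)      # the two identifiers are found
untagged_ids = identifiers(untagged)  # nothing to select or style
```

The same applies to a CSS selector such as "mi": it has something to
match only when the tokens are explicitly marked up (or when the implied
elements are precisely defined and exposed).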

The rules for inferring what's a number, what's an identifier, and what's
a sequence of identifiers with invisible times operators between them
really aren't simple, especially if you move away from ASCII (as surely
you would have to), so you would probably end up having to refer to large
Unicode character tables to decide what's a number etc. A feature of
presentation MathML is that those decisions get made by the author (who
hopefully best understands the expression) and aren't left to be
inferred by a later system.
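
As a rough sketch of why such inference ends up leaning on Unicode
tables, the following Python guesses a token element from a character's
general category. The classify function and its three rules are
hypothetical and deliberately naive; they are not taken from any
specification.

```python
import unicodedata


def classify(ch):
    """Naively guess a presentation MathML token element for one character."""
    category = unicodedata.category(ch)
    if category == "Nd":          # decimal digit, in any script
        return "mn"
    if category.startswith("L"):  # any kind of letter
        return "mi"
    return "mo"                   # everything else: treat as an operator


samples = {
    "2": classify("2"),            # ASCII digit
    "\u0663": classify("\u0663"),  # ARABIC-INDIC DIGIT THREE
    "x": classify("x"),            # Latin letter
    "\u221e": classify("\u221e"),  # INFINITY, a math symbol
    "\u2146": classify("\u2146"),  # DOUBLE-STRUCK ITALIC SMALL D, a letter
}
```

Even this needs the full category table to recognise non-ASCII digits,
and it still guesses in ways an author might not want: the differential
d is a letter by category, so it comes out as an identifier, although an
author would plausibly mark it as an operator. It also says nothing
about whether "xy" is one identifier or x times y.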

> <semantics> and <annotation-xml> are nice in theory, I agree, but are they 
> really necessary? While I understand that math experts today might use 
> them, it seems highly unlikely that the mass market would ever bother.

Current experience shows you are entirely wrong here: the mass market
uses this probably more than the "expert" hand-writing the MathML in an
XML editor. OpenOffice.org-generated MathML, for example, is always in a
semantics element with an annotation carrying the OpenOffice.org
linear syntax; Design Science's editors do something similar. Maple can,
at user option, write just presentation MathML, just content MathML, or
a semantics element with presentation annotated by content.
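
For readers who haven't seen one, this is roughly the shape of a
semantics element carrying a presentation tree plus a text annotation,
read back with Python's standard library. The expression and the
encoding string "StarMath 5.0" are assumptions chosen for illustration;
real exports from any given tool may differ in detail.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative document: presentation MathML annotated with a linear syntax.
doc = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>
    <annotation encoding="StarMath 5.0">x + 1</annotation>
  </semantics>
</math>
"""

root = ET.fromstring(doc.strip())
ns = {"m": MATHML_NS}

semantics = root.find("m:semantics", ns)
presentation = semantics[0]                        # first child: what is rendered
annotation = semantics.find("m:annotation", ns)    # the tool's own source form

source_syntax = annotation.text
encoding = annotation.get("encoding")
```

A renderer only needs the first child; the originating tool can recover
its own input from the annotation on re-import.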

> Something else that would be useful is a summary of the MathML schema. I 
> couldn't find anything human-readable in the MathML specs, and the DTD is 
> not optimised for casual reading. Is there anything like that available?

For MathML3 we are authoring the schema in RELAX NG and deriving (or I
should say will derive) XSD and DTD. Actually, though, the authoring of
the formal schema is lagging behind the prose text of the
specification. If you have any particular style of comment
annotation that you'd find helpful, drop us a line and we'll see what we
can do.
current draft of presentation part is

> That kind of wishy-washy rule isn't going to fly for HTML5. :-)
Quite, but you have "control of the browser". Back when doing MathML2 it
was clear that you couldn't (in a specification of a language designed
to be a fragment of a larger document) specify global rules for
language merging and error handling, so "wishy washy" was the best we
could do. Make no mistake, lack of specification in this area does not
imply lack of interest in seeing these problems get fixed!

> That seems unnecessary; HTML and MathML together should be defined in 
> enough detail that no other spec is required to define how they work 
> together, IMHO.

Whatever. Historically it's been clear that the math group couldn't do
this on their own, but advice has oscillated over whether it should
be done by HTML (remember it began as html-math), or general namespace
magic, or specific CDF magic, or... I think if _someone_ specifies
how this is supposed to work in an interoperable way we'd all be
happy :-) Actually we'd be happy from a math-on-the-web perspective, but
if it does turn out to be specified by simply specifying
html+mathml(+svg) rather than referring to a specific mechanism for
combining languages, we'd probably need to port the effect of the
html+mathml mixing to other formats (DocBook, TEI and XSL-FO are, for
example, three common XML formats that commonly embed MathML for formulae
and might conceivably want to embed the host language within the MathML
formula). But that isn't (directly) your problem; if you can help us sort
out the interaction of html + mathml, we can probably do something for
the others....

Best if I again note these are personal comments, not checked with the WG.


Received on Saturday, 29 March 2008 22:05:52 GMT
