Re: Exploring new vocabularies for HTML

Bruce Miller wrote:
> [I'm speaking for myself, here; not for the Math WG, nor even NIST]
> Henri Sivonen wrote:
>> On Mar 30, 2008, at 00:23, David Carlisle wrote:
>>> But as I just replied to Ian, annotation-xml for annotating
>>> presentation MathML with content is used a lot, and annotation is often
>>> used to annotate mathematics with alternative (often original source)
>>> forms such as OpenOffice.org syntax, or Maple, or TeX...
>> I think annotation and annotation-xml are harmful when used as 
>> alternative representations of a MathML subtree as opposed to a 
>> validator-pleasing escape hatch to SVG.
>> Why would anyone include an alternative format for a MathML subtree 
>> unless they expected a human or a piece of software to process the 
>> alternative format instead of MathML? That leads to a situation where 
>> different consumers use different alternatives that might not be 
>> equally expressive and in sync.
> I think there's some confusion here --- tho' it may be mine.
> I personally think the most compelling case for annotations,
> especially in a web context, is to provide presentation MathML
> for display to humans, along with the corresponding content
> MathML (when available) for export to applications (or perhaps
> for audio rendering, or ...).  For any non-trivial math, for
> software to infer the meaning from the presentation is really
> just a wild guess; humans do somewhat better.  Perhaps the
> notion of a "semantic web" has lost its popularity, but this
> use case seems to be exactly what a semantic, and accessible,
> web needs.
While I share the concern that redundant representations can get out of 
sync in an environment where HTML+Math snippets are edited for reuse, 
much of the Web is not that way. Most of the content is never 
reused or changed, so maybe we just have to live with the danger :-).

To restrict the discussion I would just drill in on the use case 
(important to the Math WG folks) where we annotate presentation math 
with content Math and point out that the two representations are not 
redundant. Even in the K-14 fragments that MathML covers, we have 
situations where, on the one hand, there are multiple traditional/national 
notations for the same mathematical object and, on the other hand, 
distinct mathematical objects that share the same notation. If we go to 
higher math or aural representations, or to more international contexts, 
then the problem increases. It is our experience that even though math 
is touted as "international", it does have its accessibility issues on 
the web, and mixed/parallel presentation/content markup is a pragmatic 
answer for this. Ideally, we would be able to infer presentation from 
content and vice versa to cut down on "redundancy", but in an 
international, culturally mixed environment like the Web this seems 
too difficult.

One use case that (I think) highlights the utility of the mixed markup 
is the case of mathematical search engines (which admittedly are not 
mainstream yet). These index math representations (in whatever form they 
can process), normalize them wrt. content (in some internal index 
format), and then answer content-oriented queries about mathematical 

As general search engines work on the surface web, they can expect to 
find presentation-oriented math in web pages.  As 
presentation-to-content conversion is a necessarily heuristic process, it 
is _much better_ to let the author specify the content object hidden in 
the web page if she so chooses, and it would be good if HTML5 allowed 
for that by supporting <semantics>/<annotation-xml> ...
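
To make the parallel-markup case concrete, here is a minimal sketch of what 
an indexer could do with such a page. The fragment, the function names 
(content_annotation, local_name) and the choice of Python are my own 
illustration, not anything taken from an existing search engine. The 
presentation part "f(x)" could be read as function application or as a 
product; the author-supplied content annotation says it is an application, 
so the consumer does not have to guess.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical page fragment: presentation MathML for display, with the
# author's content MathML attached through <semantics>/<annotation-xml>.
FRAGMENT = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><mi>f</mi><mo>(</mo><mi>x</mi><mo>)</mo></mrow>
    <annotation-xml encoding="MathML-Content">
      <apply><ci>f</ci><ci>x</ci></apply>
    </annotation-xml>
  </semantics>
</math>
"""


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return element.tag.rsplit("}", 1)[-1]


def content_annotation(math_source):
    """Return the first content MathML element the author supplied, or None.

    Only annotation-xml elements whose encoding is "MathML-Content" are
    considered; other annotations (TeX, Maple, ...) are ignored here.
    """
    root = ET.fromstring(math_source.strip())
    for annotation in root.iter("{%s}annotation-xml" % MATHML_NS):
        if annotation.get("encoding") == "MathML-Content":
            children = list(annotation)
            if children:
                return children[0]
    return None


if __name__ == "__main__":
    content = content_annotation(FRAGMENT)
    if content is None:
        print("no content annotation; fall back to heuristics")
    else:
        print(local_name(content), [local_name(c) for c in content])
```

A page without the annotation makes content_annotation return None, which is 
exactly the point at which a search engine would have to fall back on 
presentation-to-content guessing.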


 Prof. Dr. Michael Kohlhase,       Office: Research 1, Room 62 
 Professor of Computer Science     Campus Ring 12, 
 School of Engineering & Science   D-28759 Bremen, Germany
 Jacobs University Bremen*         tel/fax: +49 421 200-3140/-493140 
 skype: m.kohlhase   * International University Bremen until Feb. 2007

Received on Sunday, 30 March 2008 05:39:07 UTC