RE: [mediaqueries] MathML

It would be nice to search on mathematical semantics, but searching on the textual representation is definitely useful and, IMHO, much more important. After all, that is pretty much all we have for searching plain text, and there is no reason to believe it would not also be effective for math. Math notation is essentially an abbreviation of the full sentences mathematicians used before the notation was invented. Like English, it is a written human language that follows similar rules.

I am not that hopeful about semantic math search. We can and should allow math semantics to be embedded in web content alongside the graphical representation of the notation, and when semantics is present in content, it should be available to search. The problem is that very few authors would ever add semantics to the math notation they write; it is just too hard for human authors typing math to get right. Even if they took the trouble, there would be a wide variety of semantic interpretations out in the world, making search difficult. Of course, software-generated mathematical content could make use of semantic markup, but that is likely to remain a small subset of written math notation for the foreseeable future. I doubt most math will be authored via computer algebra systems and the like.

On the other hand, textual math search can be very powerful and is well within our reach. I am not talking about just searching for literal expressions, but about a smart search that knows that '+' is virtually always a commutative operator and that can determine from context whether "5 in" means 5 inches or 5 x i x n. This level of processing is analogous to what Google search applies in its handling of word endings, plurals, misspellings, etc. Once this sort of math syntax handling is done and the large amount of math in web pages is made visible to search algorithms, standard "big data" techniques can be brought to bear on math search, gradually improving it over time.
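
As a rough illustration of the commutativity point, here is a minimal Python sketch of how a text-based index might normalize sums so that "x + y" and "y + x" produce the same search key. The function names (split_top_level, canonical_key) are hypothetical, and the sketch assumes '+' is always commutative and only reorders top-level terms; it is not how any existing search engine is known to work.

```python
def split_top_level(expr, op):
    """Split expr on the single-character op, skipping ops inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == op and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def canonical_key(expr):
    """Return a search key with top-level '+' operands in sorted order.

    Assumption: '+' is commutative, so operand order carries no meaning.
    Parenthesized sub-expressions are kept as written (no recursion).
    """
    compact = expr.replace(" ", "")
    terms = split_top_level(compact, "+")
    return "+".join(sorted(terms))


# Both spellings of the same sum map to one key.
print(canonical_key("y + x") == canonical_key("x + y"))
```

A fuller version would recurse into parentheses and treat other commutative operators the same way, but even this much lets a plain string index match reordered sums.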

Paul

> -----Original Message-----
> From: Florian Rivoal [mailto:florian@rivoal.net]
> Sent: Tuesday, October 04, 2016 4:41 PM
> To: Paul Topping <pault@dessci.com>
> Cc: Bill Kasdorf <bkasdorf@apexcovantage.com>; Tab Atkins Jr.
> <jackalmage@gmail.com>; Avneesh Singh <avneesh.sg@gmail.com>; www-
> style list <www-style@w3.org>; W3C Digital Publishing IG <public-digipub-
> ig@w3.org>; Peter Krautzberger <peter.krautzberger@mathjax.org>
> Subject: Re: [mediaqueries] MathML
> 
> 
> > On Oct 5, 2016, at 04:49, Paul Topping <pault@dessci.com> wrote:
> >
> > Right. And it is also useful for search based on an expression's
> mathematical structure.
> 
> Only partly, since the Presentation MathML markup tree structure does not
> match what the semantic tree structure would be. It is better than nothing,
> and things like https://github.com/mathjax/MathJax-a11y can make it better,
> but it doesn't seem ideal either.
> 
>  - Florian

Received on Wednesday, 5 October 2016 15:47:04 UTC