- From: Ian Hickson <ian@hixie.ch>
- Date: Tue, 1 Apr 2008 06:25:00 +0000 (UTC)
- To: Neil Soiffer <Neils@dessci.com>
- Cc: Bruce Miller <bruce.miller@nist.gov>, Sam Ruby <rubys@us.ibm.com>, Robert Miner <robertm@dessci.com>, Henri Sivonen <hsivonen@iki.fi>, David Carlisle <davidc@nag.co.uk>, public-html@w3.org, www-math@w3.org
On Mon, 31 Mar 2008, Neil Soiffer wrote:
>
> If you need any more examples of why parsing math is harder than it
> might seem at first blush, let me know. I know of probably a dozen off
> the top of my head and could probably double that without a whole lot of
> work.

Please do provide such examples, that would be exceedingly useful.

> One the consequences of the above rule is that content MathML will not
> be part of HTML5. Speaking for myself, I can live with that as that has
> been the case for Firefox for years and fits with the idea that users
> should supply style sheets or other means to specify how to present the
> content.
>
> One area that has been the focus of much discussion is semantics, et.
> all. I strongly recommend those tags be included.

I don't understand; the two paragraphs above seem to contradict each
other. Could you elaborate on what you mean when you say that not
including Content MathML is ok, and on what you mean when you say that
it is important that we include semantics?

> There have been theoretical arguments that it allows data to be out of
> sync, but practice has shown that this is a minor concern at best.

On the contrary, experience with the Web has shown that including
redundant data (e.g. accessibility metadata, page description metadata,
and so forth) is actively harmful, as it is almost always out of sync
with the data seen by most users. It is also the case that most people
wouldn't know it was available.

I would imagine that a much better and more productive way to provide
Content MathML to users would be to include the Presentational MathML
inline, and then have links for users to download separate MathML files
containing the Content MathML.

> As another data point, Mozilla's implementation of MathML initially left
> off semantics -- this caused most MathML to fail in Mozilla because most
> MathML is generated by program, not by hand and most programs use that.
> Its omission was an oversight, due to semantics not be listed in the
> presentation chapter. It was added in and now Firefox happily accepts
> semantics.

When you say it "accepts" it, do you mean it ignores it? What would it
mean for the HTML5 language to "support" semantics? Given that every
element supported must be explicitly handled, would it mean including
support for all 140+ Content MathML elements explicitly in the parser?

> The cost of supporting semantics is minimal

Depending on what you mean by "supporting semantics", the cost may be
far from minimal.

> and I hope you consider it part of "Classic MathML" as it occurs in the
> majority presentation MathML on the web.

Do you have any precise numbers on this? It would be interesting to
study this in more detail. (I did an ad hoc survey of half a dozen pages
containing MathML collected mostly at random by people who did not know
what the pages were to be used for, and my results strongly suggested
that on the contrary, most pages that contain MathML only contain the
Presentational MathML variant, and no <semantics> element nor Content
MathML. However, this sample is far from fair.)

> One thing to note in your above example is that you have used two named
> entities. I believe that these have been ruled out for HTML5.

Nothing has been ruled out.

> The lack of such named entities will make it much tougher to hand author
> math (in any form) in HTML5.

Yes, I think we would probably want to include them. I understand there
is some issue with φ, though.

> One unfortunate thing about the discussion on hand authoring is that it
> has mostly been devoid of facts. Some *facts* on percentages of
> hand-authored vs machine-authored HTML should be part of a reasoned
> discussion, but sadly neither side has produced any such facts.

Indeed. Unfortunately it isn't clear how to collect such information.
My experience has been that many pages are in fact hand-authored, either
directly in a text editor, or through CMS systems that provide raw HTML
editors, or through templates that are hand edited. I do not think we
can forgo addressing the needs of hand-authoring content creators.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 1 April 2008 06:25:45 UTC