- From: David Carlisle <davidc@nag.co.uk>
- Date: Tue, 2 May 2000 16:23:46 +0100 (BST)
- To: rminer@geomtech.com
- CC: www-math@w3.org, rbs@maths.uq.edu.au
> If a MathML processor is
> getting a parse tree from the DOM, then the extra <mchar> nodes are
> more expensive than entities which resolve to character data.

Yes, this is certainly true, as you get more nodes (although doesn't the DOM produce a node for entity references as well?). But I think that the same would be true of any syntax other than entity references or character data.

> A statistical profile of the density of characters needing to
> be accessed through <mchar> in a typical document would be very
> useful.

It could, in some areas, be the majority of the character data in the MathML expression (but probably not the majority in a typical document consisting of text interspersed with mathematics). So the cost is real and not negligible, I would say.

However, the only real alternative if we get rid of mchar is to deprecate entity references and use instead Unicode character data or numeric character references. Technically this works well, but it removes any remaining pretence that it is possible to hand-write or read MathML without MathML tools. This is another real cost, perhaps harder to quantify.

Or we don't deprecate entity references. The schema issue isn't a total block on using entities, as you can still use <!DOCTYPE whether or not you in addition specify a schema. The harder issue is ensuring that MathML fragments remain well formed. That means ensuring that MathML that is "cut and pasted" from one place to another either has any entities expanded to character data as the expression is "cut", or the document into which the fragment is pasted has its DTD modified to include the MathML entity declarations. (It may be that this is automatic, as the original entities will have been expanded by the XML parser and so are not there to be cut, but I'd like to be sure of this :-)

All of these are, I think, viable alternatives, but all of them have some nasty side effects.
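As a rough check of the two points above (how many nodes each syntax costs, and whether entities survive a "cut"), here is a minimal sketch using Python's standard xml.dom.minidom, which sits on top of expat. The sample markup, the one-entity DTD subset and the helper names (child_count, cut_fragment, is_well_formed) are assumptions made purely for illustration. Other DOM implementations may keep EntityReference nodes instead of expanding them, so this shows one parser's behaviour, not a general guarantee.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical internal DTD subset declaring one MathML-style entity.
DOCTYPE = '<!DOCTYPE math [<!ENTITY alpha "&#x3B1;">]>'

# The same content written three ways.
WITH_MCHAR = '<math><mtext>a <mchar name="alpha"/> b</mtext></math>'
WITH_ENTITY = DOCTYPE + '<math><mtext>a &alpha; b</mtext></math>'
WITH_NCR = '<math><mtext>a &#x3B1; b</mtext></math>'


def child_count(xml_source):
    """Number of DOM child nodes under the first <mtext> element."""
    doc = minidom.parseString(xml_source)
    return len(doc.getElementsByTagName("mtext")[0].childNodes)


def cut_fragment(xml_source, tag):
    """Parse a document and serialize its first <tag> element on its own."""
    doc = minidom.parseString(xml_source)
    return doc.getElementsByTagName(tag)[0].toxml()


def is_well_formed(xml_source):
    """True if the source parses; an undeclared entity makes it fail."""
    try:
        minidom.parseString(xml_source)
        return True
    except ExpatError:
        return False


if __name__ == "__main__":
    for label, src in [("mchar", WITH_MCHAR),
                       ("entity", WITH_ENTITY),
                       ("numeric reference", WITH_NCR)]:
        print(label, child_count(src))

    # "Cut" via the DOM: the entity has already been expanded by the parser.
    cut = cut_fragment(DOCTYPE + "<math><mi>&alpha;</mi></math>", "mi")
    print(is_well_formed("<p>" + cut + "</p>"))

    # "Cut" via the raw source text: the entity reference comes along,
    # and the target document has no declaration for it.
    print(is_well_formed("<p><mi>&alpha;</mi></p>"))
```

With this parser the mchar form yields extra nodes (text, element, text), while the entity and numeric-reference forms collapse into a single text node. A fragment cut from the parsed tree carries plain character data, whereas one cut from the source text still needs the entity declarations in the target document.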
Currently I think I'm still happiest with mchar, although I haven't implemented mchar at all yet in my own TeX-based MathML renderer, as it would be a pain to implement, needing to duplicate much of the support for Unicode character data :-) If, however, the implementation costs of mchar turn out to be too high, I agree that we might need to reconsider this.

David
Received on Tuesday, 2 May 2000 11:26:55 UTC