Re: italic dotless letters from David Carlisle on 2007-08-21 (www-math@w3.org from August 2007)

From: David Carlisle <davidc@nag.co.uk>
Date: Tue, 21 Aug 2007 22:21:15 +0100
To: R.W.Kaye@bham.ac.uk
Cc: lpadovan@cs.unibo.it, www-math@w3.org
Message-Id: <200708212121.l7LLLF3B011794@edinburgh.nag.co.uk>
> in the MathML entities.  The MathML DTDs provide a long
> list of character entities and it's the (few but some) errors
> here that make me concerned about their status.

yes it's worrying about bad side effects that makes editing the dtd so
stressful, and means that most changes are out of scope, but in the case
where new characters have been added by Unicode specifically to support
the entities it would seem odd not to change the entity to use them.

take &jnodot; for example, that's defined to be j with a dot. It's hard
to imagine that any author or authoring system really went to the
trouble to use the entity if they really wanted a j with a dot, so
although the document (might) render differently it is hard to see this
as anything other than a fix for a previously (necessarily) broken feature.
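
To make the difference concrete, here is a minimal sketch that prints the
Unicode names of the two characters involved. It assumes the old definition
is plain U+006A (the "j with a dot" described above) and that the
replacement would be U+0237, the dotless j; which code point the entity
finally maps to is an assumption here, not something this message settles.

```python
import unicodedata

# Assumption: the old jnodot definition is plain "j", as described above.
OLD_JNODOT = "\u006A"
# Assumption: the candidate replacement is the dotless j at U+0237.
NEW_JNODOT = "\u0237"


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(OLD_JNODOT))  # U+006A LATIN SMALL LETTER J
print(describe(NEW_JNODOT))  # U+0237 LATIN SMALL LETTER DOTLESS J
```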

that case is relatively "easy" but there are harder issues, for
example as part of the work coming from the STIX font project there has
been some attempt to make more consistently sized sets of
small/medium/large geometric shaped operators, square, diamond, circle
etc. It's not at all clear whether extra consistency in nominal default
sizing is worth the pain of changing existing mappings in these cases.
Although perhaps it is.

> Secondly, I would like to use these names in a MathML-authoring
> application.  Rather than re-typing everything, what source of
> data should I use?  There are two versions of "unicode.xml" on
> the w3 web site at 
>  http://www.w3.org/Math/characters/unicode.xml
>  http://www.w3.org/2003/entities/xml/unicode.xml

currently the one in 2003/entities is the best but I have an updated
version which includes _all_ the data from the Unicode 5.0 UnicodeData
file. It's being developed as part of the MathML3 draft but as you'll see
the current draft doesn't have updated tables. I hope to get it out
for the next draft. If you need it urgently I could let you have a copy
but it's probably misleading at present as it isn't currently updated
for (eg) dotless j. I had also hoped that I'd be able to do a pass over the
final stix data when we knew what that was but that has slipped again.
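
For an authoring application, a name table could be pulled out of such a
file along these lines. This is only a sketch: the element and attribute
names (character, entity, id, dec, set) and the tiny inline sample are
assumptions about the file's shape, written by hand for illustration, and
should be checked against the actual unicode.xml before relying on them.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in an ASSUMED shape; not copied from the real file.
SAMPLE = """
<unicode>
  <charlist>
    <character id="U00021" dec="33">
      <entity id="excl" set="example-set"/>
    </character>
    <character id="U00131" dec="305">
      <entity id="inodot" set="example-set"/>
    </character>
  </charlist>
</unicode>
"""


def entity_table(xml_text: str) -> dict:
    """Map each entity name to the code point of its enclosing character."""
    table = {}
    root = ET.fromstring(xml_text)
    for character in root.iter("character"):
        # Assumes 'dec' holds a single decimal code point.
        codepoint = int(character.get("dec"))
        for entity in character.findall("entity"):
            table[entity.get("id")] = codepoint
    return table


names = entity_table(SAMPLE)
for name, codepoint in sorted(names.items()):
    print(f"{name} -> U+{codepoint:04X}")
```

Entries that stand for more than one character would need extra handling
that this sketch leaves out.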

> But actually I'm not even convinced I want MathML to provide a
> long list of character entity names.

In almost all scenarios it's better if the authoring environment offers
the characters by name or symbolically but just writes the characters
directly or as numeric references. Writing them as entity refs means you
have to have a DTD, which makes processing fragments hard and slows down
processing of documents. But still the DTD has to define them (I don't
think it's acceptable that "updating" a mathml2 document to mathml3 by
changing the doctype reference to mathml3 makes the document not well
formed as the entity definitions have gone). And if they are there they
need to point to "good" characters, both for internal consistency and
because, as you say, the same names are likely to be used in menus and other
help systems in authoring environments.
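
As a rough illustration of the point about fragments, the sketch below
shows a fragment with a named reference failing to parse without a DTD,
and the same fragment parsing once the name is written as a numeric
reference. The name table and the to_numeric_refs helper are hypothetical,
made up for this example; a real tool would load the full entity list from
a data file.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical name table, for illustration only.
ENTITY_TO_CODEPOINT = {"jnodot": 0x0237, "inodot": 0x0131}

ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")
XML_BUILTINS = {"amp", "lt", "gt", "quot", "apos"}


def to_numeric_refs(text: str) -> str:
    """Rewrite known named entity references as hex numeric references."""
    def replace(match):
        name = match.group(1)
        if name in XML_BUILTINS or name not in ENTITY_TO_CODEPOINT:
            return match.group(0)  # leave predefined or unknown names alone
        return f"&#x{ENTITY_TO_CODEPOINT[name]:X};"
    return ENTITY_REF.sub(replace, text)


fragment = "<mi>&jnodot;</mi>"

# Without a DTD the named reference is undefined, so the parse fails.
try:
    ET.fromstring(fragment)
    parsed_without_dtd = True
except ET.ParseError:
    parsed_without_dtd = False

# The numeric form needs no DTD at all.
converted = to_numeric_refs(fragment)
element = ET.fromstring(converted)
print(parsed_without_dtd, converted, f"U+{ord(element.text):04X}")
```

The authoring environment can still show the names in its menus; only the
serialized output changes.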

>  If names are a good thing, 
> perhaps they could be selected as an option (with different sets 
> of names to choose from depending on user's taste)?

A mathml instance can always point to a different set of names, but for
the reasons above I currently am proposing that mathml3 have _exactly_
the same set of names as mathml2 (but with some small definition
changes) despite the fact that that set of names is a rather uneasy mix
of styles with short cryptic iso names, longer lower case TeX names,
and camel cased mathematica names all mixed together.


David




Received on Tuesday, 21 August 2007 21:21:27 UTC