Re: italic dotless letters from Richard Kaye on 2007-08-22 (www-math@w3.org from August 2007)

From: Richard Kaye <R.W.Kaye@bham.ac.uk>
Date: Wed, 22 Aug 2007 13:44:41 +0100 (BST)
To: "David Carlisle" <davidc@nag.co.uk>
Cc: r.w.kaye@bham.ac.uk, lpadovan@cs.unibo.it, www-math@w3.org
Message-ID: <1372.80.195.165.6.1187786681.squirrel@web.mat.bham.ac.uk>
>
>
>> in the MathML entities.  The MathML DTDs provide a long
>> list of character entities and it's the (few but some) errors
>> here that make me concerned about their status.
>
> yes it's worrying about bad side effects that make editing the dtd so
> stressful,

Agreed. (Very much so.)

> and mean that most changes are out of scope, but in the case
> where new characters have been added by Unicode specifically to support
> the entities it would seem odd not to change the entity to use them

Sure.  Odd.. but a stated policy is required.

> take &jnodot; for example, that's defined to be j with a dot. It's hard
> to imagine that any author or authoring system really went to the
> trouble to use the entity if they really wanted a j with a dot, so
> although the document (might) render differently it is hard to see this
> as anything other than a fix for a previously (necessarily) broken
> feature.

I was almost convinced by this argument.  But then I pictured
a document that displays correctly until the day the MathML DTD
is "fixed" and suddenly all those j's become big red ?'s because
the glyph at the new codepoint is missing on the user's machine.
The user may not even have been aware there had been an update in
the DTD. So a change in these entities could mean something more
significant than one kind of j replaced by another.  But I am not
saying that MathML3 shouldn't fix bugs in MathML2.  Just that MathML2
shouldn't be fixed without a new version number and the changes clearly
documented.
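
One way to make such changes visible is to compare the entity tables of two
DTD versions mechanically. The following is only a minimal sketch: the two
tables are hypothetical two-entry excerpts rather than real DTD data, and the
function name is invented for illustration. It assumes the old definition of
jnodot is plain j (U+006A), as described above, and that the fix would point
it at U+0237 (LATIN SMALL LETTER DOTLESS J).

```python
def changed_entities(old, new):
    """Return {name: (old_cp, new_cp)} for names defined in both tables
    whose code point differs between them."""
    return {
        name: (old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    }


# Hypothetical excerpts of two entity tables (entity name -> code point).
MATHML2 = {"jnodot": 0x006A, "alpha": 0x03B1}
REVISED = {"jnodot": 0x0237, "alpha": 0x03B1}

for name, (before, after) in changed_entities(MATHML2, REVISED).items():
    print(f"&{name}; U+{before:04X} -> U+{after:04X}")
# prints: &jnodot; U+006A -> U+0237
```

A list like this, published with each new version, would tell authors exactly
which entities may start to render differently.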

>> Secondly, I would like to use these names in a MathML-authoring
>> application.  Rather than re-typing everything, what source of
>> data should I use?  There are two versions of "unicode.xml" on
>> the w3 web site at
>>  http://www.w3.org/Math/characters/unicode.xml
>>  http://www.w3.org/2003/entities/xml/unicode.xml
>
> currently the one in 2003/entities is the best but I have an updated
> version which includes _all_ the data from the Unicode 5.0 UnicodeData
> file. It's being developed as part of MathML3 draft but as you'll see
> the current draft doesn't have updated tables. I hope to get it out
> for the next draft,
Good
> If you need it urgently I could let you have a copy
> but it's probably misleading at present as it isn't currently updated
> for (eg) dotless j. I had also hoped that I'd be able to do a pass over
> the final STIX data when we knew what that was but that has slipped again.
No I don't think it is urgent. Especially if the list of names
remains the same. I was more concerned that the new MathML3 specs
should make the status and revisions policy (and their consequences)
quite clear, and if at all possible indicate the definitive data that
application developers can use.
>
>> But actually I'm not even convinced I want MathML to provide a
>> long list of character entity names.
>
> In almost all scenarios it's better if the authoring environment offers
> the characters by name or symbolically but just writes the characters
> directly or as numeric references.
Agreed.  (That is what mine does.)
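
As an illustration of that approach (a sketch only, not the code of any
particular application): the tool keeps a name table for its menus but writes
numeric character references to the file. The NAMES table here is a
hypothetical two-entry stand-in for data that would really be loaded from
unicode.xml, and the function name is invented for the example.

```python
import re

# Hypothetical name table (entity name -> code point).
NAMES = {"jnodot": 0x0237, "alpha": 0x03B1}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric(text):
    """Replace known named entity references with hexadecimal numeric
    character references; unknown names (e.g. &amp;) are left untouched."""
    def repl(match):
        code_point = NAMES.get(match.group(1))
        if code_point is None:
            return match.group(0)
        return f"&#x{code_point:04X};"
    return _ENTITY.sub(repl, text)


print(to_numeric("<mi>&jnodot;</mi><mo>&amp;</mo>"))
# prints: <mi>&#x0237;</mi><mo>&amp;</mo>
```

Output written this way can be parsed without reading any DTD.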

> Writing them as entity refs means you
> have to have a DTD which makes processing fragments hard and slows down
> processing of documents. But still the DTD has to define them (I don't
> think it's acceptable that "updating" a mathml2 document to mathml3 by
> changing the doctype reference to mathml3 makes the documents not well
> formed as the entity definitions have gone).
Actually when I wrote this, I was wondering if it could be a module
in the MathML3 DTD that can be included or omitted as required.
Now that I look at the MathML2.0 DTD I realise this is already there
as a (barely documented) feature, using %mathml-charent-module;.
So I've learnt something! :)

> And if they are there they
> need to point to "good" characters, both for internal consistency but
> also as you say the same names are likely to be used in menus and other
> help systems in authoring environments.
>
>>  If names are a good thing,
>> perhaps they could be selected as an option (with different sets
>> of names to choose from depending on user's taste)?
>
> A mathml instance can always point to a different set of names, but for
> the reasons above I currently am proposing that mathml3 have _exactly_
> the same set of names as mathml2 (but with some small definition
> changes) despite the fact that that set of names is a rather uneasy mix
> of styles with short cryptic iso names, longer lower case TeX names,
> and camel cased mathematica names all mixed together.

Yes, OK.  I think you are right.  Especially now that I have realised how
to omit them or replace them with a different set.

Richard
Received on Wednesday, 22 August 2007 12:44:47 UTC