Re: update to xml entities draft

Murray Sargent <murrays@exchange.microsoft.com> writes:

> David Carlisle wrote that one could make definitions like
>
>>U+2102 DOUBLE-STRUCK CAPITAL C = Complex numbers
>>
>>Leaving U+1D53A free to be defined as a part of a generic
>> alphabetic run as
>>MATHEMATICAL DOUBLE-STRUCK CAPITAL C

I hope we're clear that just because at some point in
history the glyph that might have been used for U+1D53A had
been used for U+2102 does not mean that the *character*
named "DOUBLE-STRUCK CAPITAL C" would be the same as a
*character* in the alphabetic series "MATHEMATICAL
DOUBLE-STRUCK CAPITAL *".

Just to drive home this point, let me note that my (slightly
old) copy of the standard annotates the characters in the
U+21xx block that correspond to the vacant slots, and those
annotations indicate dedicated mathematical meanings.

The characters in the plane 1 mathematical alphabetic series
do not have annotations indicating dedicated meanings.

There is no need to change the names in U+21xx, since they
already differ from what the names of the vacant slots in the
mathematical alphabetic series should be.
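
To make the name difference concrete, here is a minimal Python
sketch. It assumes the Unicode database bundled with the
interpreter still leaves U+1D53A unassigned; the helper name
describe() is made up for this illustration.

```python
import unicodedata


def describe(cp):
    """Return the character name for a code point, or a marker if it has none."""
    # unicodedata.name() returns the supplied default for unassigned code points.
    return unicodedata.name(chr(cp), "<unassigned>")


if __name__ == "__main__":
    # Start of the plane 1 double-struck capital run, then the U+21xx character.
    for cp in (0x1D538, 0x1D539, 0x1D53A, 0x1D53B, 0x2102):
        print(f"U+{cp:04X}  {describe(cp)}")
```

The U+2102 name carries no "MATHEMATICAL" prefix, so a name of
the form used by the rest of the run is not taken by it.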

> One can't change the definitions of the math alphanumerics
> now since they are already encoded and Unicode has a
> stability guarantee.

Not necessary.

> In addition they are widely used in technical documents as
> defined. We might have been able to get away with such
> definitions before the math alphanumerics were added to
> the Unicode Standard 3.1 back in March, 2001.

My suggestion is simply that the vacant slots be filled.  That
won't break anything.  Of course, it might then take maybe a
decade before anyone could rely on them.

> For Microsoft Office apps, I wrote routines to work around
> the separation of the math alphabetics into the Letterlike
> Symbols and math alphanumerics blocks and it's complicated
> and even error prone. So I really wish that we had done
> something along the lines David suggests. But it's clearly
> water over the dam at this point.

Yes, *complicated and error prone*.
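
As a rough illustration of the kind of workaround involved
(not the Office routines, which I have not seen), here is a
hedged Python sketch for the double-struck capitals only. The
function name and table name are hypothetical; the code points
are those of the current standard.

```python
# First character of the plane 1 run: MATHEMATICAL DOUBLE-STRUCK CAPITAL A.
DOUBLE_STRUCK_CAPITAL_A = 0x1D538

# Letters whose slot in the plane 1 run is vacant, mapped to the
# Letterlike Symbols character that has to be used instead.
LETTERLIKE_EXCEPTIONS = {
    "C": 0x2102,
    "H": 0x210D,
    "N": 0x2115,
    "P": 0x2119,
    "Q": 0x211A,
    "R": 0x211D,
    "Z": 0x2124,
}


def double_struck_capital(letter):
    """Map an ASCII capital letter to its double-struck counterpart."""
    if len(letter) != 1 or not ("A" <= letter <= "Z"):
        raise ValueError("expected a single ASCII capital letter")
    # Special case: the slot is vacant, so leave the contiguous run.
    if letter in LETTERLIKE_EXCEPTIONS:
        return chr(LETTERLIKE_EXCEPTIONS[letter])
    # General case: simple offset arithmetic within the run.
    return chr(DOUBLE_STRUCK_CAPITAL_A + ord(letter) - ord("A"))
```

If the vacant slots were filled, the exception table could in
principle go away and only the offset arithmetic would remain.
The other alphabets with vacant slots (script, fraktur, and
italic small h) each need a similar table.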

> ... +Asmus and Michel in case they want to defend Unicode's
> position of not duplicating characters. I'd argue that
> simplicity of implementation should play an important role
> in this regard. ...

Again, filling in the slots might generate redundant glyph
references, depending on glyph choices for the various
characters, but would not duplicate characters.

And all routes I can imagine for user-generated documents,
at least in most of Europe, Australia, North America, and
South America, involve complicated code such as you describe.

                                    -- Bill


#mathml #html5 #unicode #unigaps #xhtmlTooComplicated

Received on Thursday, 25 June 2015 17:53:55 UTC