- From: Deyan Ginev <deyan.ginev@gmail.com>
- Date: Tue, 13 Jul 2021 23:37:50 -0400
- To: Susan Jolly <easjolly@ix.netcom.com>
- Cc: Neil Soiffer <soiffer@alum.mit.edu>, David Farmer <farmer@aimath.org>, Sam Dooley <samdooley64@gmail.com>, "Hammond, William F" <whammond@albany.edu>, "Noble, Stephen" <steve.noble@pearson.com>, Murray Sargent <murrays@exchange.microsoft.com>, Louis Maher <ljmaher03@outlook.com>, www-math@w3.org
On Tue, Jul 13, 2021 at 10:09 PM Susan Jolly <easjolly@ix.netcom.com> wrote:
>
> Hi Deyan,

Hi Susan,

> I'm confused by what you wrote.

Apologies for the confusion. Assume that I write as a casual user of Unicode who has not been involved in any of the efforts you're describing here. I will try to improve my use of terminology, so please do correct me when I stray into misusing terms.

> My understanding of Unicode is that it distinguishes characters from
> glyphs and that a great deal of effort has gone into creating the
> Unicode set of over 100,000 unique characters. Characters in Unicode
> are distinguished by their numerical character codes, not by their
> visual appearance. Unicode decided back in 1993 that a colon
> punctuation mark and the mathematical symbol for ratio are two
> different characters. If my understanding up to this point is
> incorrect, please correct me.

I am no expert on the history, but the final outcome of over 100,000 unique characters is indeed something I am aware of. The great deal of effort that went into devising these characters has now been gracefully passed down to the developers who have to expect arbitrary Unicode inputs in their applications. Sometimes for good reason - sometimes one wonders.

[Aside] Actually, 1993 was likely when I wrote my first mathematical colon, as I must have been in first grade. And I see that to this day it is taught to primary school students as our preferred division sign. Here's a Bulgarian Khan Academy video to illustrate that:

https://youtu.be/d_Q8xICTFpQ?t=35

So add "divides" as the tenth notation in my list above.

> It is also my understanding that characters are displayed visually by
> glyphs, with the Unicode tables providing a typical or reference glyph
> for each character. However, the visual appearance of a given character
> is not going to be identical in all fonts.

Certainly.

> The use of Unicode character codes aids in the automatic translation
> of math to braille. Of course a given braille system cannot define
> easy-to-remember braille symbols for all of the Unicode characters, so
> it needs some method for dealing with this issue.

It certainly *could* aid that. But the world is not necessarily perfectly encoded in the correct Unicode characters; the many example notations above were meant to illustrate that. I can use the regular colon (in other words, U+003A) to encode all nine of the distinct mathematical notations above, and a reader of a web page would have no problem understanding what is written - all courtesy of the textual context surrounding the expressions, which usually suffices to obtain clarity.

My perspective is entirely that of an external onlooker here, but also of someone who wants to garden 700 million mathematical expressions from arXiv.org. And in the case of arXiv, they need to be tackled in the way people wrote them, from 1991 until today. If people used ASCII colons for their ratios, it is a lot more manageable to expect the AT tools to pass on a generic "colon" character to their readers as a baseline expectation, and then to let ratios be inferred by the reader based on context. At least I consider that a better design choice than trying to guess heuristically where ratios are to be inserted, using U+2236, and ending up with e.g. "the time is 14-to-10" for "the time is 14:10". Quite amusingly, "14-to-10" is a valid reading of a time in English, but it encodes a completely different minute, the one at 9:46. Imagine the poor reader who has to debug that mistake (hopefully they don't work at a transport station).

And then, if a willing author is ready to remediate, I would prefer that they annotate in natural language the mathematical concept they intended. Because while they can use U+2236 to explicitly designate "ratio" in Unicode, there is no character for "such-that", "coordinate-separator", "typing-judgement", "namespace-separator", "ruby-symbol", and so on.
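To make the colon/ratio distinction concrete, here is a minimal Python sketch using the standard-library unicodedata module (nothing here is specific to any math or braille tooling; it simply queries the Unicode character database that ships with CPython):

```python
import unicodedata

# U+003A and U+2236 are distinct code points that render almost
# identically in many fonts, yet carry different intended meanings.
for ch in ["\u003a", "\u2236"]:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
# U+003A  COLON
# U+2236  RATIO

# The lookup also works in reverse, from a memorable formal name
# back to the character itself.
assert unicodedata.lookup("RATIO") == "\u2236"
```

Any heuristic that rewrites U+003A to U+2236 would have to distinguish the time "14:10" from a genuine ratio, which is exactly the context-dependence I am describing above.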
I am also looking forward to the new uses of colons that are yet to be invented in 2021 and beyond. To reuse an expression Sam Dooley threw my way when I joined the CG, the effort of enumerating all possible uses of the same visual glyph (say, by introducing a new Unicode character for each meaning) is akin to "trying to boil the ocean".

> One possibility is direct representation of hexadecimal character
> codes.

Is the burden on memory feasible, to remember a list as large as Unicode? I certainly like the ability to know the exact code - I have an active plugin in VSCode that shows the code points of the characters under my cursor in the editor status bar. But I would assume an unsuspecting user who stumbles on, say, "𝕬" would be a lot more capable of mentally working with "MATHEMATICAL BOLD FRAKTUR CAPITAL A" than of tackling U+1D56C in isolation. Of course, if there is a convenient interface for easily switching between the code and the name/description of a character, starting with the hexadecimals could be workable, as one will remember them when they're frequent and important, and keep jumping back to their text descriptions when they aren't. That said, thinking in hexadecimals is also a rather unusual burden; it may take quite some practice to do well.

I will keep thinking as I continue learning about braille and reading the rest of the replies.

Greetings,
Deyan

> Susan J.
Received on Wednesday, 14 July 2021 03:38:29 UTC