- From: William F Hammond <hammond@csc.albany.edu>
- Date: Thu, 02 Jul 2015 14:11:02 -0400
- To: www-math@w3.org
- Cc: "Asmus Freytag (t)" <asmus-inc@ix.netcom.com>, Murray Sargent <murrays@exchange.microsoft.com>, David Carlisle <davidc@nag.co.uk>, Michel Suignard <michel@suignard.com>
On 20150627 at 143534-0700 "Asmus Freytag (t)" writes:

> Unicode generally does not encode characters by usage. For
> example there's no distinction between period, decimal
> point, abbreviation point etc.. This reflects the underlying
> situation, to wit, that this is a case of the *same* symbol
> being used in different conventions.
>
> The downside is that it is thus not possible to use plain
> text to capture which convention is intended (but nothing
> prevents anyone from providing rich-text markup). The upside
> is that data can't exhibit "random alternation" between
> identical looking symbols; experience has shown that this is
> a most likely outcome if "the same" item is encoded several
> times, based merely on convention.

Period, decimal point, abbreviation point: three different names and
three different concepts commonly sharing the same symbol though not
necessarily the same left and right spacing. As a point of argument
(but not a request) they *should* be three different characters.
Absent that, the typesetter with a proportional font must use various
conventions, not completely reliable, to guess the spacing.

Of course, commonly the user will be oblivious of these differences
and the user's keyboard will have only one of these. But the astute
user may want to be able to make distinctions.

The distinctions can be made available, for example, in rich text, as
you observe, in SGML, or in LaTeX.

With a given oblivious user and a given typesetting suite random
alternation will not occur. Other than for searching I fail to see
why random alternation should be a problem. Are there other problems
associated with random alternation?

As to mathematical searching, searching for mathematical symbols is
an order of magnitude more complicated than searching for text, e.g.,
multi-character math symbols, things like phi vs varphi, ..., so the
small number of possible alternations (at most 256, the size of the
U+21xx block, actually quite a few less than that) should not add
much complexity to code for mathematical symbol searching.

-- Bill
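
As a rough illustration of the searching point, one way a search tool might absorb such alternations is to fold both the query and the text to a common form before comparing. The sketch below is not taken from this thread; it assumes that Unicode NFKC normalization is an acceptable fold, and the helper names `fold` and `contains_symbol` are hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    # Assumption: NFKC is used as the folding rule. It maps
    # compatibility variants onto a base character, for example
    # U+210E PLANCK CONSTANT -> "h" and
    # U+03D5 GREEK PHI SYMBOL -> U+03C6 GREEK SMALL LETTER PHI.
    return unicodedata.normalize("NFKC", text)


def contains_symbol(haystack: str, needle: str) -> bool:
    # Compare folded forms so that an alternation between
    # look-alike encodings does not cause a missed match.
    return fold(needle) in fold(haystack)


if __name__ == "__main__":
    # Planck constant (U+210E) found by searching for plain "h".
    print(contains_symbol("E = \u210e\u03bd", "h"))
    # Phi symbol (U+03D5) found by searching for U+03C6.
    print(contains_symbol("\u03d5(x)", "\u03c6"))
```

Both calls print `True`. The fold is deliberately lossy: it also erases distinctions a reader may care about, such as double-struck R (U+211D) versus plain R, so a real tool would probably want a narrower, hand-chosen table of alternations.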
Received on Thursday, 2 July 2015 18:13:34 UTC