Re: ixampl goes Unicode from Liam R. E. Quin on 2022-08-18 (public-ixml@w3.org from August 2022)

From: Liam R. E. Quin <liam@fromoldbooks.org>
Date: Thu, 18 Aug 2022 16:20:27 -0400
To: Steven Pemberton <steven.pemberton@cwi.nl>, ixml <public-ixml@w3.org>
Message-ID: <23203006d882459e1c9de48baa07438f23b18d8d.camel@fromoldbooks.org>

On Thu, 2022-08-18 at 18:12 +0000, Steven Pemberton wrote:
> > It is now live.
> > I haven't yet updated the Unicode character classes though.
> Well, I'm slowly adding them, with the priority being classes L and
> Mn which are both used in the ixml grammar.
> 
> What a pain though! It's as if the Unicode design committee put no
> thought into it at all. For instance c0-ff are all letters EXCEPT
> they've stuck the multiply sign × in the middle, and the divide sign
> ÷ somewhere else in the middle.
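
For what it's worth, the holes in a range can be listed mechanically
rather than by eye. A minimal sketch in Python, assuming the standard
unicodedata module is an acceptable source of general categories; the
helper name non_letters is made up for this example and has nothing to
do with ixampl itself:

```python
import unicodedata

def non_letters(start, end):
    """Return code points in [start, end] whose general category is not L*."""
    return [cp for cp in range(start, end + 1)
            if not unicodedata.category(chr(cp)).startswith("L")]

# Print the exceptions in the Latin-1 letter range c0-ff.
for cp in non_letters(0xC0, 0xFF):
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch)} {unicodedata.category(ch)}")
```

The same helper can be pointed at any other range when building up a
character class by hand.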

The story (possibly apocryphal) was that in the final vote for ISO
8859, a claim was made that Œ and œ were not needed by any official
language, and that these should be replaced by × and ÷ to go with plus
and minus. This, it's said, was the Belgian representative being
antagonistic towards the French, who weren't present at the meeting.

Since œ is also used in English, I suspect it's apocryphal.

> And then the Roman alphabet (in ASCII) has the lowercase letters in
> one range, and the upper case in another.

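This was for bit-twiddling purposes: in ASCII the upper and lower case
forms of a letter differ only in one bit (0x20), so changing case is a
single bit operation.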
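
A small sketch of the trick in Python, for illustration only (the
function name toggle_case is invented here): flipping bit 0x20 swaps
the case of an ASCII letter, and anything else is left alone.

```python
def toggle_case(ch):
    """Swap the case of a single ASCII letter by flipping bit 0x20."""
    if ch.isascii() and ch.isalpha():
        return chr(ord(ch) ^ 0x20)
    # Non-letters and non-ASCII characters are returned unchanged.
    return ch

print("".join(toggle_case(c) for c in "ixml Grammar"))
```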

>  But the Latin range 100-17E has them alternating (upper, lower)*
> EXCEPT at #138 they stick an orphaned character, and then at #149
> they do it again.

Once you get beyond the original US ASCII 7-bit range, the bit-
twiddling no longer applies.  It's better than EBCDIC in which a-z are
not contiguous :)
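
Both points can be checked with a few lines of Python. This is only a
sketch: the function names are made up, and it assumes Python's
built-in cp037 codec is a fair stand-in for "EBCDIC" (there are many
EBCDIC code pages). The first function reports code points whose
general category is the same as the previous one's, which is where an
(upper, lower)* pattern breaks; the second reports neighbouring
letters of a-z whose EBCDIC byte values are not consecutive.

```python
import unicodedata

def alternation_breaks(start, end):
    """Code points whose general category repeats that of the previous one."""
    cats = {cp: unicodedata.category(chr(cp)) for cp in range(start, end + 1)}
    return [cp for cp in range(start + 1, end + 1) if cats[cp] == cats[cp - 1]]

def ebcdic_gaps(codec="cp037"):
    """Adjacent letters of a-z whose encoded byte values are not consecutive."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    codes = [ch.encode(codec)[0] for ch in letters]
    return [(letters[i], letters[i + 1])
            for i in range(len(letters) - 1)
            if codes[i + 1] != codes[i] + 1]

# Where the alternation in 100-17E breaks.
print([f"U+{cp:04X}" for cp in alternation_breaks(0x100, 0x17E)])
# Where a-z stops being contiguous in EBCDIC.
print(ebcdic_gaps())
```

The orphans at #138 and #149 should show up in the first list.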

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org

Received on Thursday, 18 August 2022 20:22:12 UTC