- From: <asmusf@ix.netcom.com>
- Date: Mon, 27 May 2024 16:57:13 -0700
- To: Addison Phillips <addisoni18n@gmail.com>, 'Mark Davis Ⓤ' <mark@unicode.org>, 'Markus Scherer' <markus.icu@gmail.com>
- Cc: petercon@unicode.org, craig@unicode.org, asmus@unicode.org, 'Ken Whistler' <kenwhistler@sonic.net>, public-i18n-core@w3.org, 'Florian Rivoal' <florian@rivoal.net>, fantasai@inkedblade.net, unicoRe@unicode.org
- Message-ID: <8c8061e7-fa9e-4022-bade-66dd660ea158@localhost>
A UCD property that tries to express that something is or isn't "true" punctuation would require two things:

1. Some definition / criteria that can be applied universally, including new characters.
2. A definition that captures something that is valid outside the narrow use case of emphasis marks.

Failing the latter, this could come under the scope of CLDR. We already had that discussion and that was the tentative conclusion.

A./

On Sun May 26 08:33:56 PDT 2024 Addison Phillips wrote:

Thanks Markus and Mark.

I did not expect that there would actually be a change to the GC (which is why I wrote the sentence in the circuitous way that I did). However, to Mark’s point, the effect needed is similar to “NotReallyPunctuation”.

My real question is: how do we get this on the agenda for the UTC in a future version of Unicode? And how do we track it? Do we need to produce some kind of proposal? Periodic emails are probably not the answer…

Addison

From: Mark Davis Ⓤ <mark@unicode.org>
Sent: Friday, May 24, 2024 2:28 PM
To: Markus Scherer <markus.icu@gmail.com>
Cc: Addison Phillips <addisoni18n@gmail.com>; petercon@unicode.org; craig@unicode.org; asmus@unicode.org; Ken Whistler <kenwhistler@sonic.net>; public-i18n-core@w3.org; Florian Rivoal <florian@rivoal.net>; fantasai@inkedblade.net; unicoRe@unicode.org
Subject: Re: Emphasis skip property? [W3C I18N Action #99]

That may sound like a joke, and the name I mentioned certainly is, but we have done similar things before to address issues where compatibility constraints came into play.

On Fri, May 24, 2024 at 2:21 PM Mark Davis Ⓤ <mark@unicode.org> wrote:

> I proposed changing some of the obvious errors in Po some years ago, but there was concern about disruption. I suppose we could have a NotReallyPunctuation property...
>
> On Fri, May 24, 2024 at 12:12 PM Markus Scherer <markus.icu@gmail.com> wrote:
>
> > On Fri, May 24, 2024 at 11:28 AM Addison Phillips <addisoni18n@gmail.com> wrote:
> >
> > > The issue pertains to the use of emphasis marks (e.g. Japanese bouten). It is customary to skip punctuation characters in these emphasis systems. See [2] and [3] below for specific text (where there is a list of symbols affected).
> > >
> > > CSS found that the Unicode general categories don’t align nicely with which characters to skip. W3C doesn’t want to maintain the list of characters to skip/not skip: it would probably make more sense for Unicode to maintain it. Participants speculate that this might be achieved by splitting a general category or via some Unicode property (or some other mechanism).
> >
> > Splitting a general category is verboten.
> >
> > https://www.unicode.org/policies/stability_policy.html#Property_Value
> > The enumeration of General_Category property values is fixed. No new values will be added.
> >
> > It sounds like one of the things you are asking is whether Unicode would change the General_Category of #%‰&@... from Po to So. Is that right?
> >
> > General_Category values can be changed, but the distinction between punctuation and symbols is fuzzy, and changing gc values for commonly used characters (especially ASCII) can be very disruptive.
> >
> > markus
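
For readers following the thread, the mismatch Addison and Markus describe can be checked with a short Python sketch. It uses only the standard `unicodedata` module. The helper names (`naive_skip`, `skip_with_override`) and the override set are hypothetical, introduced here for illustration; the set is seeded only with the five characters Markus lists and is not the CSS list or any Unicode property. Results depend on the Unicode version bundled with the Python build.

```python
import unicodedata

# The characters Markus lists in the thread: # % ‰ & @
EXAMPLES = "#%‰&@"

# Hypothetical stand-in for a "NotReallyPunctuation"-style override.
# Seeded only with the thread's examples; not an official list.
NOT_REALLY_PUNCTUATION = frozenset(EXAMPLES)


def naive_skip(ch: str) -> bool:
    """Skip emphasis marks for anything whose General_Category is P*."""
    return unicodedata.category(ch).startswith("P")


def skip_with_override(ch: str) -> bool:
    """Same rule, but keep emphasis marks on the override characters."""
    return naive_skip(ch) and ch not in NOT_REALLY_PUNCTUATION


if __name__ == "__main__":
    for ch in EXAMPLES:
        print(
            f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
            f"gc={unicodedata.category(ch)}, "
            f"naive_skip={naive_skip(ch)}, "
            f"skip_with_override={skip_with_override(ch)}"
        )
```

Under the naive rule these characters are treated like commas and full stops, which is the behavior CSS wanted to avoid. The override set shows the kind of exception list that someone would have to maintain, whether as a UCD property or in CLDR.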
Received on Monday, 27 May 2024 23:57:27 UTC