Re: agenda+ Diacritics in WCAG from MURATA Makoto on 2026-04-08 (public-i18n-core@w3.org from April to June 2026)

From: MURATA Makoto <founder@info-a11y.jp>
Date: Wed, 8 Apr 2026 21:35:50 +0900
To: Andrew Cunningham <lang.support@gmail.com>
Cc: public-i18n-core@w3.org, "Phillips, Addison" <addison@amazon.com>
Message-ID: <CAHw=6E1n5_wiw=zBmJbTWvzHv_RN4yTDSTuYNVU=eoERLJ8u-A@mail.gmail.com>

In PropList.txt of the Unicode Character Database (UCD), the property
Diacritic is assigned to

U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK
and
U+309A COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK

Getting rid of them does not make sense.
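
To illustrate, here is a minimal Python sketch of what a naive
"strip the diacritics" pass does to kana. It is only an illustration, not
anything proposed in the WCAG draft. Python's unicodedata module does not
expose the UCD Diacritic property, so the sketch assumes the common shortcut
of dropping every character whose General_Category is Mn (nonspacing mark);
strip_marks is a hypothetical helper name.

```python
import unicodedata

VOICED = "\u3099"       # COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK
SEMI_VOICED = "\u309A"  # COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK


def strip_marks(text: str) -> str:
    """Naive mark removal: decompose (NFD), drop Mn characters, recompose (NFC).

    Assumption: General_Category == Mn stands in for the Diacritic property,
    which the standard library does not provide.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept)


# Precomposed kana decompose canonically into base kana + sound mark.
assert unicodedata.normalize("NFD", "が") == "か" + VOICED
assert unicodedata.normalize("NFD", "パ") == "ハ" + SEMI_VOICED

# Removing the marks silently replaces each kana with a different kana.
print(strip_marks("がぎぐ"))  # かきく
print(strip_marks("パン"))    # ハン
```

The output is still well-formed Japanese text, but it spells different
syllables (ga/gi/gu become ka/ki/ku), so the information lost is not
optional.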

Regards,
Makoto

On Wed, 8 Apr 2026 at 20:09, Andrew Cunningham <lang.support@gmail.com> wrote:

>
>
> On Tue, 7 Apr 2026 at 03:56, Addison Phillips <addisoni18n@gmail.com>
> wrote:
>
>> It would be useful to know what they are actually trying to achieve.
>> Sometimes "removing diacritics" is a naive thing that (for example)
>> English speakers try to do (because, generally speaking, they are
>> affectations in English).
>>
>
>
> I'd assume they are referring to languages that normally aren't marked,
> but can be marked for pedagogical reasons or to add clarity. Arabic,
> Lithuanian and a range of African languages come to mind.
>
> There are no lists of such languages. Any list would also have to be
> orthography-specific, not just language-specific.
>
> The only language-independent way of achieving this that would also work
> with any tech stack would be to store both versions of the text and
> switch between them.
>
>
>>
>> The meaning of "diacritic" itself is complex. Some diacritics alter or
>> hint the pronunciation of the base letter. Other diacritics are used to
>> form an entirely different letter. Diacritics are not just used with the
>> Latin script. There is also the tendency to confuse "combining mark"
>> with "diacritic". Without knowing what or why, it's difficult to make
>> progress--and there might be better approaches than removing information
>> from the text.
>>
>> Look forward to the conversation.
>>
>> Addison
>>
>> On 4/6/2026 5:39 AM, Fuqiao Xue wrote:
>> > The WCAG 3 Text & Wording subgroup is defining use of diacritics for
>> > languages "where they are optional". Here's their current
>> > draft/working document for that provision:
>> >
>> >
>> > https://docs.google.com/document/d/1z_Xuava_GS-Fwfk4Hg8KYDr1WcjgcuswKmTELukzvwo/edit?usp=sharing
>> >
>> >
>> > They are asking us to help them on principles or practices that may
>> > guide this work.
>> >
>> > Some of the specific concerns are around:
>> >
>> > 1. Identifying the applicable languages. Is there a list, or
>> > especially some programmatic standard to identify those?
>> > 2. How assistive technology actually handles (or should handle!) cases
>> > like this. Is requiring the full-diacritic version the right answer?
>> > 3. Expectations around burden/effort. It was brought up that having
>> > both versions in a datastore, and a user-visible toggle, is a big
>> > change.
>> >
>> > They are happy to answer questions, or have a joint call to talk about
>> > this.
>> >
>> > Any thoughts?
>> >
>> --
>> Internationalization is not a feature.
>> It is an architecture.
>>
>>
>>
>
> --
> Andrew Cunningham
> lang.support@gmail.com
>
>
>

Received on Wednesday, 8 April 2026 12:36:08 UTC