Re: I18N-ISSUE-505: intro discussion of diacritic matching [find-text]

From: Najib Tounsi <ntounsi@gmail.com>
Date: Thu, 22 Oct 2015 23:21:19 +0000
Message-ID: <56296F6F.7020307@emi.ac.ma>
To: Internationalization Working Group <public-i18n-core@w3.org>

On 10/22/15 3:15 PM, Internationalization Working Group Issue Tracker wrote:
> I18N-ISSUE-505: intro discussion of diacritic matching [find-text]
>
> http://www.w3.org/International/track/issues/505
>
> Raised by: Richard Ishida
> On product: find-text
>
> Intro has sentence:
>
> --
> Browsers do not typically match language patterns that may be found in non-Latin character sets, including collapsed Unicode character sequences, optional diacritical marks, or similar features, such as matching o to ó, ö, ø, and oe.
> --
>
> This seems problematic.

Yes. I've run some tests in Arabic (same word, different diacritics).
Browsers do not give the same results for find-text. In Firefox, some words
match one to one, while others are not matched at all (words with a sequence
of two diacritics, such as shadda+fatha).
In Safari, all words with diacritics match one to one. But searching for a
word WITHOUT diacritics matches every instance of that word with diacritics
(which may be desirable).

On mobile, the behavior is different again.
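
To make the Safari-like behavior concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any browser is known to implement find-text. The helper names (strip_marks, has_marks, matches) are hypothetical, and treating every nonspacing mark (Unicode category Mn) as a "diacritic" is a simplification. The rule sketched: a query that carries diacritics must match exactly, while a query without diacritics matches the word whatever diacritics it carries. Both sides are normalized first, so the order in which two marks on the same letter were typed (shadda then fatha, or fatha then shadda) does not affect the comparison.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop nonspacing marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def has_marks(text: str) -> bool:
    """True if the text carries at least one nonspacing mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.category(ch) == "Mn" for ch in decomposed)


def matches(query: str, word: str) -> bool:
    """Hypothetical matching rule (an assumption, not a browser API):
    - query with diacritics: exact match after NFC normalization
    - query without diacritics: match with diacritics ignored
    """
    if has_marks(query):
        return unicodedata.normalize("NFC", query) == unicodedata.normalize("NFC", word)
    return strip_marks(query) == strip_marks(word)


# Same consonant skeleton k-t-b, different diacritics.
bare = "\u0643\u062A\u0628"                    # no diacritics
kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"  # fatha on each letter
kutub = "\u0643\u064F\u062A\u064F\u0628"         # damma on the first two letters

print(matches(bare, kataba))    # bare query finds the vocalized word
print(matches(bare, kutub))     # ... and the differently vocalized one
print(matches(kataba, kutub))   # vocalized query does not find the other word

# Shadda (U+0651) and fatha (U+064E) on the same letter, in either typing order.
shadda_first = "\u0645\u062F\u0651\u064E"
fatha_first = "\u0645\u062F\u064E\u0651"
print(matches(shadda_first, fatha_first))
```

A caveat on the design: stripping all Mn marks after decomposition also removes marks that are arguably part of the letter (for example the hamza on a precomposed alef with hamza), so a real implementation would need a more careful, language-aware list of ignorable marks.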

> Maybe this is a feature that is desirable?

Agreed. Diacritics change the meaning of words. One may want to find a
word with particular diacritics and not the others.

Najib

>   Some mobile implementations may do this.
>
Received on Thursday, 22 October 2015 22:18:54 UTC