Re: Java implementations of RFC 4647 extended filtering

From: John Cowan <cowan@mercury.ccil.org>
Date: Tue, 6 May 2014 13:50:16 -0400
To: Jeremy J Carroll <jjc@syapse.com>
Cc: "Phillips, Addison" <addison@lab126.com>, "Mark Davis ☕ (mark@macchiato.com)" <mark@macchiato.com>, "www-international@w3.org" <www-international@w3.org>
Message-ID: <20140506175016.GQ5011@mercury.ccil.org>
Jeremy J Carroll scripsit:

> if I have a fr-CN analyzer, and text tagged as fr-LATN-CN then the
> lookup algorithm fails and the filtering algorithm would not.

That's true, which is why we have the admittedly incomplete
Suppress-Script information in the LSTR; you can look for "fr-Latn"
and change it to "fr" before matching.
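
A minimal sketch of that normalization followed by RFC 4647 lookup
(section 3.4), in Python for brevity. The SUPPRESS_SCRIPT table is a
hand-copied excerpt rather than the full registry, and the function names
are hypothetical; the script subtag is only recognized directly after the
primary language subtag, so tags with an extlang are not handled.

```python
# Hand-copied excerpt of Suppress-Script values; the authoritative data is
# the IANA Language Subtag Registry. This two-entry table is an assumption
# made for illustration only.
SUPPRESS_SCRIPT = {"fr": "Latn", "en": "Latn"}


def strip_suppressed_script(tag):
    """Drop the script subtag when it equals the language's Suppress-Script."""
    subtags = tag.split("-")
    # Simplification: only look for a 4-letter script in second position.
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        suppressed = SUPPRESS_SCRIPT.get(subtags[0].lower())
        if suppressed and subtags[1].lower() == suppressed.lower():
            del subtags[1]
    return "-".join(subtags)


def lookup(language_range, available):
    """RFC 4647 lookup: truncate the range from the end until a tag matches."""
    by_lower = {tag.lower(): tag for tag in available}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_lower:
            return by_lower[candidate]
        subtags.pop()
        # A trailing single-character subtag (such as "x") is removed too.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return None


analyzers = ["fr-CN"]
print(lookup("fr-Latn-CN", analyzers))
print(lookup(strip_suppressed_script("fr-Latn-CN"), analyzers))
```

Without the normalization, the range is truncated to "fr-Latn" and then
"fr", neither of which is an available analyzer, so lookup finds nothing;
after it, the range is "fr-CN" and matches directly.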

With filtering, though, if you have "fr" text and all three analyzers,
you will get "fr-FR" and "fr-CA" returned, with no guidance about which
to use.
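
To illustrate, here is a sketch of RFC 4647 extended filtering
(section 3.3.2) in Python. The function names and the analyzer list are
assumptions for the example, not anyone's actual implementation.

```python
def extended_match(language_range, tag):
    """Return True if tag matches the extended language range."""
    r = language_range.lower().split("-")
    t = tag.lower().split("-")
    # The first subtags must match, unless the range starts with "*".
    if r[0] != "*" and r[0] != t[0]:
        return False
    ri, ti = 1, 1
    while ri < len(r):
        if r[ri] == "*":
            ri += 1              # a wildcard matches any run of subtags
        elif ti >= len(t):
            return False         # range subtags remain but the tag is used up
        elif r[ri] == t[ti]:
            ri += 1
            ti += 1
        elif len(t[ti]) == 1:
            return False         # never skip past a singleton such as "x"
        else:
            ti += 1              # skip an unmatched tag subtag, e.g. a script
    return True


def extended_filter(language_range, available):
    """Return every available tag that matches, in the order given."""
    return [tag for tag in available if extended_match(language_range, tag)]


# Assumed analyzer list, for illustration only.
analyzers = ["fr-FR", "fr-CA", "fr-CN"]
print(extended_filter("fr", analyzers))
print(extended_match("fr-CN", "fr-Latn-CN"))
```

The range "fr-CN" does match the tag "fr-Latn-CN" here, because the script
subtag is skipped. The range "fr", however, matches every French analyzer
in the list, and the result carries no ranking beyond the order in which
the analyzers happened to be supplied.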

John Cowan          http://www.ccil.org/~cowan        cowan@ccil.org
Mark Twain on Cecil Rhodes: I admire him, I freely admit it,
and when his time comes I shall buy a piece of the rope for a keepsake.
Received on Tuesday, 6 May 2014 17:50:38 UTC