Re: Standardizing on IDNA 2003 in the URL Standard from John Cowan on 2013-08-21 (uri@w3.org from August 2013)

From: John Cowan <cowan@mercury.ccil.org>
Date: Wed, 21 Aug 2013 17:30:12 -0400
To: John C Klensin <klensin@jck.com>
Cc: Shawn Steele <Shawn.Steele@microsoft.com>, Gervase Markham <gerv@mozilla.org>, "Jungshik SHIN (신정식)" <jshin1987@gmail.com>, Simon Montagu <smontagu@smontagu.org>, public-iri@w3.org, uri@w3.org, idna-update@alvestrand.no, Peter Saint-Andre <stpeter@stpeter.im>, Anne van Kesteren <annevk@annevk.nl>, "www-tag.w3.org" <www-tag@w3.org>
Message-ID: <20130821213012.GF23853@mercury.ccil.org>

John C Klensin scripsit:

> That would have meant no separate code point for a final sigma in
> Greek;

See my earlier post for why that's not possible.

> no separate code points for final Kaf, Mem, Nun, Pe, or Tsadi in
> Hebrew;

This is even less possible.  In Hebrew, a pe at the end of a word is
always /f/.  But in Yiddish, there is a contrast between /p/ and /f/ in
final position that Hebrew does not have.  To write final /p/, Yiddish
puts the non-final form of pe at the end of the word, sometimes but not
always with a dagesh (embedded dot).
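
To make that concrete, here is a minimal sketch of what a purely
positional rule would do.  The table and the function name
naive_finalize are made up for illustration and come from no
specification; the Yiddish spelling of "kop" (head) is given without
the optional dagesh, and is my assumption about a typical spelling.

```python
# Hypothetical positional rule: pick the final form from position alone.
# Code points are from the Unicode Hebrew block.
FINAL_FORMS = {
    "\u05DB": "\u05DA",  # kaf   -> final kaf
    "\u05DE": "\u05DD",  # mem   -> final mem
    "\u05E0": "\u05DF",  # nun   -> final nun
    "\u05E4": "\u05E3",  # pe    -> final pe
    "\u05E6": "\u05E5",  # tsadi -> final tsadi
}


def naive_finalize(word: str) -> str:
    """Swap a word-final letter for its final form, ignoring language."""
    if word and word[-1] in FINAL_FORMS:
        return word[:-1] + FINAL_FORMS[word[-1]]
    return word


# Hebrew "kesef" (money): ends in /f/, so final pe is what we want.
hebrew_kesef = "\u05DB\u05E1\u05E4"   # kaf, samekh, pe
# Yiddish "kop" (head): ends in /p/, so non-final pe must stay as it is.
yiddish_kop = "\u05E7\u05D0\u05E4"    # qof, alef, pe

print(naive_finalize(hebrew_kesef) == "\u05DB\u05E1\u05E3")  # correct
print(naive_finalize(yiddish_kop) == yiddish_kop)            # False
```

The rule gets the Hebrew word right and silently rewrites the Yiddish
one into a spelling that reads with /f/.  Nothing in the string itself
says which language it is in, which is where the trouble starts.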

In short, choosing Greek and Hebrew positional variants automatically
requires AI-hard algorithms.

-- 
John Cowan  cowan@ccil.org  http://ccil.org/~cowan
If I have seen farther than others, it is because I am surrounded by dwarves.
        --Murray Gell-Mann

Received on Wednesday, 21 August 2013 21:30:41 UTC