Re: [Encoding] false statement [I18N-ACTION-328][I18N-ISSUE-374]

John C Klensin scripsit:

> institutionalizing incompatibility has rarely turned out to be a good
> idea.  Sometimes it happens and we have to work around it, sometimes
> those workarounds are successful, but even that rarely changes the
> "bad idea" part.

Both the IETF and the W3C have tried to eliminate HTML/SGML
incompatibility, but it's beyond the power of either (or any other
organization) to do so, because of the dead weight of legacy.

> If I had a script that wasn't supported by Unicode, I'd be unlikely
> to write a proposal to get it coded and then sit around waiting for
> years waiting for them to do it.  However, I would write the proposal
> and, when I created an interim system, I'd try to make sure there was
> a migration plan and, ideally, that my interim system didn't conflict
> with anyone else's.

Sure, if you're operating in a closed system that's the Right Thing, but
the web is an open system.  Font-kludges are the best of a bad set of
solutions.

> go back to 2022

Say it ain't so, Joe!

> If the Unicode Consortium understands and is convinced that this has
> become a serious problem, perhaps they could start conditionally
> reserving some blocks for as-yet-uncoded scripts

That is already done: see <http://www.unicode.org/roadmaps>.  But
institutionally the Consortium cannot bless in any way the use of
unassigned codepoints, because the backlash for doing so is severe:
the users insist that the assumptions they have made must be supported
going forward, unless they are told repeatedly "Don't even try to use
unassigned codepoints, on peril of your souls."
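
For anyone who wants to check which side of that line a given codepoint
falls on, here is a minimal sketch in Python.  It assumes that the
Unicode tables bundled with your Python's standard unicodedata module
are recent enough for your purposes; the function name classify is
mine, purely for illustration, and nothing here is an official
Consortium tool.

```python
import unicodedata

def classify(cp: int) -> str:
    """Coarse status of a code point, per this Python's Unicode tables.

    Hypothetical helper: the name and return labels are illustrative only.
    """
    cat = unicodedata.category(chr(cp))
    if cat == "Cn":
        # General_Category Cn covers reserved (unassigned) code points
        # and noncharacters such as U+FFFF.
        return "unassigned"
    if cat == "Co":
        return "private-use"
    if cat == "Cs":
        return "surrogate"
    return "assigned"

if __name__ == "__main__":
    for cp in (0x41, 0xE000, 0xFFFF):
        print(f"U+{cp:04X}: {classify(cp)}")
```

Note that the answer for a reserved codepoint can change from one
Unicode version to the next, which is exactly why building on one is
unsafe; the private-use ranges are the only ones guaranteed never to be
assigned.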

> Any of those approaches (at least the ones I can think of) would be
> very ugly, but far preferable to disguising a lot of one-off font
> tricks or pseudo-Unicode, with potentially overlapping code points,
> as Standard UTF-8 and hoping that the end systems can sort out what
> is going on without any in-stream clues.  That just leads to a very
> fragmented environment in which people cannot communicate... or worse.

Actually, people using language X through a single font-kludge or
an array of kludge fonts *can* communicate with others in the same
situation.  It's the broader world that doesn't know what they are doing
with them.

> If the official Unicode Consortium position were really "people should
> just wait to use their languages until we get around to assigning code
> points and we reserve the right to take as many years as we like"

I of course am in no way official, but I would reword the Unicode
Consortium position as "People will just have to wait to use Unicode
to represent their languages until we have the time, money, and energy
to design a suitable encoding that can persist for centuries.  Given
the current backlog of scripts, that may take a very long time,
unfortunately for all."  What is more, there is a legitimate concern
among at least certain members of the Consortium that their funders will
decide at some point that all "commercially necessary" languages have
been encoded, and money for the rest will not be forthcoming.  So far
that hasn't happened, but "life is not a stranger to uncertainty".

If anyone thinks this uncertainty is a Bad Thing and wants to put
their money where their mouth is, the Script Encoding Initiative
<http://linguistics.berkeley.edu/~dwanders> has facilities for
accepting donations toward the encoding of minority and historic
scripts.

-- 
John Cowan          http://www.ccil.org/~cowan        cowan@ccil.org
Nobody expects the RESTifarian Inquisition!  Our chief weapon is
surprise ... surprise and tedium  ... tedium and surprise ....
Our two weapons are tedium and surprise ... and ruthless disregard
for unpleasant facts....  Our three weapons are tedium, surprise, and
ruthless disregard ... and an almost fanatical devotion to Roy Fielding....

Received on Monday, 1 September 2014 01:16:57 UTC