Re: [Encoding] false statement [I18N-ACTION-328][I18N-ISSUE-374]

John C Klensin scripsit:

>  -- the Standard UTF-8 encoding, i.e., what a
> 	standard-conforming UTF-8 encoder would produce given a
> 	list of code points (assigned or not),  or

AFAIK this is not a big problem today.
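
For anyone who wants to test a given byte stream for this, here is a minimal sketch. It assumes that Python's built-in strict "utf-8" codec is an acceptable stand-in for a standard-conforming decoder. The function name is_standard_utf8 is made up for illustration.

```python
def is_standard_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 codec.

    Assumption: the strict codec rejects ill-formed sequences such as
    overlong forms and encoded surrogates.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_standard_utf8("\u00e9".encode("utf-8")))  # well-formed two-byte sequence
    print(is_standard_utf8(b"\xc0\x80"))               # overlong encoding of U+0000
    print(is_standard_utf8(b"\xed\xa0\x80"))           # encoded surrogate U+D800
```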

>  -- the Unicode code point assignments, i.e., it uses
> 	private code space and/or "squats" on unassigned code
> 	points, perhaps in so-far completely unused or sparsely
> 	populated planes, or

This one is common, and may even involve repurposing existing code point
assignments.  This usually happens because the font author assumes that a
given code point will remain unassigned, or will be assigned in the way
the author expects, and those assumptions turn out to be wrong.
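
A rough way to spot text that depends on such a font is to look for private-use or unassigned code points. The sketch below uses Python's unicodedata module. The function name is hypothetical, and the answer for "unassigned" depends on which Unicode version the Python build ships, which is exactly the moving target described above. It cannot detect repurposed code points that are already assigned.

```python
import unicodedata

# General_Category values of interest: Co = private use, Cn = unassigned
# (Cn also covers noncharacters).
REASONS = {"Co": "private use", "Cn": "unassigned"}


def suspicious_code_points(text):
    """Return (index, 'U+XXXX', reason) for each private-use or unassigned character."""
    found = []
    for i, ch in enumerate(text):
        cat = unicodedata.category(ch)
        if cat in REASONS:
            found.append((i, f"U+{ord(ch):04X}", REASONS[cat]))
    return found


if __name__ == "__main__":
    # U+E000 is the first code point of the BMP Private Use Area.
    print(suspicious_code_points("a\ue000b"))
```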

>  -- established Unicode conventions by combining existing
> 	and standardized Unicode points using conventions (and
> 	perhaps special font support) about how those sequences
> 	are interpreted that are not part of the Unicode
> 	Standard.

This is also a problem; generally it's about using visual rather than
logical order of combining characters.
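
As an illustration only: in logical order a combining mark follows its base character, so a mark with no base in front of it is one symptom of visually ordered text. The sketch below is a crude heuristic under that assumption, with a made-up function name. It only catches marks at the start of the text or after a non-letter, so a visually ordered mark in the middle of a word goes unnoticed, and it ignores format characters such as ZWJ.

```python
import unicodedata


def marks_without_base(text):
    """Return indexes of combining marks not preceded by a letter, digit, or mark."""
    hits = []
    for i, ch in enumerate(text):
        if not unicodedata.category(ch).startswith("M"):
            continue  # not a combining mark
        # A mark at index 0 has nothing before it; otherwise inspect the previous character.
        prev_cat = unicodedata.category(text[i - 1]) if i > 0 else "Zs"
        if prev_cat[0] not in ("L", "M", "N"):
            hits.append(i)
    return hits


if __name__ == "__main__":
    logical = "\u0915\u093f"  # DEVANAGARI LETTER KA, then VOWEL SIGN I
    visual = "\u093f\u0915"   # vowel sign stored first, as it appears on screen
    print(marks_without_base(logical))
    print(marks_without_base(visual))
```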

-- 
John Cowan          http://www.ccil.org/~cowan        cowan@ccil.org
Adam [...] did not want the apple for the apple's sake, he wanted it only
because it was forbidden. The mistake was not forbidding the serpent;
then he would have eaten the serpent. --Mark Twain

Received on Thursday, 28 August 2014 22:23:43 UTC