The mapping currently defined for rang to U+27E9 (MATHEMATICAL RIGHT ANGLE
BRACKET) is definitely the correct one, and if the definition of rang is
changed at all from its HTML4 definition it should be to this and not to
U+3009.

The original definition was to U+232A (RIGHT-POINTING ANGLE BRACKET),
which is in the U+23xx block of technical symbols, not CJK punctuation,
and the name "rang" comes from the ISOTECH entity set for technical
publishing. So this entity has always been intended for mathematical use.

Unfortunately Unicode later specified a canonical mapping of this
character to U+3009, which is definitely intended for CJK punctuation.
This makes the U+232A character essentially unstable, and Unicode has
corrected the error as far as possible by deprecating that character and
introducing a new one in the mathematical symbols block that is the same
as the original U+232A character but without the unfortunate canonical
mapping to the U+3xxx range.
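
The effect of the canonical mapping can be checked with a short Python
sketch. It assumes the Unicode data shipped with the standard
unicodedata module; the constant names and the helper
survives_normalization are only illustrative and not part of any spec.

```python
import unicodedata

# Illustrative names for the three characters under discussion.
OLD_RANG = "\u232A"  # RIGHT-POINTING ANGLE BRACKET (HTML4 rang)
NEW_RANG = "\u27E9"  # MATHEMATICAL RIGHT ANGLE BRACKET
CJK_RANG = "\u3009"  # RIGHT ANGLE BRACKET (CJK punctuation)


def survives_normalization(ch: str) -> bool:
    """Return True if ch is unchanged by all four normalization forms."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return all(unicodedata.normalize(form, ch) == ch for form in forms)


if __name__ == "__main__":
    for ch in (OLD_RANG, NEW_RANG):
        nfc = unicodedata.normalize("NFC", ch)
        print(
            f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
            f"NFC gives U+{ord(nfc):04X}, "
            f"stable={survives_normalization(ch)}"
        )
```

Any process that normalizes text will therefore silently turn the old
character into the CJK one, while the newer mathematical character
passes through untouched.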

So the character that is currently in Unicode to support the intended
usage of rang is U+27E9. However, as Alexey says, any change of
definition is likely to cause some pages to show missing-glyph symbols
until font tables catch up. So not changing anything is another
alternative that could be justified. Changing to U+3009 would be
incorrect: it is precisely to avoid the association with U+3009 that
Unicode introduced a new character.
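
As a rough illustration of the two definitions, Python's standard
library happens to carry both entity tables: html.entities.name2codepoint
holds the HTML4 names and html.entities.html5 the HTML5 ones. This is
only a sketch of how one might compare them; the variable names are
made up for the example.

```python
import html
from html.entities import html5, name2codepoint

# HTML4 definition of rang (table is keyed by bare entity name).
html4_rang = chr(name2codepoint["rang"])

# HTML5 definition of rang (table is keyed by name plus semicolon).
html5_rang = html5["rang;"]

# html.unescape resolves named references using the HTML5 table.
resolved = html.unescape("&rang;")

if __name__ == "__main__":
    print(f"HTML4 rang:    U+{ord(html4_rang):04X}")
    print(f"HTML5 rang:    U+{ord(html5_rang):04X}")
    print(f"html.unescape: U+{ord(resolved):04X}")
```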

On balance I think HTML5 is right to change the entity definitions.
Increasingly, authoring software will generate documents just using
character data (say UTF-8 encoded) directly, and so rendering engines will
get documents containing the "new" U+27E9 character whatever you (or I)
do about the rang entity. Fonts will need to have updated tables, but
in a world where new characters are still being added to Unicode, and
XML and HTML5 allow those characters to be used, there is always a
danger that not all documents can be rendered on all systems.


Received on Friday, 23 May 2008 11:45:36 UTC