- From: John Cowan <cowan@mercury.ccil.org>
- Date: Thu, 14 Nov 2013 13:29:52 -0500
- To: "Henry S. Thompson" <ht@inf.ed.ac.uk>
- Cc: "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>, "www-tag@w3.org" <www-tag@w3.org>, Paul Hoffman <paul.hoffman@vpnc.org>, Pete Cordell <petejson@codalogic.com>, JSON WG <json@ietf.org>
Henry S. Thompson scripsit:

> (There are, it has to be said, few Unicode characters whose UTF-16-L
> form is 00xx, i.e. U+xx00, the first code point on a code page --
> I had to hunt pretty hard to find the above specimen, which is in
> fact a slight cheat :-)

Many code pages have a gap at the 00 point. There are 68 of them on the
Basic Multilingual Plane. But many characters in other planes involve
such 16-bit code units. For example, all of U+10000 to U+103FF are
encoded as D800 DC00 through D800 DFFF. Currently there are 622
characters in this range alone, and the number will probably grow.

> Not sure about the status of U+4E00, one variant of the ideograph for
> the numeral 1).

Google reports over 3 gigahits for this character.

--
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
To say that Bilbo's breath was taken away is no description at all. There
are no words left to express his staggerment, since Men changed the language
that they learned of elves in the days when all the world was wonderful.
        --The Hobbit
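
For anyone who wants to check the surrogate arithmetic in the message above, here is a minimal Python sketch. The helper names `utf16_code_units` and `utf16le_bytes` are hypothetical, made up for this illustration; the only assumption is the standard UTF-16 surrogate-pair computation. It shows that the first code unit for U+10000 through U+103FF is always D800, which serializes little-endian as the bytes 00 D8.

```python
def utf16_code_units(cp):
    """Return the UTF-16 code units (as ints) for one Unicode scalar value."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % cp)
    if cp < 0x10000:
        # BMP characters are a single 16-bit code unit.
        return [cp]
    # Supplementary characters: split the 20-bit offset into two 10-bit halves.
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # lead (high) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # trail (low) surrogate
    return [high, low]


def utf16le_bytes(cp):
    """Serialize the code units little-endian, low byte first."""
    return b"".join(u.to_bytes(2, "little") for u in utf16_code_units(cp))


if __name__ == "__main__":
    for cp in (0x4E00, 0x10000, 0x103FF):
        units = " ".join("%04X" % u for u in utf16_code_units(cp))
        raw = " ".join("%02X" % b for b in utf16le_bytes(cp))
        print("U+%04X  units: %s  UTF-16LE bytes: %s" % (cp, units, raw))
```

Both U+4E00 and every character in U+10000 to U+103FF begin with a 00 byte in little-endian form, which is the case the quoted text was hunting for.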
Received on Thursday, 14 November 2013 18:30:20 UTC