>> Should we really have such a method on document though? And what is
>> the reason for using U+.... in the first place. Can't we just always
>> return the Unicode scalar value?
> The Unicode scalar value is the "U+xxxx" format (the code point).   
> You might have meant the character value.  We have already decided  
> that the character value (if it exists) will be the attribute value.
> There are potential use cases for getting each of the different  
> formats (for example, for Unicode code points, making sure that a  
> character is in a certain range, or presenting an advanced virtual  
> keyboard, or signaling non-printing diacritics).

If you get the character as a string, you can very easily get the
Unicode value of that character as a number in almost any reasonable
programming language. It's actually harder to parse the value out of the
U+xxxx format. The conversion is only useful if there are keys with a
U+xxxx equivalent where the name is not just that very Unicode character.
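
To illustrate the difference, here is a rough sketch in Python (not tied to any proposed DOM API; both function names are made up for this example). Going from the character to its number is a single built-in call, while the U+xxxx string has to be validated and parsed first. I am assuming 4 to 6 hex digits after the "U+" prefix.

```python
import string


def code_point_from_char(ch: str) -> int:
    """Return the Unicode code point of a one-character string."""
    return ord(ch)


def code_point_from_u_notation(text: str) -> int:
    """Parse a 'U+xxxx' string (hypothetical helper) into its code point."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"missing 'U+' prefix: {text!r}")
    digits = text[2:]
    # Assumption: 4 to 6 hex digits follow the prefix.
    if not 4 <= len(digits) <= 6:
        raise ValueError(f"expected 4 to 6 hex digits: {text!r}")
    # int() alone would also accept signs, spaces and underscores,
    # so check each digit explicitly.
    if any(d not in string.hexdigits for d in digits):
        raise ValueError(f"non-hex digit in: {text!r}")
    value = int(digits, 16)
    if value > 0x10FFFF:
        raise ValueError(f"beyond the Unicode range: {text!r}")
    return value


if __name__ == "__main__":
    print(hex(code_point_from_char("\u00e9")))
    print(hex(code_point_from_u_notation("U+00E9")))
```

Both calls give the same number, but only the second one needs error handling.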


