RE: [CSS21] out of range unicode escapes

The missing glyph is a rendering artifact. When one copies and pastes,
one should get the badly formed backing store, not what is
rendered.

Paul

-----Original Message-----
From: www-style-request@w3.org [mailto:www-style-request@w3.org] On
Behalf Of Chris Lilley
Sent: Tuesday, February 20, 2007 5:24 AM
To: Bert Bos
Cc: Bjoern Hoehrmann; www-style@w3.org
Subject: Re: [CSS21] out of range unicode escapes


On Monday, February 19, 2007, 5:18:15 PM, Bert wrote:

BB> On Friday 12 January 2007 16:35, Paul Nelson (ATC) wrote:
>> Any data outside the range of valid Unicode is not defined. To be
>> consistent with handling bad UTF-8, we should probably specify
>> changing it into the replacement character.
>>
>> Paul
>>
>> -----Original Message-----
>> From: www-style-request@w3.org [mailto:www-style-request@w3.org] On
>> Behalf Of Bjoern Hoehrmann
>> Sent: Friday, January 12, 2007 6:52 AM
>> To: www-style@w3.org
>> Subject: [CSS21] out of range unicode escapes
>>
>>
>> Hi,
>>
>>   The current CSS 2.1 draft does not address handling of Unicode
>> escapes that appear to be above U+10FFFF like \FFFFFF. Such a
>> sequence could be interpreted as 5-digit escape followed by 'F', or
>> be considered invalid, or handled as if it was the replacement
>> character \FFFD, or in other ways. Implementations do not agree on
>> how to handle this case.
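
To make the ambiguity concrete, here is a minimal Python sketch of the
three readings listed above. The helper name `candidate_readings` is
hypothetical, and the sketch assumes the hex digits of the escape have
already been isolated; it is not taken from any implementation.

```python
MAX_CODE_POINT = 0x10FFFF   # highest code point Unicode allows
REPLACEMENT = "\uFFFD"      # U+FFFD REPLACEMENT CHARACTER

def candidate_readings(hex_digits):
    """Return the possible readings of an escape's hex digits (hypothetical helper)."""
    value = int(hex_digits, 16)
    if value <= MAX_CODE_POINT:
        # In range: there is nothing ambiguous to resolve.
        return {"valid": chr(value)}
    # First five digits form the escape, the rest is literal text.
    head, tail = hex_digits[:5], hex_digits[5:]
    return {
        "five_digit_plus_literal": chr(int(head, 16)) + tail,
        # The whole escape is treated as invalid.
        "invalid": None,
        # The escape is handled as if it were U+FFFD.
        "replacement": REPLACEMENT,
    }
```

Calling `candidate_readings("FFFFFF")` returns all three readings side
by side, which is the disagreement between implementations described
above.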

BB> The CSS WG discussed the issue and decided only on the principle that a
BB> UA that displays the character in any way *should* display some visible
BB> symbol, similar to how it should handle legal characters for which no
BB> font is available.

BB> The next draft will contain this paragraph at the end of the 3rd bullet
BB> in 4.1.3:

BB>     If the number is outside the range allowed by Unicode (e.g.,
BB>     "\110000" is above the maximum 10FFFF allowed in current Unicode),
BB>     the UA may replace the escape with the "replacement character"
BB>     (U+FFFD). If the character is to be displayed, the UA should show a
BB>     visible symbol, such as a "missing character" glyph (cf. 15.2, point
BB>     5).
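
As a rough illustration of the proposed wording, the Python sketch below
decodes hex escapes in a string and takes the option the draft permits
("may replace") by substituting U+FFFD for out-of-range values. The
function name `decode_escapes` and the regular expression are
assumptions made for this example; they are not part of the draft, and
the sketch does not handle other special cases such as zero or
surrogate code points.

```python
import re

MAX_CODE_POINT = 0x10FFFF

# A backslash, one to six hex digits, then an optional single whitespace
# terminator (a CR/LF pair counts as one).
_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?")

def decode_escapes(text):
    """Decode hex escapes; out-of-range values become U+FFFD (assumed policy)."""
    def replace(match):
        value = int(match.group(1), 16)
        if value > MAX_CODE_POINT:
            # Outside the Unicode range: use the replacement character.
            return "\uFFFD"
        return chr(value)
    return _ESCAPE.sub(replace, text)
```

With this policy, `decode_escapes(r"\110000")` yields the single
character U+FFFD. How that character is then displayed (for example as
a "missing character" glyph) is a separate rendering question.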

BB> Please let us know if this solves the issue.

BB> [For reference: we put this issue in the planned "disposition of 
BB> comments" document as "issue 19."]

If you copy a section of text which includes this 'missing glyph' and
paste the characters into a text editor, what character do you get
there?



-- 
 Chris Lilley                    mailto:chris@w3.org
 Interaction Domain Leader
 Co-Chair, W3C SVG Working Group
 W3C Graphics Activity Lead
 Co-Chair, W3C Hypertext CG

Received on Monday, 19 February 2007 21:32:12 UTC