- From: Paul Nelson (ATC) <paulnel@winse.microsoft.com>
- Date: Mon, 19 Feb 2007 13:56:20 -0800
- To: Chris Lilley <chris@w3.org>
- CC: Bert Bos <bert@w3.org>, Bjoern Hoehrmann <derhoermi@gmx.net>, <www-style@w3.org>
I concur. The more we can follow processing as defined by Unicode (e.g. using U+FFFD) the better common behavior we can have across UAs, and the less we have to put into our specs about such processing that needs to be maintained to keep in sync with Unicode processing standards.

The challenge with converting to U+FFFD during reading the document in from the source is that the backing store will then contain the U+FFFD instead of the malformed stream...which may or may not be okay.

Paul

-----Original Message-----
From: Chris Lilley [mailto:chris@w3.org]
Sent: Tuesday, February 20, 2007 5:50 AM
To: Paul Nelson (ATC)
Cc: Bert Bos; Bjoern Hoehrmann; www-style@w3.org
Subject: Re: [CSS21] out of range unicode escapes

On Monday, February 19, 2007, 10:32:47 PM, Paul wrote:

PNA> The missing glyph is a rendering artifact. When one copies and pastes
PNA> they should be getting the badly formed backing store, not what is
PNA> rendered.

Yes, I was aware of the difference between the backing store and the rendering. That is what prompted my question.

There is a malformed css stylesheet, which contributes {something} to the backing store. The rendering of {something} is described; the {something} itself is not described by the proposed text.

To amplify what I take to be your proposal below, U+FFFD is "replacement character", is noted by Unicode as "used to represent an incoming character whose value is unknown or unrepresentable in Unicode" and would thus be suitable for this purpose.

http://www.unicode.org/charts/PDF/UFFF0.pdf

I would much rather see the processing of a malformed escape in terms of what character is used (its rendering then being what the appropriate font does for replacement character) rather than some CSS-specific alternative defined only in terms of how it renders visually.
PNA> Paul

PNA> -----Original Message-----
PNA> From: www-style-request@w3.org [mailto:www-style-request@w3.org] On Behalf Of Chris Lilley
PNA> Sent: Tuesday, February 20, 2007 5:24 AM
PNA> To: Bert Bos
PNA> Cc: Bjoern Hoehrmann; www-style@w3.org
PNA> Subject: Re: [CSS21] out of range unicode escapes

PNA> On Monday, February 19, 2007, 5:18:15 PM, Bert wrote:

BB>> On Friday 12 January 2007 16:35, Paul Nelson (ATC) wrote:

>>> Any data outside the range of valid Unicode is not defined. To be
>>> consistent with handling bad UTF-8, we should probably specify
>>> changing it into the replacement character.
>>>
>>> Paul
>>>
>>> -----Original Message-----
>>> From: www-style-request@w3.org [mailto:www-style-request@w3.org] On
>>> Behalf Of Bjoern Hoehrmann
>>> Sent: Friday, January 12, 2007 6:52 AM
>>> To: www-style@w3.org
>>> Subject: [CSS21] out of range unicode escapes
>>>
>>>
>>> Hi,
>>>
>>> The current CSS 2.1 draft does not address handling of Unicode
>>> escapes that appear to be above U+10FFFF like \FFFFFF. Such a
>>> sequence could be interpreted as 5-digit escape followed by 'F', or
>>> be considered invalid, or handled as if it was the replacement
>>> character \FFFD, or in other ways. Implementations do not agree on
>>> how to handle this case.

BB>> The CSS WG discussed the issue and decided only on the principle that a
BB>> UA that displays the character in any way *should* display some visible
BB>> symbol, similar to how it should handle legal characters for which no
BB>> font is available.

BB>> The next draft will contain this paragraph at the end of the 3rd bullet
BB>> in 4.1.3 :

BB>>   If the number is outside the range allowed by Unicode (e.g.,
BB>>   "\110000" is above the maximum 10FFFF allowed in current Unicode),
BB>>   the UA may replace the escape with the "replacement character"
BB>>   (U+FFFD). If the character is to be displayed, the UA should show a
BB>>   visible symbol, such as a "missing character" glyph (cf. 15.2, point
BB>>   5).

BB>> Please let us know if this solves the issue.

BB>> [For reference: we put this issue in the planned "disposition of
BB>> comments" document as "issue 19."]

PNA> If you copy a section of text which includes this 'missing glyph' and
PNA> paste the characters into a text editor, what character do you get
PNA> there?

--
Chris Lilley mailto:chris@w3.org
Interaction Domain Leader
Co-Chair, W3C SVG Working Group
W3C Graphics Activity Lead
Co-Chair, W3C Hypertext CG
Received on Monday, 19 February 2007 21:55:57 UTC
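
For illustration only, here is a rough Python sketch of the behaviour discussed in this thread: a hex escape whose value exceeds 10FFFF is replaced with U+FFFD at parse time, so the backing store (and therefore anything copied and pasted) holds U+FFFD rather than the malformed escape. The function name `decode_hex_escapes` and the constants are hypothetical, not taken from any UA or from the CSS 2.1 text. The sketch assumes the escape syntax of one to six hex digits followed by an optional single whitespace terminator, and it does not treat zero or surrogate values specially.

```python
import re

MAX_CODE_POINT = 0x10FFFF   # highest code point Unicode allows
REPLACEMENT = "\uFFFD"      # U+FFFD REPLACEMENT CHARACTER

# Assumed escape syntax: backslash, 1-6 hex digits, then optionally one
# whitespace terminator ("\r\n" counts as a single terminator).
_HEX_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?")


def decode_hex_escapes(text: str) -> str:
    """Replace hex escapes in `text` with the characters they denote.

    Escapes above U+10FFFF become U+FFFD, so the stored string
    contains the replacement character itself.
    """
    def _replace(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        if code_point > MAX_CODE_POINT:
            return REPLACEMENT
        return chr(code_point)

    return _HEX_ESCAPE.sub(_replace, text)


if __name__ == "__main__":
    for sample in (r"\110000", r"\FFFFFF", r"\10FFFF", r"\41 B"):
        decoded = decode_hex_escapes(sample)
        print(sample, "->", [f"U+{ord(ch):04X}" for ch in decoded])
```

Under this sketch, "\FFFFFF" is read as one six-digit escape and stored as U+FFFD, not as a five-digit escape followed by 'F'. That is only one of the interpretations listed in the original report; a UA that keeps the malformed escape in its backing store and substitutes only at render time would behave differently on copy and paste.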