- From: Chris Lilley <chris@w3.org>
- Date: Mon, 19 Feb 2007 23:16:56 +0100
- To: "Paul Nelson (ATC)" <paulnel@winse.microsoft.com>
- Cc: Bert Bos <bert@w3.org>, Bjoern Hoehrmann <derhoermi@gmx.net>, <www-style@w3.org>
On Monday, February 19, 2007, 10:56:20 PM, Paul wrote:

PNA> I concur. The more we can follow processing as defined by Unicode (e.g.
PNA> using U+FFFD) the better common behavior we can have across UAs, and the
PNA> less we have to put into our specs about such processing that needs to
PNA> be maintained to keep in sync with Unicode processing standards.

PNA> The challenge with converting to U+FFFD during reading the document in
PNA> from the source is that the backing store will then contain the U+FFFD
PNA> instead of the malformed stream...which may or may not be okay.

I think it's preferable to have content generated from a malformed escape
like \22FFFF be U+00FFFD than either the literal string "\22FFFF" (coerced
using some special rule to display as the missing glyph) or alternatively
an invalid code point, which text processing engines then have to be
specially coded to not break on encountering.

PNA> Paul

PNA> -----Original Message-----
PNA> From: Chris Lilley [mailto:chris@w3.org]
PNA> Sent: Tuesday, February 20, 2007 5:50 AM
PNA> To: Paul Nelson (ATC)
PNA> Cc: Bert Bos; Bjoern Hoehrmann; www-style@w3.org
PNA> Subject: Re: [CSS21] out of range unicode escapes

PNA> On Monday, February 19, 2007, 10:32:47 PM, Paul wrote:

PNA>> The missing glyph is a rendering artifact. When one copies and pastes
PNA>> they should be getting the badly formed backing store, not what is
PNA>> rendered.

PNA> Yes, I was aware of the difference between the backing store and the
PNA> rendering. That is what prompted my question.

PNA> There is a malformed css stylesheet, which contributes {something} to
PNA> the backing store. The rendering of {something} is described; the
PNA> {something} itself is not described by the proposed text.
PNA> To amplify what I take to be your proposal below, U+FFFD is
PNA> "replacement character", is noted by Unicode as "used to represent an
PNA> incoming character whose value is unknown or unrepresentable in
PNA> Unicode" and would thus be suitable for this purpose.

PNA> http://www.unicode.org/charts/PDF/UFFF0.pdf

PNA> I would much rather see the processing of a malformed escape in terms
PNA> of what character is used (its rendering then being what the
PNA> appropriate font does for replacement character) rather than some
PNA> CSS-specific alternative defined only in terms of how it renders
PNA> visually.

PNA>> Paul

PNA>> -----Original Message-----
PNA>> From: www-style-request@w3.org [mailto:www-style-request@w3.org] On
PNA>> Behalf Of Chris Lilley
PNA>> Sent: Tuesday, February 20, 2007 5:24 AM
PNA>> To: Bert Bos
PNA>> Cc: Bjoern Hoehrmann; www-style@w3.org
PNA>> Subject: Re: [CSS21] out of range unicode escapes

PNA>> On Monday, February 19, 2007, 5:18:15 PM, Bert wrote:

BB>>> On Friday 12 January 2007 16:35, Paul Nelson (ATC) wrote:

>>>> Any data outside the range of valid Unicode is not defined. To be
>>>> consistent with handling bad UTF-8, we should probably specify
>>>> changing it into the replacement character.
>>>>
>>>> Paul
>>>>
>>>> -----Original Message-----
>>>> From: www-style-request@w3.org [mailto:www-style-request@w3.org] On
>>>> Behalf Of Bjoern Hoehrmann
>>>> Sent: Friday, January 12, 2007 6:52 AM
>>>> To: www-style@w3.org
>>>> Subject: [CSS21] out of range unicode escapes
>>>>
>>>>
>>>> Hi,
>>>>
>>>> The current CSS 2.1 draft does not address handling of Unicode
>>>> escapes that appear to be above U+10FFFF like \FFFFFF. Such a
>>>> sequence could be interpreted as 5-digit escape followed by 'F', or
>>>> be considered invalid, or handled as if it was the replacement
>>>> character \FFFD, or in other ways. Implementations do not agree on
>>>> how to handle this case.
BB>>> The CSS WG discussed the issue and decided only on the principle that a
BB>>> UA that displays the character in any way *should* display some visible
BB>>> symbol, similar to how it should handle legal characters for which no
BB>>> font is available.

BB>>> The next draft will contain this paragraph at the end of the 3rd bullet
BB>>> in 4.1.3:

BB>>>   If the number is outside the range allowed by Unicode (e.g.,
BB>>>   "\110000" is above the maximum 10FFFF allowed in current Unicode),
BB>>>   the UA may replace the escape with the "replacement character"
BB>>>   (U+FFFD). If the character is to be displayed, the UA should show a
BB>>>   visible symbol, such as a "missing character" glyph (cf. 15.2, point
BB>>>   5).

BB>>> Please let us know if this solves the issue.

BB>>> [For reference: we put this issue in the planned "disposition of
BB>>> comments" document as "issue 19."]

PNA>> If you copy a section of text which includes this 'missing glyph' and
PNA>> paste the characters into a text editor, what character do you get
PNA>> there?

--
Chris Lilley mailto:chris@w3.org
Interaction Domain Leader
Co-Chair, W3C SVG Working Group
W3C Graphics Activity Lead
Co-Chair, W3C Hypertext CG
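
For illustration only, the behaviour argued for in this thread (an out-of-range escape contributes U+FFFD to the backing store, rather than the literal escape text or an invalid code point) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not any UA's actual tokenizer: the function name decode_hex_escapes is hypothetical, the escape syntax is simplified to a backslash, one to six hex digits, and one optional trailing whitespace character, and cases the thread does not discuss (such as \0 or surrogate code points) are passed through without special treatment.

```python
import re

MAX_CODE_POINT = 0x10FFFF          # highest code point allowed by Unicode
REPLACEMENT_CHARACTER = "\uFFFD"   # U+FFFD REPLACEMENT CHARACTER

# Simplified escape syntax (an assumption of this sketch): a backslash,
# one to six hex digits, then at most one whitespace character that
# terminates the escape. CRLF is not treated as a single unit here.
_HEX_ESCAPE = re.compile(r"\\([0-9A-Fa-f]{1,6})[ \t\r\n\f]?")


def decode_hex_escapes(text: str) -> str:
    """Hypothetical helper: decode hex escapes, mapping out-of-range
    values to U+FFFD so the decoded text never holds an invalid code point."""

    def replace(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        if code_point > MAX_CODE_POINT:
            # Out of range: store the replacement character itself,
            # not the literal escape text.
            return REPLACEMENT_CHARACTER
        # Note: zero and surrogates are not special-cased in this sketch.
        return chr(code_point)

    return _HEX_ESCAPE.sub(replace, text)


if __name__ == "__main__":
    for sample in (r"\22FFFF", r"\110000", r"\FFFFFF", r"\10FFFF", r"a\26 b"):
        decoded = decode_hex_escapes(sample)
        print(sample, "->", [f"U+{ord(ch):04X}" for ch in decoded])
```

With this approach a copy-and-paste of the rendered text would yield U+FFFD, and downstream text processing never encounters a value above 10FFFF.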
Received on Monday, 19 February 2007 22:17:11 UTC