- From: Henri Sivonen <hsivonen@niksula.hut.fi>
- Date: Sun, 15 Sep 2002 14:55:17 +0300
- To: www-style@w3.org
On Saturday, Aug 31, 2002, at 16:02 Europe/Helsinki, Ian Hickson wrote:

> On Fri, 30 Aug 2002, Peter Sheerin wrote:
>>
>> I believe that the fonts module must specify a suggested behavior
>> when faced with a document that specifies a character it can not
>> render because no glyph is available.
>
> I agree.

I agree that it would be good to discourage the use of the question mark as a generic surrogate. However, I think the implementation details should follow the OS practice and not be normatively specified in CSS when there is an OS practice.

* Using the OS practice helps the user understand that a missing character is being represented, because the behavior is consistent with the behavior of other apps.
* The OS text engine may do the fallback internally, which is likely to be more efficient than application-side fallback.
* The OS practice may be better than using a single replacement character, but the better method may not be available on all target platforms of CSS. For example, Mac OS X comes with a last resort font that contains generic fallback characters for each Unicode block. With the last resort font the user has a better idea about what is missing.
  Screenshot: http://www.niksula.cs.hut.fi/~hsivonen/typography/last-resort.png

I'd like to see something like this in the spec:

User agents MUST NOT use U+003F QUESTION MARK as a fallback representation when a glyph for a given character is missing. If a user agent is running on a platform that has a convention specifically designed for representing Unicode characters for which glyphs are unavailable, the user agent SHOULD follow the platform convention. Otherwise, user agents SHOULD use U+FFFD REPLACEMENT CHARACTER as the fallback representation.

>> Also, the set of characters specified in the current HTML DTDs is
>> not really sufficient to display many important characters, [...]
>
> HTML4 references ISO10646 which means it has every UNICODE character.
> Ditto XML. Do you want HTML to have actual _named entities_ for all
> 16000+ characters? That simply doesn't scale.

I'm inclined to consider named character entities harmful, because they move an input problem to the user agent that is displaying the document and increase parsing complexity by requiring the XML parser to process the external DTD subset (in the usual case) even when the document could otherwise be treated as a standalone document. I think dealing with the issue on the input method level on the author's system makes more sense.

-- 
Henri Sivonen
hsivonen@niksula.hut.fi
http://www.hut.fi/u/hsivonen/
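
The fallback rule proposed in the message above could be sketched roughly as follows. This is only an illustration, not part of the proposal: the function name `fallback_representation` and the `has_glyph` and `platform_fallback` callbacks are hypothetical stand-ins for whatever a user agent's text engine actually provides, and a real platform convention (such as a last resort font) would supply a glyph rather than a string.

```python
from typing import Callable, Optional

QUESTION_MARK = "\u003F"          # U+003F, must not be used as a fallback
REPLACEMENT_CHARACTER = "\uFFFD"  # U+FFFD, the suggested default fallback


def fallback_representation(
    char: str,
    has_glyph: Callable[[str], bool],                      # hypothetical font lookup
    platform_fallback: Optional[Callable[[str], str]] = None,  # hypothetical OS convention
) -> str:
    """Pick what to show for `char` under the proposed rule."""
    if has_glyph(char):
        # A glyph is available, so no fallback is needed.
        return char
    if platform_fallback is not None:
        # Prefer the platform convention when one exists (SHOULD).
        candidate = platform_fallback(char)
        # Never let a question mark through as the fallback (MUST NOT).
        if candidate != QUESTION_MARK:
            return candidate
    # No usable platform convention: use U+FFFD (SHOULD).
    return REPLACEMENT_CHARACTER
```

With no platform callback supplied, a character without a glyph comes back as U+FFFD; with one supplied, its result is used unless it is a bare question mark.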
Received on Sunday, 15 September 2002 07:55:58 UTC