- From: Henri Sivonen <hsivonen@niksula.hut.fi>
- Date: Sun, 15 Sep 2002 14:55:17 +0300
- To: www-style@w3.org
On Saturday, Aug 31, 2002, at 16:02 Europe/Helsinki, Ian Hickson wrote:
> On Fri, 30 Aug 2002, Peter Sheerin wrote:
>>
>> I believe that the fonts module must specify a suggested behavior
>> when faced with a document that specifies a character it can not
>> render because no glyph is available.
>
> I agree.
I agree that it would be good to discourage the use of the question
mark as a generic surrogate. However, I think the implementation
details should follow the OS practice and not be normatively specified
in CSS when there is an OS practice.
* Using the OS practice helps the user understand that a missing character
  is being represented, because the behavior is consistent with the
  behavior of other apps.
* The OS text engine may do the fallback internally, which is likely to be
  more efficient than application-side fallback.
* The OS practice may be better than using a single replacement character,
  but the better method may not be available on all target platforms of CSS.
For example, Mac OS X comes with a last resort font that contains
generic fallback characters for each Unicode block. With the last
resort font the user has a better idea about what is missing.
Screenshot:
http://www.niksula.cs.hut.fi/~hsivonen/typography/last-resort.png
I'd like to see something like this in the spec:
User agents MUST NOT use U+003F QUESTION MARK as a fallback
representation when a glyph for a given character is missing. If a user
agent is running on a platform that has a convention specifically
designed for representing Unicode characters for which glyphs are
unavailable, the user agent SHOULD follow the platform convention.
Otherwise, user agents SHOULD use U+FFFD REPLACEMENT CHARACTER as the
fallback representation.
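As a rough illustration of how the wording above could be read, here is a
minimal Python sketch. The names render_char, has_glyph and
platform_fallback are hypothetical and do not correspond to any real text
engine API; has_glyph stands in for a font coverage lookup, and
platform_fallback stands in for an OS convention such as a last resort font.

```python
from typing import Callable, Optional

REPLACEMENT_CHARACTER = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER


def render_char(
    ch: str,
    has_glyph: Callable[[str], bool],
    platform_fallback: Optional[Callable[[str], str]] = None,
) -> str:
    """Pick what to display for a single character (illustrative only).

    has_glyph and platform_fallback are assumed callbacks, not real APIs.
    """
    if has_glyph(ch):
        # A glyph is available, so the character is shown as-is.
        return ch
    if platform_fallback is not None:
        # The platform has its own missing-glyph convention: follow it.
        return platform_fallback(ch)
    # No platform convention: use U+FFFD, never U+003F QUESTION MARK.
    return REPLACEMENT_CHARACTER


if __name__ == "__main__":
    covered = set("abc?")
    print(render_char("a", covered.__contains__))
    print(render_char("\u4e2d", covered.__contains__) == REPLACEMENT_CHARACTER)
```

Note that a real question mark in the document still renders as a question
mark; only the missing-glyph path avoids it.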
>> Also, the set of characters specified in the current HTML DTDs is
>> not really sufficient to display many important characters, [...]
>
> HTML4 references ISO10646 which means it has every UNICODE character.
> Ditto XML. Do you want HTML to have actual _named entities_ for all
> 16000+ characters? That simply doesn't scale.
I'm inclined to consider named character entities harmful, because they
move an input problem to the user agent that is displaying the document
and increase parsing complexity by requiring the XML parser to process
the external DTD subset (in the usual case) even when the document
could otherwise be treated as a standalone document. I think dealing
with the issue at the input method level on the author's system makes
more sense.
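To illustrate the parsing point, here is a small Python sketch using the
standard library's xml.etree.ElementTree, which does not read an external
DTD subset. The helper name parse_standalone is made up for this example.
A named entity such as &eacute; cannot be resolved without the DTD, while
a numeric character reference or the literal character needs nothing
beyond the document itself.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_standalone(doc: str) -> Optional[str]:
    """Return the root element's text, or None if parsing fails.

    Hypothetical helper: parses the document with no DTD available.
    """
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        # For example, an entity that is only declared in an external DTD.
        return None


named = "<p>caf&eacute;</p>"   # needs the DTD to define &eacute;
numeric = "<p>caf&#233;</p>"   # numeric character reference
literal = "<p>caf\u00e9</p>"   # character entered directly by the author

if __name__ == "__main__":
    for doc in (named, numeric, literal):
        print(repr(parse_standalone(doc)))
```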
--
Henri Sivonen
hsivonen@niksula.hut.fi
http://www.hut.fi/u/hsivonen/
Received on Sunday, 15 September 2002 07:55:58 UTC