- From: Jon Hanna <jon@spin.ie>
- Date: Mon, 15 Jul 2002 14:21:22 +0100
- To: <w3c-wai-ig@w3.org>
> > markup elements and the formal symbol for a TM sign is,
> >
> > &#153; (™)
>
> No, 153 is an unused code point in HTML. What is defined in HTML 4 is
> as follows:

I think the source of confusion is probably the practice, common to many systems, of using the unused character points in the lower byte-ranges of UCS for various characters. This is a valid way of constructing a character set, as long as one doesn't claim to be using Latin-1 etc.

It is also reasonable, when encountering an invalid character, to output one based on such a practice, as part of the general principle that browsers should attempt to act as the author most likely intended when encountering an obvious error. IE on Windows will interpret a character claiming to be UCS point 153 as a trademark sign.

Of course, misinterpreting character encodings can lead to subtle security problems, as well as preventing authors from realising they are in error (quite a few bugs in IIS of late were due to differing interpretations of UTF-8).
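
To make the fallback concrete, here is a rough sketch in Python of the kind of recovery described above. It assumes the practice in question is the Windows-1252 code page, where byte 153 is the trademark sign; the helper name resolve_numeric_reference is purely illustrative and is not taken from any browser's code.

```python
import unicodedata


def resolve_numeric_reference(code_point: int) -> str:
    """Hypothetical helper: turn a numeric character reference into a character.

    References in the range 128-159 do not name printable characters in UCS,
    so they are treated here as an author error and reinterpreted as
    Windows-1252 bytes (an assumption about what the author meant).
    """
    if 0x80 <= code_point <= 0x9F:
        try:
            return bytes([code_point]).decode("cp1252")
        except UnicodeDecodeError:
            # A few bytes (e.g. 0x81) are unassigned in Python's cp1252 codec.
            return "\ufffd"
    return chr(code_point)


trademark = resolve_numeric_reference(153)
print(f"U+{ord(trademark):04X}", unicodedata.name(trademark))
```

With this fallback, a reference to 153 comes out as U+2122, which is the code point an author should have used in the first place (8482 in decimal).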
Received on Monday, 15 July 2002 09:21:11 UTC