Re: special character "SM"

From: Ian B. Jacobs <ij@w3.org>
Date: Mon, 08 Oct 2001 13:35:35 -0400
Message-ID: <3BC1E3E7.7A901630@w3.org>
To: "Deborah J. Dorsey" <ddorsey@progress.com>
CC: site-comments@w3.org, mimasa@w3.org
"Deborah J. Dorsey" wrote:
> 
> Hello,
> 
> I cannot locate the ASCII code for the Service Mark "SM" (similar to
> TradeMark, "TM") anywhere on your site.

Hi there,

I would like to reformulate your question (please let me
know whether this is correct): How do you represent a
service mark character in HTML and XHTML?

Short answer: There is a standard way that does not seem to
be supported by many browsers (that I use). There may be a non-standard
way, but I don't know it.

Longer answer:

Both of these languages use Unicode [0] as the document character
set, since the set of ASCII characters is not rich enough
to represent the world's texts (check out section 5 of
the HTML 4.01 specification for more information [1]).

However:

 - Neither ASCII nor ISO 8859-1 includes "tm" or "sm", but 
   ISO 8859-1 includes "(R)" and "(C)".
 - Unicode includes all of them. To find them, I went to the 
   Unicode code charts [2], and in particular, the chart on
   letterlike symbols [3] (a PDF file).

  The chart tells me that "tm" has character code 2122 and "sm"
  has character code 2120. These are hex numbers.
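
As a rough cross-check of the two bullets above, here is a small sketch.
It assumes Python 3 and its standard unicodedata module, neither of which
is part of the original question, and the helper name describe() is made
up for illustration. It reports each character's Unicode name and whether
the character can be encoded in ASCII or ISO 8859-1:

```python
import unicodedata


def describe(code_point):
    """Return (Unicode name, fits in ASCII, fits in ISO 8859-1)."""
    ch = chr(code_point)

    def fits(encoding):
        # A character "fits" if the codec can encode it without error.
        try:
            ch.encode(encoding)
            return True
        except UnicodeEncodeError:
            return False

    return unicodedata.name(ch), fits("ascii"), fits("iso-8859-1")


# Copyright, registered, service mark, trade mark (hex code points).
for cp in (0x00A9, 0x00AE, 0x2120, 0x2122):
    print(f"U+{cp:04X}", describe(cp))
```

Under these assumptions, only the copyright and registered signs should
report True for ISO 8859-1, and none of the four for ASCII.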

Chapter 5 [1] of the HTML 4.01 spec explains how to represent
Unicode characters when you know their hex character codes. 
For instance, &#x2120; represents the "sm" (letterlike) character.

So in an HTML or XHTML document, you can write:

 <p>This is my service mark<sup>&#x2120;</sup>.</p>

However, I doubt that many browsers will actually render the
desired "sm" character. 
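
The numeric character reference itself can be checked without a browser.
The sketch below assumes Python 3 and its standard html module (my choice
of tool, not something the HTML spec mentions). It also shows why the "x"
matters: without it the digits are read as a decimal number, which names
a different character.

```python
import html

# Hexadecimal form: the "x" marks the digits as hex, so this is U+2120.
hex_ref = html.unescape("&#x2120;")

# Decimal form of the same character: 0x2120 == 8480.
dec_ref = html.unescape("&#8480;")

# Without the "x", 2120 is read as decimal, i.e. U+0848, not U+2120.
no_x = html.unescape("&#2120;")

print(f"U+{ord(hex_ref):04X}", f"U+{ord(dec_ref):04X}", f"U+{ord(no_x):04X}")
```

Whether the decoded character is then displayed properly still depends on
the browser and the fonts available to it.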

Chapter 24 [4] of HTML 4.01 includes some abbreviations for
frequently used characters (called "character entity
references"). Thus, &reg; is recognized as the registered
trade mark sign (whose decimal representation is &#174;, or
&#xAE; in hex). I don't see any character entity reference
for "service mark" in HTML 4.01.
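
One way to double-check the entity list is sketched below. It assumes
Python 3, whose standard html.entities module provides a name2codepoint
table of the HTML 4 entity names; the helper name entity_names_for() is
hypothetical.

```python
from html.entities import name2codepoint


def entity_names_for(code_point):
    """List the HTML 4 entity names that map to the given code point."""
    return sorted(
        name for name, cp in name2codepoint.items() if cp == code_point
    )


print(entity_names_for(0x00AE))  # registered sign
print(entity_names_for(0x2122))  # trade mark sign
print(entity_names_for(0x2120))  # service mark
```

An empty list for 0x2120 would agree with the reading above that HTML 4.01
defines no named entity for the service mark, so the numeric reference is
the only standard option.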

Hence, HTML 4.01 allows you to represent the Unicode character
(for "sm"), but my various Linux browsers don't render it correctly.
Actually, the text browser Lynx does: it's represented as "(SM)".

 - Ian

[0] http://www.unicode.org/
[1] http://www.w3.org/TR/html401/charset
[2] http://www.unicode.org/charts/
[3] http://www.unicode.org/charts/PDF/U2100.pdf
[4] http://www.w3.org/TR/html401/sgml/entities.html
 
> If none exists, who do I contact about requesting this be added?  It's
> not efficiently added to HTML pages now since it needs to remain in
> cap's and be superscripted.
> 
> Thank you,
> Deb Dorsey

-- 
Ian Jacobs (ij@w3.org)   http://www.w3.org/People/Jacobs
Tel:                     +1 718 260-9447
Received on Monday, 8 October 2001 13:37:20 GMT