
[HTML 4.01] Clarification about hyphen characters (9.3.3)

From: Vincent Lefevre <vincent@vinc17.org>
Date: Wed, 13 Aug 2003 10:32:31 +0200
To: www-html-editor@w3.org
Message-ID: <20030813083231.GA32700@ay.iaks.uka.de>


The HTML 4.01 specification says:

In HTML, there are two types of hyphens: the plain hyphen and the soft
hyphen. [...]

In HTML, the plain hyphen is represented by the "-" character (&#45;
or &#x2D;). The soft hyphen is represented by the character entity
reference &shy; (&#173; or &#xAD;)
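
As a quick check of the references quoted above, here is a minimal Python
sketch. It assumes only the standard `html` module; the variable names are
illustrative and not part of any specification. It decodes each form and
prints the code points they resolve to:

```python
import html

# Character references as quoted from Section 9.3.3 of HTML 4.01.
plain_refs = ["&#45;", "&#x2D;"]
soft_refs = ["&shy;", "&#173;", "&#xAD;"]

# html.unescape resolves both numeric and named character references.
plain = {html.unescape(ref) for ref in plain_refs}
soft = {html.unescape(ref) for ref in soft_refs}

# Each set should collapse to a single character.
for ch in sorted(plain | soft):
    print(f"U+{ord(ch):04X}")
```

All forms in each group decode to one and the same character, so the choice
between them is purely a matter of notation.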

But ISO10646/Unicode (the character set for HTML 4.01) contains other
hyphen characters:

2010;HYPHEN;Pd;0;ON;;;;;N;;;;;
2011;NON-BREAKING HYPHEN;Pd;0;ON;<noBreak> 2010;;;;N;;;;;

Moreover, these are probably better choices than the overloaded ASCII
"-", in particular if the user wants a hyphen character after which a
line may be broken (for compound words).
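
The properties of the characters under discussion can be inspected with
Python's standard `unicodedata` module. This is only a sketch for comparing
them; the list of code points is my own selection (the two characters named
in HTML 4.01 plus U+2010 and U+2011), and the values reported depend on the
Unicode version bundled with the interpreter:

```python
import unicodedata

# U+002D and U+00AD are the characters named in HTML 4.01;
# U+2010 and U+2011 are the additional Unicode hyphens.
code_points = [0x002D, 0x00AD, 0x2010, 0x2011]

rows = []
for cp in code_points:
    ch = chr(cp)
    rows.append((
        f"U+{cp:04X}",
        unicodedata.name(ch),           # character name
        unicodedata.category(ch),       # general category, e.g. Pd
        unicodedata.decomposition(ch),  # compatibility mapping, if any
    ))

for code, name, category, decomposition in rows:
    print(f"{code}  {name:<20} {category}  {decomposition}")
```

The decomposition column shows that U+2011 is defined as a no-break variant
of U+2010, while U+002D carries the name HYPHEN-MINUS, which reflects its
overloaded use.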

I think that Section 9.3.3 should clarify the use of hyphen characters:
either mention U+2010 and U+2011 (or possibly say that the current list
is not exhaustive), or explicitly forbid hyphen characters other than
U+002D and U+00AD.


Vincent Lefèvre.
Received on Thursday, 14 August 2003 02:33:26 UTC
