
RE: Off Topic -- Hard hyphen?

From: Martin J. Duerst <duerst@w3.org>
Date: Wed, 15 Dec 1999 14:17:33 +0900
Message-Id: <199912150545.OAA25135@sh.w3.mag.keio.ac.jp>
To: "McDonald, Ira" <imcdonald@sharplabs.com>
Cc: "'Al Gilman'" <asgilman@iamdigex.net>, "'Web Accessibility Initiative'"<w3c-wai-ig@w3.org>, "McDonald, Ira"<imcdonald@sharplabs.com>
Of course the non-breaking hyphen is the correct thing to use.
For quotes, never use &#147; and &#148;; this is not correct
HTML at all. The numbers in &#...; constructs refer to Unicode,
and Unicode has only control characters, not quotation marks,
at these positions.
Please see http://www.w3.org/TR/html40/sgml/entities.html#h-24.4.1
and use any of:

<!ENTITY lsquo   CDATA "&#8216;" -- left single quotation mark,
                                                U+2018 ISOnum -->
<!ENTITY rsquo   CDATA "&#8217;" -- right single quotation mark,
                                                U+2019 ISOnum -->
<!ENTITY sbquo   CDATA "&#8218;" -- single low-9 quotation mark, U+201A NEW -->
<!ENTITY ldquo   CDATA "&#8220;" -- left double quotation mark,
                                                U+201C ISOnum -->
<!ENTITY rdquo   CDATA "&#8221;" -- right double quotation mark,
                                                U+201D ISOnum -->
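
To illustrate the point about numeric references, here is a small
Python sketch. Python is only my choice for illustration, and the
helper name numeric_reference is hypothetical. It assumes that the
&#147; and &#148; habit comes from the Windows-1252 code page, where
bytes 147 and 148 are the curly double quotes, and it compares what
those numbers mean in Unicode with the references that should be
used instead.

```python
import unicodedata

# Read as a Unicode code point, 147 (0x93) is a C1 control character.
c1 = chr(147)
category = unicodedata.category(c1)  # "Cc" means "control"

# Read as Windows-1252 bytes (assumed origin of the habit), 147 and 148
# are the left and right double quotation marks.
from_cp1252 = bytes([147, 148]).decode("cp1252")


def numeric_reference(ch: str) -> str:
    """Hypothetical helper: decimal numeric character reference for ch."""
    return f"&#{ord(ch)};"


# References that actually point at the intended Unicode characters.
refs = [numeric_reference(ch) for ch in from_cp1252]

# The same helper applied to U+2011 NON-BREAKING HYPHEN.
nb_hyphen = numeric_reference("\u2011")

print(category, refs, nb_hyphen)
```

So the Windows-1252 byte values have to be translated to their
Unicode code points before they are written as &#...; references.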

As for 'implied', the comment in
http://www.w3.org/TR/html40/sgml/entities.html#h-24.3.1
is just a note pointing out the assumption of the authors
of the HTML spec that this was the same as an entity in
some ISO standard with a different name, and that this
assumption was supported by some other ISO document.
The characters are not missing from Unicode/ISO 10646,
and they are even available as *named* character entities
in HTML 4.0.
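
As a quick check, the following Python sketch looks up the named
entities in the standard library's table of HTML 4 entity names.
It assumes that the entities in question are lArr ("is implied by")
and rArr ("implies"); that is my reading of the symbol entity list,
not something spelled out in this thread.

```python
import unicodedata
from html.entities import name2codepoint

# Assumed entity names for "is implied by" and "implies".
arrows = {name: name2codepoint[name] for name in ("lArr", "rArr")}

for name, cp in arrows.items():
    # Show the entity, its code point, and its Unicode character name.
    print(f"&{name}; -> U+{cp:04X} {unicodedata.name(chr(cp))}")
```

Both names resolve to code points in the Unicode Arrows block, so
there is nothing here that would need a new character proposal.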

In case somebody really has a new character to propose
(which I don't think will happen very frequently),
please have a look at
http://www.unicode.org/pending/proposals.html.


Regards,   Martin.

At 11:33 1999/12/13 -0800, McDonald, Ira wrote:
> Hi Al and Bruce,
> 
> In Unicode 2.0 (1996), in Chapter 7 'Code Charts', on page 7-155
> 'General punctuation', the following character is defined:
> 
> U+2011  NON-BREAKING HYPHEN
>         (cross reference) U+002D - HYPHEN-MINUS
> 
> I believe this is what you're looking for.  I've copied Martin
> Duerst (W3C Internationalization leader) to verify.
> 
> Cheers,
> - Ira McDonald (outside consultant at Sharp Labs America)
>   High North Inc
> 
> -----Original Message-----
> From: Al Gilman [mailto:asgilman@iamdigex.net]
> Sent: Friday, December 10, 1999 6:37 PM
> To: 'Web Accessibility Initiative'
> Subject: Re: Off Topic -- Hard hyphen?
> 
> 
> 1. There is the white-space: nowrap formatting property that in theory should
> accomplish what you are after if you wrap the whole <span
> class='my-nowrap'>hyphenated-expression</span> and then style this class or
> ID with the property white-space: nowrap.
> 
> 2. Missing characters should go to the Unicode Consortium.  It would be
> good to get in touch with the internationalization, XML InfoSet, and/or
> math groups depending on just what the character is, to find out if there
> are others in W3C that care about this character one way or another.
> 
> Al
> 
> At 05:43 PM 12/10/99 -0500, Bruce Bailey wrote:
> >Is there a character for a "non-breaking hyphen"?  I want a dash that is 
> >treated like any other non-space alphanumeric character:  i.e., one that, 
> >if near the end of a line, causes the line to wrap at the space before the 
> >text that precedes the hyphen rather than just after the hyphen.  An 
> >equivalent Navigator-ism would be <nobr>-</nobr> (not valid html).  &minus;
> >is logical, but is kind of an abuse (and not compatible with the 3.x
> >browsers).  I know about &shy; why not a &hhy; ?  A proper "en dash"
> >(&ndash;  or &#8211; ) is a little too large and is not compatible with 3.x
> >browsers.  I even went so far as to try &#150; -- it's also too big (but it
> >IS compatible with the 3.x browsers) but wraps the same as a regular
> >dash/hyphen.  (Yes, I know this "illegal" character gets the hackles up of
> >the Unix crowd -- but I still blame _them_ for leaving fundamental
> >typographical characters out of the 3.2 character set!)
> >
> >I can't find anything that works.  I am tempted to create an IMG graphic,
> >but I am sure that it will mess up my line height, this technique is not
> >scalable, and I just can't believe I am the only person who needs this.
> >
> >On a related, but perhaps equally off topic question, is there a
> >straightforward HTTP-EQUIV="Content-Type" statement that would make my use
> >of "windows" characters between &#127; and &#159; legal?  There was a whole
> >discussion of this earlier, and people generously sent me URLs to academic
> >discussions of character sets.  I learned a great deal, but there was a
> >lot I could not follow.  In particular, I could not discern a truly
> >cross-platform backwards-compatible way to generate typographical quotation
> >marks.  I am still using &#147; and &#148; and plan to do so until
> >Navigator and IE support the <Q> ... </Q> construct.
> >
> >What is the correct channel to go through to suggest missing characters? 
> > The official W3C specs themselves point out that the basic mathematical 
> >symbols "implies" and "is implied by" (as well as the more obscure "not a 
> >super set of") are omitted.  URL:
> >http://www.w3.org/TR/html40/sgml/entities.html#h-24.3.1
> >
> >Thank you.
> >-- Bruce Bailey
> > 
> 
> 


#-#-#  Martin J. Dürst, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org
Received on Wednesday, 15 December 1999 00:45:29 GMT
