Re: Soft hyphen (Re: Cougar comments)

David Perrell (davidp@earthlink.net)
Mon, 12 May 1997 10:30:21 -0700


Message-Id: <199705121737.KAA20768@germany.it.earthlink.net>
From: "David Perrell" <davidp@earthlink.net>
To: "Jonathan Rosenne" <rosenne@NetVision.net.il>
Cc: "Martin J. Duerst" <mduerst@ifi.unizh.ch>,
Subject: Re: Soft hyphen (Re: Cougar comments)
Date: Mon, 12 May 1997 10:30:21 -0700

Jonathan Rosenne wrote:
> The distinction is already blurred. What about SPACE? I remember the ISO
> committees used to argue whether SPACE was a graphic character or a control
> character. In HTML it is not a simple graphic character either. SHY and
> NBSP also are "in-between". The fact that nearly nobody implements SHY and
> NBSP correctly according to 8859 is beside the point.

IMO, the treatment of spaces and line breaks in HTML documents by user
agents is more analogous to the use of soft hyphens I described. Text
in an HTML document is in fact pre-formatted, and must be re-formatted
for display or print.
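
To illustrate what I mean by re-formatting, here is a minimal Python
sketch. It assumes a user agent that treats any run of spaces, tabs and
newlines in the source as a single space and then wraps to the display
width. The name reflow is made up for this example, and real user
agents do considerably more.

```python
import textwrap


def reflow(source: str, width: int) -> str:
    """Collapse whitespace runs in the source, then wrap to the display width.

    Hypothetical helper for illustration only, not any user agent's code.
    """
    # Line breaks and extra spaces in the marked-up file carry no meaning:
    # split on any whitespace run and rejoin with single spaces.
    collapsed = " ".join(source.split())
    # Lay the text out again for whatever width the display offers.
    return textwrap.fill(collapsed, width=width)
```

The line breaks in the source serve whoever edits the file; the line
breaks in the output depend only on the width passed in.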

It is not so much the possible use of #173 as a 'potential hyphen'
marker that I object to as the formalization of that use in a
general character set description. A SPACE receives special treatment
in HTML to accommodate a need for structure in the marked-up file, and
a SPACE character's metrics may be overridden to produce justified text
for display. But SPACE does have default metrics, and it needs no
special display caveats in the description of the character set. If
#173 and &shy; are to be treated as 'potential hyphen' markers in HTML
documents, the description of that use should be specific to the
processing of HTML documents, not documents in general.
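
For concreteness, the 'potential hyphen' treatment could be sketched in
Python as below. This is only an assumed behaviour for an HTML
processor, not something taken from any specification, and the name
break_at_soft_hyphen is hypothetical.

```python
SHY = "\u00ad"  # character 173, the soft hyphen


def break_at_soft_hyphen(word, room):
    """Return (head, tail) for a word that may contain soft hyphens.

    If the word fits in `room` columns, the soft hyphens vanish and the
    tail is empty. Otherwise the word is split at the last soft hyphen
    that lets the head plus a visible hyphen fit. Illustrative only.
    """
    parts = word.split(SHY)
    visible = "".join(parts)
    if len(visible) <= room:
        # No break needed: the markers are not displayed at all.
        return visible, ""
    # Try the marked break points from right to left.
    for i in range(len(parts) - 1, 0, -1):
        head = "".join(parts[:i]) + "-"
        if len(head) <= room:
            # Keep the remaining markers so the tail can be broken again.
            return head, SHY.join(parts[i:])
    # Nothing fits on this line: push the whole word to the next line.
    return "", word
```

A marker shows up as a hyphen only where a break is actually taken;
everywhere else it contributes nothing to the displayed text.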

> Anyhow, I think SHY and NBSP represent obsolete solutions to the problems
> they intended to address, and today markup based solutions seem to be more
> appropriate.
> 
> For example, I don't think it is useful or friendly to go over a document
> and insert SHY in all occurrences of a certain word, should I wish to be
> sure your browser hyphenates it the way I want. It would be nicer if I
> could declare it just once. Also, I need a way of saying "do not hyphenate
> this word", which SHY cannot do.

The reason for special treatment of newlines and spaces in HTML
documents is to maintain legibility and editability. Pre-hyphenated
HTML would be a mess. Pre-hyphenated documents might as well have
binary markup to save on bandwidth.

David Perrell