Re: What about &nbsp;?

Arnoud (galactus@stack.urc.tue.nl)
Thu, 01 Aug 1996 20:37:11 +0200


From: galactus@stack.urc.tue.nl (Arnoud "Galactus" Engelfriet)
To: www-html@w3.org
Subject: Re: What about &nbsp;?
Date: Thu, 01 Aug 1996 20:37:11 +0200
Message-ID: <XlPAy4uYOxuc089yn@stack.urc.tue.nl>

In article <960801103128_100320.1303_JHF60-1@CompuServe.COM>,
Jonathan Rosenne <100320.1303@CompuServe.COM> wrote:
> Arnoud "Galactus" Engelfriet wrote:
> >Then how do we define &trade;? I don't see anything wrong with thinking
> >up new entities, as long as you unambiguously define which entity it
> >should be.
> 
> &trade; is the UCS-4 character &#8482; or a reasonable representation of
> it. &nbsp;

Ok, bad example. What I was trying to say is that the definition of the
entity is what matters. If the HTML specs were to say that &emspace;
would correspond to &#8402; that would be perfectly valid. It would just
be a bad name for this character.
IOW, if the specs say &nbsp; should (not) collapse, then that's what
should happen.

> >Yes, but is there a *reason* for &nbsp; not to get collapsed, like the
> >normal space? 
> 
> There are reasons, and this is why so many implementations - HTML and others - treat
> NBSP the way they do. The main reason is the logic that NBSP is treated just like 
> any other graphic character -- it does not make much sense to invest in special
> logic for a feature that was included for compatibility and the use of which is 
> discouraged. Another reason is that this is what many authors expect.

I have never understood why HTML 2 says that use of
the non-breaking space is discouraged. Perhaps it was a valid concern
at the time of writing, when few browsers supported it. But now almost
every browser does.

In any case, the logic of parsing &nbsp; in a collapsing way doesn't seem
too hard to me. Basically you regard the words before and after it as one
word, AFTER having collapsed multiple spaces between the words. If it
occurs at the beginning of a paragraph, just kill it.

I would say that cases like "&nbsp; &nbsp;" are implementation-dependent
and should be avoided.
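
To make the collapsing logic concrete, here is a minimal sketch in
Python. It is only an illustration, not what any browser does. The names
collapse() and unbreakable_units() are made up, the text is assumed to
have its entities already decoded (so &nbsp; is U+00A0), and the rule
for mixed runs like "&nbsp; &nbsp;" (turn the whole run into a single
non-breaking space) is just one possible choice for the
implementation-dependent case.

```python
import re

NBSP = "\u00a0"  # assumed: &nbsp; has already been decoded to U+00A0

# A "gap" is any run of ordinary whitespace and/or non-breaking spaces.
_GAP = re.compile(r"[ \t\r\n\u00a0]+")


def collapse(paragraph: str) -> str:
    """Collapse every gap to a single space character.

    A gap that contains at least one NBSP becomes one NBSP (an assumed
    rule for the mixed case); any other gap becomes one ordinary space.
    A gap at the beginning of the paragraph is dropped entirely.
    """
    def replace_gap(match: re.Match) -> str:
        return NBSP if NBSP in match.group(0) else " "

    collapsed = _GAP.sub(replace_gap, paragraph)
    # Kill a leading gap of either kind; drop a trailing ordinary space.
    return collapsed.lstrip(" " + NBSP).rstrip(" ")


def unbreakable_units(paragraph: str) -> list[str]:
    """Split into the units a line breaker may separate.

    Words joined by an NBSP stay together as one "word".
    """
    text = collapse(paragraph)
    return text.split(" ") if text else []
```

With this, "\u00a0Mr.\u00a0\u00a0Smith   went  home" collapses to
"Mr.\u00a0Smith went home", and the line breaker sees three units, the
first of which is "Mr.\u00a0Smith".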

> >In my opinion, (as well as the HTML 3 draft's), the
> >non-breaking space is simply a space where the line should not be
> >broken. If it occurs at a location away from the line end, it should
> >be treated as a normal space, including the collapsing.
> 
> This is a valid view when one ignores the installed base and common practice.

The common practice exists only because that's what the current browsers
(incorrectly, IMO) do with it. I don't see why that would make what
the browsers do right. Try feeding the following comment to any
current browser:

<!-- I'm a comment -- --> foozlebib -->

The "installed base" and "common practice" demand that the '> foozlebib'
text should be displayed, even though this is technically part of the
comment.
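
The difference can be sketched in a few lines of Python. This is a
simplified illustration under stated assumptions, not a real SGML
parser: naive_comment_end() models the "stop at the first -->" habit,
while sgml_comment_end() follows the SGML rule that a comment
declaration is "<!", then zero or more "--"-delimited comments separated
by optional whitespace, then ">". Both function names are made up for
this example.

```python
SAMPLE = "<!-- I'm a comment -- --> foozlebib -->"


def naive_comment_end(text: str, start: int = 0) -> int:
    """Return the index just past the first '-->' or -1 if there is none."""
    if not text.startswith("<!--", start):
        raise ValueError("not a comment")
    end = text.find("-->", start + 4)
    return -1 if end == -1 else end + 3


def sgml_comment_end(text: str, start: int = 0) -> int:
    """Return the index just past the '>' closing the declaration, or -1.

    Each '--' opens a comment that runs to the next '--'; only a '>'
    found outside such a comment ends the declaration.
    """
    if not text.startswith("<!", start):
        raise ValueError("not a declaration")
    pos = start + 2
    while pos < len(text):
        if text.startswith("--", pos):
            close = text.find("--", pos + 2)  # end of this comment
            if close == -1:
                return -1  # unterminated comment
            pos = close + 2
        elif text[pos] == ">":
            return pos + 1
        elif text[pos] in " \t\r\n":
            pos += 1  # whitespace between comments is allowed
        else:
            return -1  # anything else is not a valid comment declaration
    return -1


# What each approach would leave behind as displayable text:
naive_rest = SAMPLE[naive_comment_end(SAMPLE):]   # " foozlebib -->"
sgml_rest = SAMPLE[sgml_comment_end(SAMPLE):]     # ""
```

The naive scan leaves " foozlebib -->" to be rendered, while the
SGML-style scan consumes the whole line as a single comment declaration.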

Galactus

-- 
To find out more about PGP, send mail with HELP PGP in the SUBJECT line to me.
E-mail: galactus@stack.urc.tue.nl - Please PGP encrypt your mail if you can.
Finger galactus@turtle.stack.urc.tue.nl for public key (key ID 0x416A1A35).
Anonymity and privacy site: <http://www.stack.urc.tue.nl/~galactus/remailers/>