
Re: Help on "whitespace"

From: P. T. Rourke <ptrourke@mediaone.net>
Date: Thu, 20 Jan 2000 10:54:12 -0500
Message-ID: <000a01bf635e$9d900400$c38ee9d8@psicorp.com>
To: "Martin Brunecky" <mbrunecky@onerealm.com>, <html-tidy@w3.org>
According to the List-Owner's own specification of HTML 3.2
( http://www.w3.org/TR/REC-html32.html ; sometimes a little easier to read
than the current 4.01 recommendation, especially for students):

"Except within literal text (e.g. the PRE element), HTML treats contiguous
sequences of white space characters as being equivalent to a single space
character (ASCII decimal 32). These rules allow authors considerable
flexibility when editing the marked-up text directly."

Another way to put it: any sequence of one or more blank spaces together with
zero or more hard returns is treated as a single white space unless it occurs
within an element classed as literal text (like PRE). The &nbsp; entity (and
its numeric equivalent, &#160;), though displayed as a space, is treated for
the purposes of text flow as a non-space character, so spaces before and
after an &nbsp; each count as a space in addition to the &nbsp; itself.
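
To make the rule concrete, here is a rough Python sketch of the collapsing step. It is only an illustration under my own assumptions, not how any browser or Tidy actually does it: the function name is made up, the set of white space characters (space, tab, form feed, carriage return, line feed) is my reading of the specs, and it ignores the PRE exception entirely. The one point it is meant to show is that the character an &nbsp; decodes to (U+00A0) is not part of the run that gets collapsed.

```python
import re

# Assumed set of HTML white space characters: space, tab, form feed,
# carriage return, line feed. U+00A0 (the decoded &nbsp; / &#160;) is
# deliberately left out, so it never merges with its neighbours.
# Note: Python's \s would also match U+00A0 on str input, so it is avoided here.
_WS_RUN = re.compile(r"[ \t\f\r\n]+")

NBSP = "\u00a0"  # the character &nbsp; stands for


def collapse_whitespace(text: str) -> str:
    """Replace each run of white space with a single space (no PRE handling)."""
    return _WS_RUN.sub(" ", text)
```

With this, "one", several spaces and hard returns, then "two" comes out as "one two", while a space, an NBSP, and another space stay as three separate characters.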

A run of zero blank spaces and one or more hard returns usually behaves like
a white space too, except between most element tags (e.g., between a </td>
and a <td>, or a </p> and a <p>), where it is ignored. A special situation:
when images are to sit side by side with no gap, most browsers require that
the close of the first image tag be immediately followed by the open of the
second image tag, like

<img src="one.png" /><img src="two.png" />

or so

<img src="one.png" /><img
src="two.png" />

not so

<img src="one.png" />
<img src="two.png" />

because some browsers will add a white space between the images. (The
quotation marks around the attribute value and the slash at the end of the
empty elements are XML-valid conventions that, as far as I know, all browsers
in use accept; see the XHTML 1.0 proposed rec.) I don't know whether that
behavior (a hard return between images acting as white space) is accounted
for in the Recommendation or not; it is analogous to the way a hard return
is treated between text strings; for instance,


is treated as two words, not one. Thus I imagine this behavior has something
to do with the way image elements are described in the DTD.
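
If the markup is produced or cleaned up by a script, the gap between adjacent images can be closed mechanically. The following is a rough Python sketch under my own assumptions: the function name is hypothetical, and the regular expression is a simplification that assumes no ">" appears inside an attribute value. It only removes white space that sits directly between two img tags and leaves everything else alone.

```python
import re

# An <img ...> tag followed by white space, where the very next thing is
# another <img tag. The lookahead keeps the second tag available so that a
# row of three or more images is handled in one pass.
# Simplification (assumption): no ">" inside attribute values.
_IMG_GAP = re.compile(r"(<img\b[^>]*>)\s+(?=<img\b)", re.IGNORECASE)


def close_image_gaps(html: str) -> str:
    """Drop white space that separates directly adjacent <img> tags."""
    return _IMG_GAP.sub(r"\1", html)
```

Applied to the "not so" form above, this returns the two tags on one line with nothing between them; images separated by actual text are not touched.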

P. T. Rourke

----- Original Message -----
From: Martin Brunecky
To: html-tidy@w3.org
Sent: Thursday, January 20, 2000 9:33 AM
Subject: Help on "whitespace"

can someone point me to a document (document portion) or an article
which clearly defines the rules for treatment of white-space and
new-line characters within the HTML source ? Including the
rules for using entities such as &nbsp;

All my HTML books seem to be really vague about this subject, and
I am probably stupid 'cause I don't see how to get this info from the
Received on Thursday, 20 January 2000 10:54:30 UTC
