- From: P. T. Rourke <ptrourke@mediaone.net>
- Date: Thu, 20 Jan 2000 10:54:12 -0500
- To: "Martin Brunecky" <mbrunecky@onerealm.com>, <html-tidy@w3.org>
According to the List-Owner's own specification of HTML 3.2 ( http://www.w3.org/TR/REC-html32.html ; sometimes a little easier to read than the current 4.01 recommendation, especially for students):

"Except within literal text (e.g. the PRE element), HTML treats contiguous sequences of white space characters as being equivalent to a single space character (ASCII decimal 32). These rules allow authors considerable flexibility when editing the marked-up text directly."

Another way to put it is that any sequence of 1 or more blank spaces and 0 or more hard returns is treated as a single white space unless it occurs within an element classed as literal text (like PRE). However, the &nbsp; entity (and its numeric equivalent, &#160;), though displayed as a space, is treated for the purposes of text flow as a non-space character, so spaces before and after an &nbsp; count as a space additional to the &nbsp;.

0 blank spaces and 1 or more hard returns usually behave like a single white space, except between most element tags (e.g., between a </td> and a <td>, or a </p> and a <p>), where they're ignored.

A special situation: when images are to occur side by side, most browsers require that the close of the first image tag be immediately followed by the open of the second image tag, like so

    <img src="one.png" /><img src="two.png" />

or so

    <img src="one.png" /><img
    src="two.png" />

not so

    <img src="one.png" />
    <img src="two.png" />

because some browsers will add a white space between the images. (The quotation marks around the attribute value and the slash at the end of the empty elements are XML-valid conventions that, as far as I know, all browsers in use accept; see the XHTML 1.0 proposed rec.)

I don't know whether that behavior (a hard return between images being treated as white space) is accounted for in the Recommendation or not. It is analogous to the way a hard return is treated between text strings; for instance,

    Mississi
    ppi

is treated as two words, not one.
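
To make the collapsing rule concrete, here is a minimal Python sketch. It is only an illustration, not anything taken from the Recommendation: the function name collapse_whitespace is hypothetical, the set of characters counted as white space (space, tab, carriage return, line feed, form feed) is my assumption, and it ignores both literal-text elements such as PRE and the special handling of hard returns between element tags. The no-break space is written as the character U+00A0, which is what the &nbsp; entity decodes to.

```python
import re

# Assumption: "white space" means space, tab, CR, LF and form feed.
# An explicit character class is used instead of \s, because \s in
# Python also matches the no-break space (U+00A0).
_WS_RUN = re.compile(r"[ \t\r\n\f]+")


def collapse_whitespace(text: str) -> str:
    """Replace each run of white space with a single space (ASCII 32)."""
    return _WS_RUN.sub(" ", text)


# A hard return inside a word splits it into two words.
print(collapse_whitespace("Mississi\nppi"))  # Mississi ppi

# Spaces and hard returns mixed together collapse to one space.
print(collapse_whitespace("one  \n\t two"))  # one two

# The no-break space is not collapsed, so the spaces on either side
# of it survive as one space each, in addition to the U+00A0 itself.
print(repr(collapse_whitespace("a  \u00a0  b")))
```

The last call is the case described above: the white space on each side of the no-break space is reduced to one space, but the three characters together still take up three spaces' worth of room.
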
Thus I imagine this behavior has something to do with the way image elements are described in the DTD.

P. T. Rourke

----- Original Message -----
From: Martin Brunecky
To: html-tidy@w3.org
Sent: Thursday, January 20, 2000 9:33 AM
Subject: Help on "whitespace"

Hi:

can someone point me to a document (document portion) or an article which clearly defines the rules for treatment of white-space and new-line characters within the HTML source? Including the rules for using entities such as &nbsp;.

All my HTML books seem to be really vague about this subject, and I am probably stupid 'cause I don't see how to get this info from the DTDs.

Received on Thursday, 20 January 2000 10:54:30 UTC