Re: [XHTML2] Unicode line and paragraph separators

From: Etan Wexler <ewexler@stickdog.com>
Date: Tue, 08 Apr 2003 01:31:53 -0700
To: Ernest Cline <ernestcline@mindspring.com>, www-html <www-html@w3.org>
Message-id: <BAB7BB99.D0A%ewexler@stickdog.com>

Ernest Cline wrote to <mailto:www-html@w3.org> on 6 April 2003 in "Re:
[XHTML2] Unicode line and paragraph separators"
(<mid:3E9019D5.12974.294181B@localhost>):

> Despite the wishes of coding purists, current (X)HTML is largely
> written on the basis of its presentation and not its structure.

I don't know if this holds for XHTML, but I grant your assumption without
hesitation as far as HTML goes. The point, though, is that it doesn't matter
much how people abuse markup constructs. What matters is how people can use
the constructs correctly. If we eliminate the 'p' element type, we may be
able to cut down on markup used for presentation. (I'm not sure that we
will.) What we will certainly accomplish is the elimination of any
possibility of semantic paragraph markup.

> Because there exists another way to code paragraph breaks in the
> default character set of XML, Unicode, that does not incur the overhead
> associated with using markup, namely the paragraph separator U+2029.

The markup associated with a 'p' element is hardly a burden. Its benefits in
terms of clear semantics, uniformity of syntax, addressability, and
attribution far outweigh the overhead.

If we were still dealing (rather, if we were now dealing) with SGML, I would
support the declaration of paragraph separator as a short reference string
for a 'p' start-tag. Alas, we lose some flexibility in the move to XML.
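
Since XML has no short references, the closest equivalent is a preprocessing
step outside the parser. Below is a minimal Python sketch of that idea,
assuming plain text input in which U+2029 delimits paragraphs. The function
name paragraphs_to_markup is hypothetical, and this only approximates what an
SGML short reference would do; it is not a parser feature.

```python
from xml.sax.saxutils import escape

# U+2029 PARAGRAPH SEPARATOR, the character under discussion.
PARAGRAPH_SEPARATOR = "\u2029"


def paragraphs_to_markup(text: str) -> str:
    """Wrap each U+2029-delimited run of text in a 'p' element.

    Hypothetical helper: empty runs are dropped, and the characters
    '&', '<' and '>' are escaped so the output is safe as XML content.
    """
    parts = [part.strip() for part in text.split(PARAGRAPH_SEPARATOR)]
    return "\n".join(f"<p>{escape(part)}</p>" for part in parts if part)


if __name__ == "__main__":
    sample = "First paragraph." + PARAGRAPH_SEPARATOR + "Second paragraph."
    print(paragraphs_to_markup(sample))
```

The output carries explicit 'p' elements, so each paragraph remains
addressable and can take attributes, which the bare separator cannot offer.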
 
> HTML is a presentational
> language with some structural elements added because of how they
> affected the presentation.

I would agree that the media type text/html basically means "presentational
junk markup". I cannot agree that the HTML 2.0 or HTML 4 document types are
basically presentational in nature. They are basically semantic and
structural, with a side order of presentational constructs as a sop to
prevailing trends.

> Sentences and words are structural elements, yet they do not have
> markup associated with them.

That is to the detriment of the Web.

> There are two reasons why that is the
> case.  First, unlike <p>, no markup was necessary in HTML to be able to
> get the desired presentational effects.

I agree.

> Second, authors are rarely
> concerned with providing non-presentational effects on sentences and
> words.  When they are, markup such as <span>, <em> or <dfn> can be
> used.

Those element types are no substitute for unambiguous sentence and word
markup.

-- 
Etan Wexler <mailto:ewexler@stickdog.com>.
<del>Bruce Springsteen</del>
<ins>Operation Makeout</ins>
Received on Tuesday, 8 April 2003 04:34:17 GMT