
Re: [XHTML2] Unicode line and paragraph separators

From: Simon Jessey <simon@jessey.net>
Date: Sun, 6 Apr 2003 18:11:50 -0400
Message-ID: <000601c2fc89$85f091d0$6401a8c0@Simon2S0JP11>
To: "www-html" <www-html@w3.org>

----- Original Message -----
From: "Ernest Cline" <ernestcline@mindspring.com>
Subject: Re: [XHTML2] Unicode line and paragraph separators


> Oh? And how do most end users know that the content of a <p> is a
> paragraph? By the way that the user agent applies a style to all of
> the <p> elements in a document so that it looks like a paragraph.


But with XHTML 2.0, we are not actually concerned with what the end user
sees. We are concerned with how the document is structured.


> Because there exists another way to code paragraph breaks in the
> default character set of XML, Unicode, that does not incur the overhead
> associated with using markup, namely the paragraph separator U+2029.


Why do we need another way of marking up a paragraph when a perfectly good
one already exists? Furthermore, why should that new way differ from the
usual method of marking up an element - by wrapping it in start and end
tags?
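
To make the comparison concrete, here is a minimal Python sketch of how text
delimited by U+2029 maps onto <p> markup. It is only an illustration under
my own assumptions, not anything taken from the XHTML 2.0 draft: the function
name separators_to_p is hypothetical, and empty chunks are simply dropped.

```python
from html import escape

# U+2029 PARAGRAPH SEPARATOR, the character proposed in the quoted message.
PARAGRAPH_SEPARATOR = "\u2029"


def separators_to_p(text: str) -> str:
    """Wrap each U+2029-delimited chunk of text in <p> start and end tags.

    Hypothetical helper for illustration only. Chunks that are empty or
    whitespace-only are skipped, and markup characters are escaped.
    """
    chunks = [c.strip() for c in text.split(PARAGRAPH_SEPARATOR)]
    return "\n".join(f"<p>{escape(c)}</p>" for c in chunks if c)


if __name__ == "__main__":
    sample = "First paragraph.\u2029Second paragraph."
    print(separators_to_p(sample))
```

The conversion is mechanical in this direction, but the separator form has
nowhere to carry an id, a class, or any other attribute on a particular
paragraph, which is the part the start and end tags provide.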


> Every element that is kept should have to demonstrate why it should be
> kept.


I agree with that.


> Since the presentational aspects of <p> can be obtained thru the use of
> the paragraph separator and appropriate CSS, <p> should only be
> retained if the other uses, such as styling or scripting on a specific
> paragraph, are used often enough that it is worth retaining.


We have no idea what kind of user agent may be employed to 'read' our
documents. It may be a visual browser, it may be an audio browser, or it may
be some kind of search bot or similar tool. A paragraph is a BLOCK of
text that includes one or more sentences and encompasses a concept, or it
may be used to indicate different speakers. It is just as fundamental
as a sentence. In fact, a sentence on its own would normally be considered a
paragraph in any case. Therefore the paragraph should be considered a
structural construct. The default formatting given to it by a UA isn't
relevant to its existence.


> Sentences and words are structural elements, yet they do not have
> markup associated with them.  There are two reasons why that is the
> case.  First, unlike <p>, no markup was necessary in HTML to be able to
> get the desired presentational effects.  Second, authors are rarely
> concerned with providing non-presentational effects on sentences and
> words.


Actually, I think you have missed the most important reason. Sentence and
word structure within paragraphs can be radically different from language to
language, yet exist in almost all of them.

One thing occurs to me. If you are suggesting we ignore the structural
significance of paragraphs and treat them simply as separated chunks of
text, is that not reducing them to something similar to an unordered list?
Perhaps paragraphs should be marked up as lists instead.

Simon Jessey

w: http://jessey.net/blog/
e: simon@jessey.net
Received on Sunday, 6 April 2003 18:11:57 GMT
