[XHTML2] Unicode line and paragraph separators

From: Ernest Cline <ernestcline@mindspring.com>
Date: Thu, 03 Apr 2003 14:22:37 -0500
To: www-html@w3.org
CC: www-html-editor@w3.org, www-style@w3.org
Message-ID: <3E8C43AD.23945.10681ED@localhost>

I can see both pluses and minuses to this but how about using the 
Unicode characters U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR 
either instead of or in addition to the <l> and <p> elements?

What are the pluses?  First of all, such usage could be more compact 
than <p></p> or <l></l> where no attributes are attached, since they 
could be single Unicode characters or at worst the decimal character 
references &#8232; and &#8233; respectively. Also both <p> and <l> have to go 
through convolutions in their formal grammar not required of other text 
elements because of the requirement that they not include instances of 
themselves. Replacing <p> and <l> with the separator characters would 
allow for the grammar to be considerably simplified.
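
As a rough illustration of the size comparison, here is a minimal Python 
sketch. The helper names decimal_reference and utf8_size are made up for 
this example and are not part of any specification; UTF-8 is assumed as 
the document encoding.

```python
LINE_SEPARATOR = "\u2028"       # U+2028 LINE SEPARATOR
PARAGRAPH_SEPARATOR = "\u2029"  # U+2029 PARAGRAPH SEPARATOR


def decimal_reference(ch):
    """Return the decimal numeric character reference for one character."""
    return "&#{};".format(ord(ch))


def utf8_size(text):
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    # Compare the raw character, its decimal reference, and an empty <p> pair.
    for form in (PARAGRAPH_SEPARATOR,
                 decimal_reference(PARAGRAPH_SEPARATOR),
                 "<p></p>"):
        print(repr(form), utf8_size(form))
```

Under that assumption the raw character takes 3 bytes, while the decimal 
reference takes 7 bytes, the same as <p></p>. A separator is also needed 
only between paragraphs, so n paragraphs need n-1 separators rather than 
n tag pairs.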

What are the minuses? In those cases where it is desired to have 
attributes attached to a single paragraph or line, or to be able to 
refer to them as a child element for use with CSS or DOM, using <div> 
or <span> would require a less compact representation than <p> and 
<l>. Also, to adequately represent paragraph formatting would require 
changes to CSS. At a minimum, some way of managing spacing between 
paragraphs would be needed and a clarification to text-indent 
that it should apply to the first line of each paragraph in a block of 
text AND to each line that follows a paragraph separator character. 
(Such changes are why I sent a CC to the www-style list. Since CSS is 
in theory not supposed to be only for HTML, such additions would 
probably be a good idea to support styling documents that use the 
paragraph separator to mark paragraph boundaries, even if the decision 
is made not to make use of them in XHTML.)
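
To make the intended first-line indent behaviour concrete, here is a 
minimal Python sketch of a plain-text renderer. The function name 
render_with_indent and the four-space default indent are assumptions 
made for this example only; it does not describe how any CSS 
implementation actually works.

```python
LINE_SEPARATOR = "\u2028"       # U+2028 LINE SEPARATOR
PARAGRAPH_SEPARATOR = "\u2029"  # U+2029 PARAGRAPH SEPARATOR


def render_with_indent(text, indent="    "):
    """Render a block of text as plain lines.

    The first line of the block, and each line that follows a paragraph
    separator, is indented. Lines that follow a line separator are not.
    """
    rendered = []
    for paragraph in text.split(PARAGRAPH_SEPARATOR):
        for index, line in enumerate(paragraph.split(LINE_SEPARATOR)):
            prefix = indent if index == 0 else ""
            rendered.append(prefix + line)
    return "\n".join(rendered)


if __name__ == "__main__":
    sample = "First line\u2028second line\u2029Next paragraph"
    print(render_with_indent(sample))
```

Here "First line" and "Next paragraph" are indented, while "second 
line", which follows only a line separator, is not.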

If these two characters are adopted (either as supplements to or as 
replacements for <l> and <p>), I suggest that entity names be set aside 
for them in XHTML 2, perhaps &ls; and &ps;, if they are not used for 
some other purpose in some other standard. (As far as I can tell, those 
two names are not used in either XML or HTML, but they might be used in 
some SGML or XML derived document type that I am not aware of and which 
has reasonably wide usage.)
Received on Thursday, 3 April 2003 14:22:21 GMT