- From: Ernest Cline <ernestcline@mindspring.com>
- Date: Mon, 07 Apr 2003 00:24:40 -0400
- To: "William F. Hammond" <hammond@math.albany.edu>
- CC: www-html <www-html@w3.org>
On 6 Apr 2003 at 22:09, William wrote:

> I think the idea of using "&ps;" instead of <p> is bad.
>
> Aside from the arguments against already given, I wish to point
> out another: no name for &#x2029; can be set aside in the XHTML
> spec for use on the web.  (A name would be OK for inhouse use
> of XHTML.)
>
> As I understand things, the XML 1.0 (2nd Edition) spec,
> http://www.w3.org/TR/2000/REC-xml-20001006 ,
> at section 2.4 provides 5 named character entities: "amp", "lt", "gt",
> "apos", and "quot".
>
> In order for other character entities in an XML document to be
> referenced by name rather than by code point, the entity name must be
> defined in the document type definition of the corresponding XML
> application.
>
> Since section 4.4, XML Processor Treatment of Entities and References,
> states that a non-validating processor (such as a browser) is not
> required to retrieve an external entity, the use of a named character
> entity such as "&ps;" is ruled out for XHTML since XHTML browsers are
> not validating processors unless browsers are "required" to have
> "canned" knowledge of it.
>
> I suppose the specification of XHTML 2 could try to insist that
> browsers must know something like "&ps;", but I hasten to point out
> that there is already some contention among major browser sponsors on
> whether a browser must know any of the root namespace vocabulary of
> XHTML, i.e., whether XHTML among XML document types deserves special
> treatment by browsers.

Since XHTML1 has the Latin-1, Special and Symbol characters as sets
of defined entities that are part of the normative definition, I see
no problem in adding &ps; and &ls; to the set of special character
entities for XHTML2 if a decision were made to add them to XHTML2.
Can you name a single browser that supports XHTML1 as
application/xhtml+xml that does not support the three entity sets?

Any application that tries to support XHTML as anything more than
generic XML is going to have to understand predefined entities that
are part of the normative definition, either by being a validating
agent or by having an internal list of them.
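To make the point about named versus numeric references concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree, which wraps a non-validating parser. The helper name expand() is mine, and "ps" is only the entity name proposed in this thread, not something defined in any XHTML DTD. The sketch assumes a document with no external DTD, so it says nothing about how any particular browser behaves.

```python
import xml.etree.ElementTree as ET


def expand(doc):
    """Return the text of the root element, or None if the parser
    rejects the document (for example, an undefined entity name)."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


# One of the five names XML itself predefines: always understood.
print(repr(expand("<p>a &amp; b</p>")))

# A numeric character reference needs no declaration at all.
print(repr(expand("<p>a&#x2029;b</p>")))

# A name the parser has never been told about is rejected.
print(repr(expand("<p>a&ps;b</p>")))

# The same (hypothetical) name declared in the internal DTD subset.
declared = '<!DOCTYPE p [<!ENTITY ps "&#x2029;">]><p>a&ps;b</p>'
print(repr(expand(declared)))
```

The third case is William's objection in miniature; the fourth shows that the name works once the processor has a declaration for it, whether that comes from the document or from a built-in list.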
Any special treatment of U+2029 would only occur when a browser
renders a document, at which time it would need the same degree of
internal knowledge of XHTML2 to render the document whether
paragraph boundaries are indicated by markup or by formatting
characters. If the application isn't trying to render the document,
then the default XML behavior for U+2028 and U+2029 suggested by
Unicode Technical Report #20 (which, by the way, does not have the
force of a full-fledged standard either for Unicode or XML), namely
treating separator characters as whitespace, is, I believe, an
adequate interpretation for most non-rendering purposes.

One might argue that an agent that doesn't know that &ps; should be
replaced by U+2029 wouldn't know that &ps; is white space, but the
same problem applies to the existing entities &nbsp;, &ensp;, &emsp;,
&thinsp;, and &zwnj;. Unless you wish to argue that all entities
except for those defined in XML should be removed from XHTML2, I
cannot agree with the argument you have given. Obviously, if such a
decision were to be made, then there would be no reasonable
alternative except to retain <l> and <p> and forget about separators
entirely. (Not because &#x2029; is too long, but because it does not
make sense to put a non-named entity to such use.)
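As a rough illustration of the "treat separators as whitespace" fallback for non-rendering agents, here is a small Python sketch. The function names are mine and purely illustrative. It relies only on the fact that Python's str.split() with no arguments splits on Unicode whitespace, which includes U+2028 and U+2029. It is not an implementation of anything TR #20 mandates.

```python
LINE_SEPARATOR = "\u2028"
PARAGRAPH_SEPARATOR = "\u2029"


def as_whitespace(text):
    """Non-rendering view: every run of whitespace, including the two
    Unicode separators, collapses to a single space."""
    return " ".join(text.split())


def paragraphs(text):
    """Rendering-aware view: U+2029 marks a paragraph boundary and
    U+2028 marks a line break inside a paragraph."""
    return [p.split(LINE_SEPARATOR) for p in text.split(PARAGRAPH_SEPARATOR)]


sample = "first line\u2028second line\u2029next paragraph"
print(as_whitespace(sample))
print(paragraphs(sample))
```

An agent using only as_whitespace() loses the paragraph structure but still sees sensible word boundaries, which is all I am claiming is needed when nothing is being rendered.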
Received on Monday, 7 April 2003 00:24:25 UTC