Re: Adding new character entities (was Re: [XHTML2] Unicode line and paragraph separators)

Ernest Cline wrote to <mailto:www-html@w3.org> on 7 April 2003 in "Re:
Adding new character entities (was Re: [XHTML2] Unicode line and paragraph
separators)" (<mid:3E915D10.7424.5BE94B@localhost>):

> Any idea on when a Universal entity set for XML might surface and who
> might need to be prodded/offered help to get work on it further along?

I believe that it already exists, thanks to the SPREAD initiative. Although
written for SGML, the SPREAD entity declarations should work for XML. The
names are not mnemonic: &U2014; is as memorable and attractive as &#x2014;.
The advantage of entity references over character references is that the
entity can be redeclared to suit a limited environment:

<!ENTITY U2014 "---" >
<!-- Em dash is not available. -->
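To make the redeclaration point concrete, here is a minimal sketch using
Python's standard xml.etree.ElementTree. The helper name expand() and the
sample sentence are mine, and the "full" declaration is only my guess at the
shape such a declaration would take; I have not copied it from the SPREAD
files. The document text stays the same and only the declaration changes.

```python
import xml.etree.ElementTree as ET

def expand(declaration: str) -> str:
    """Parse a tiny document whose internal subset holds one declaration.

    Hypothetical helper for illustration; the document text never changes,
    only the declaration of U2014 does.
    """
    doc = f'<!DOCTYPE p [{declaration}]><p>wait&U2014;what</p>'
    return ET.fromstring(doc).text

# Assumed form of a declaration for an environment that has the em dash.
full = expand('<!ENTITY U2014 "&#x2014;">')

# Redeclaration for an environment where the em dash is not available.
limited = expand('<!ENTITY U2014 "---">')

print(repr(full))
print(repr(limited))
```

A bare &#x2014; in the text offers no such hook: the processor always resolves
it to U+2014, whatever the environment can display.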

> (By the way, what do you mean by NCR? I'm assuming numerical entities
> such as &#8233; and &#x2029; but it isn't clear from the page.)

If I may be so bold as to speak for another, "NCR" stands for "numeric
character reference".

Warning: pedantry lies ahead.
The constructs &#8233; and &#x2029; are character references [CR], not
entities. The constructs &oelig; and &rsquo; are entity references [ER], not
entities. Entities [EN] are the strings of characters that contain markup,
character data, or both, intended for inclusion into a document by entity
references.
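The distinction has practical consequences, which a short sketch with Python's
standard xml.etree.ElementTree can show. The helper name text_of() and the
sample words are mine, and the declaration of oelig is written out by hand for
illustration rather than taken from any published entity set. A character
reference needs no declaration; an entity reference is only as good as the
entity declared behind it.

```python
import xml.etree.ElementTree as ET

def text_of(xml: str) -> str:
    """Return the text content of the root element (hypothetical helper)."""
    return ET.fromstring(xml).text

# Character references: decimal and hexadecimal forms name the same
# character, and neither needs any declaration.
char_refs = text_of('<p>&#8233;&#x2029;</p>')

# Entity reference with no entity declared behind it: the parser rejects it.
try:
    text_of('<p>c&oelig;ur</p>')
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# The same reference once the entity is declared in the internal subset.
declared = text_of(
    '<!DOCTYPE p [<!ENTITY oelig "&#339;">]><p>c&oelig;ur</p>'
)

print(repr(char_refs), undeclared_ok, repr(declared))
```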

[CR]
Production 66, "CharRef", in "Extensible Markup Language (XML) 1.0 (Second
Edition)".
W3C Recommendation.
6 October 2000.
<http://www.w3.org/TR/REC-xml#NT-CharRef>.

[ER]
Production 68, "EntityRef", in "Extensible Markup Language (XML) 1.0 (Second
Edition)".
W3C Recommendation.
6 October 2000.
<http://www.w3.org/TR/REC-xml#NT-EntityRef>.

[EN]
Section 4.3, "Parsed Entities", in "Extensible Markup Language (XML) 1.0
(Second Edition)".
W3C Recommendation.
6 October 2000.
<http://www.w3.org/TR/REC-xml#TextEntities>.

Received on Tuesday, 8 April 2003 04:34:18 UTC