Re: character entities

Dear David,

> 6.2.1 lists some non marking entities, and introduces them with the
> phrase
> 
>    Some character entities, although important for the quality of print
>    rendering do not directly have glyph marks that correspond. 
> 
> But are they all really _character entities_ ie things that have
> definitions such that they can be used in elements with PCDATA
> content?
>
>  &InvisibleTimes;
> 
> For a style sheet I can define that to be the empty string
> and it can be used in mo as <mo>&InvisibleTimes;</mo>
> 
> but are the spacing entities really intended to be used inside
> such token elements, or outside.
> 
> I am tempted to define 
> 
> &MediumSpace;  to be <mspace width="something">
> 
> but then &MediumSpace; could only be used where mspace could
> be used (ie not in token elements).
> 
> In that case I have the alternative of defining &MediumSpace;
> to be the unicode character 2005 but then &MediumSpace; can _only_
> be used for spacing in token elements, and not the place one normally
> needs to adjust the spacing, ie between elements.
> 
> The problem is more acute for things like 
> 
> &NegativeMediumSpace;
> 
> where as far as I can see there is no unicode equivalent so I don't
> think I can map it to any character, so it appears the only sensible
> alternative is to map it to <mspace width="-something"> but then,
> it could not be used in  character elements, which I think is the usage
> implied by the recommendation?
> 
> Is it really the intention that these spacing commands are used as
> characters rather than <mspace> elements?

We had quite a few heated discussions about these issues in the working group. I
did not really agree with the decision to make most of these things "character
entities", for most of the reasons you mentioned. Most of the time there is
no "natural" Unicode code point associated with them and, besides, their
occurrence outside PCDATA is forbidden but sometimes tempting...

What I remember from our discussions, and from the arguments of the people who
were in favor of these things, is:

- The character entities can reference either existing Unicode characters
(such as whichever of the various Unicode "space characters" may be
appropriate), or a character in the Unicode private use area when there
is no "natural" equivalent. So your "&InvisibleTimes;" may really be some
private MathML character here, or the Unicode "Zero Width Space" (if this is
permitted). "&NegativeMediumSpace;" definitely references a character in
the Unicode private use area.
- When outside PCDATA, you should embed them in some element that can
legally occur there or use equivalent elements (such as <mspace>).
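
As a rough illustration of these two cases, here is a minimal sketch. The
entity values are assumptions for the sake of the example (U+2005 is the
character you mention, U+E000 is just an arbitrary private-use code point),
and the mspace width is made up; none of them are taken from the
recommendation.

```xml
<!DOCTYPE math [
  <!-- Hypothetical definitions, for illustration only -->
  <!ENTITY MediumSpace "&#x2005;">          <!-- an existing Unicode space -->
  <!ENTITY NegativeMediumSpace "&#xE000;">  <!-- assumed private-use code point -->
]>
<math>
  <mrow>
    <mi>a</mi>
    <!-- Inside a token element (PCDATA): the character entity is allowed -->
    <mo>&MediumSpace;</mo>
    <mi>b</mi>
    <!-- Between elements: use an equivalent element, not the bare entity -->
    <mspace width="0.2em"/>
    <mi>c</mi>
  </mrow>
</math>
```

With definitions of this kind, the entity stays a plain character and the
spacing between elements is left to <mspace>.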

I have to admit that a "cleaner" design would have been a little more
complicated.


           Stéphane.

Received on Friday, 28 August 1998 02:37:47 UTC