Re: a question about entities and special characters

On Tue, 8 Aug 2000 keshlam@us.ibm.com wrote:

> >I'm German, so I use a lot of special characters like ä in my
> >pages. Are all these characters single nodes in the DOM?
> 
> This depends on whether you've asked your parser to retain Entity Reference
> nodes or not.

It also depends on whether the document being represented is XML or HTML.
HTML DOMs don't even need to expose objects implementing either Entity
or EntityReference.  Further, DOM Level 1 explicitly states that in HTML
all attribute values are simple strings with no EntityReference nodes.
(The same is not said for the content of elements, however, and perhaps
should be.)  There is also no way to create a new EntityReference node in
an HTML document.

Taken together, these points imply that entity references in HTML
documents should surface as single characters in Text nodes, just like
the equivalent character references.  I suggest that an editorial-level
correction be made to DOM Level 1 to say so.
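
As a rough illustration of the behaviour proposed above, here is a minimal
sketch using Python's standard html.parser module.  It is a tokenizer, not a
DOM implementation, and the TextCollector class and text_of helper are
hypothetical names made up for this example.  It only shows that an HTML
parser can hand back the same character data for an entity reference and
for the equivalent character reference.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers the character data of an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True asks the parser to replace entity and
        # character references with the characters they stand for.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_of(markup):
    """Return all character data in markup as one string."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


# The same German word, written three ways.
entity_form = text_of("<p>M&auml;dchen</p>")    # entity reference
charref_form = text_of("<p>M&#228;dchen</p>")   # character reference
literal_form = text_of("<p>M\u00e4dchen</p>")   # literal character

print(entity_form == charref_form == literal_form)
```

In all three cases the text comes back as the same seven-character string,
with a single character where the a-umlaut was written.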

-- 
John Cowan                                   cowan@ccil.org
C'est la` pourtant que se livre le sens du dire, de ce que, s'y conjuguant
le nyania qui bruit des sexes en compagnie, il supplee a ce qu'entre eux,
de rapport nyait pas.               -- Jacques Lacan, "L'Etourdit"

Received on Tuesday, 8 August 2000 09:00:06 UTC