
Re: New FAQ: entities and NCRs

From: Chris Lilley <chris@w3.org>
Date: Fri, 1 Jul 2005 22:05:24 +0200
Message-ID: <1379433989.20050701220524@w3.org>
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: "Richard Ishida" <ishida@w3.org>, "GEO" <public-i18n-geo@w3.org>

On Friday, July 1, 2005, 9:21:01 PM, Bjoern wrote:



BH> * Richard Ishida wrote:
>>http://www.w3.org/International/questions/qa-escapes.html

BH> I think the document should note that using "character entities" is
BH> not interoperable and possibly dangerous.

Yes.

BH> The HTML Working Group has
BH> been approached several times to clarify whether and how
BH> implementations are supposed to support the pre-defined entities if
BH> they do not read the external subset;

Which is optional, per the XML spec.

BH>  the HTML Working Group so far
BH> refused to provide such clarification,

The XML spec seems fairly clear on that point.

BH>  so there are a number of old
BH> implementations that do not support use of them in XHTML documents
BH> at all (to the extent that some implementations incorrectly reject
BH> documents that use them)

Do they fetch the external DTD subset, where these entities are defined,
or not? If not, then they are undefined entities.

BH>  and current implementations that support
BH> them for some document types but not for others.

BH> Also note that per XML 1.0 Third Edition,

BH> It is [A violation of the rules of this specification] if an
BH> attribute value contains a reference to an entity for which no
BH> declaration has been read. [Conforming software MAY detect and
BH> report an error and MAY recover from [this error]].

All of which, as you say, means that using them is non-interoperable.
NCRs, or just plain using the regular characters instead, are much
preferable. Entities in SGML date from a pre-Unicode time when assuming
anything beyond ASCII was non-interoperable; that is not the case today.
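
To illustrate the difference, here is a minimal sketch. It assumes Python's
standard xml.etree.ElementTree as a stand-in for a processor that has read
no declaration for the entity; the try_parse helper and the sample markup
are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET


def try_parse(markup):
    """Return the root element's text, or the parser's error message."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError as err:
        # No declaration for the entity has been read, so the parser
        # reports it as undefined instead of substituting a character.
        return "error: " + str(err)


# Named entity with no declaration anywhere in the document.
named = try_parse("<p>caf&eacute;</p>")

# Numeric character reference (NCR): needs no declaration at all.
numeric = try_parse("<p>caf&#233;</p>")

# The character itself, written directly in the document.
literal = try_parse("<p>caf\u00e9</p>")

print(named)
print(numeric)
print(literal)
```

With this parser the named entity is reported as an error, while the NCR
and the literal character both yield the same text. Other processors may
behave differently for the first case, which is the interoperability
problem in a nutshell.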


BH> So to the extent that it is possible to have some kind of XHTML document
BH> that uses "character entities" in attributes but the user agent does not
BH> support the document type and/or did not process the entity declaration,
BH> it is perfectly permissible for the user agent to act in unexpected ways
BH> for the document.

Yes.

BH> Robust documents do not use "character entities" at all unless they
BH> are pre-defined in XML 1.0 or declared in the internal subset.

Agreed.
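
A minimal sketch of that robust case, again assuming Python's standard
xml.etree.ElementTree as the processor. The sample document is
hypothetical: it declares eacute in its internal subset and otherwise uses
only &amp;, one of the five entities pre-defined in XML 1.0.

```python
import xml.etree.ElementTree as ET

# Hypothetical document. The internal subset (between [ and ]) travels
# with the document, so the processor always reads the declaration.
DOC = (
    '<!DOCTYPE p [ <!ENTITY eacute "&#233;"> ]>'
    '<p title="caf&eacute;">fish &amp; chips</p>'
)

root = ET.fromstring(DOC)

# The declared entity is expanded, including inside the attribute value.
title = root.get("title")

# &amp; is pre-defined in XML 1.0 and needs no declaration.
text = root.text

print(title)
print(text)
```

Because nothing here depends on fetching an external subset, the result
does not hinge on whether the processor chooses to read one.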

BH> The document should also link to the relevant requirements in Charmod
BH> Fundamentals.




-- 
 Chris Lilley                    mailto:chris@w3.org
 Chair, W3C SVG Working Group
 W3C Graphics Activity Lead
Received on Friday, 1 July 2005 20:05:32 GMT
