- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Sun, 18 May 2003 14:37:22 +0300
- To: www-html@w3.org
On Friday, May 16, 2003, at 20:04 Europe/Helsinki, John Lewis wrote:

> Henri wrote on Friday, May 16, 2003 at 9:41:05 AM:
>
>> Not supporting character entities in XHTML is not a bug in Opera if
>> 1) XHTML is an application of XML
>> and
>> 2) XHTML user agents aren't required to use validating XML
>> processors.
>
> Actually, Opera Software internally decides what's a bug in Opera and
> what's not. Whether or not they technically need to do something is
> irrelevant; although I'm sure it's a consideration, there are
> practical concerns as well.

I guess that depends on the definition of "bug". Your approach would make it very easy to publish bug-free software. :-)

The concerns here that may seem practical to document authors are very impractical from the implementor's point of view.

>> Do you classify TextEdit (bundled with Mac OS X) and WordPad
>> (bundled with Windows XP) as "special advanced authoring tools"?
>
> I've never used TextEdit. I'd hesitate to even call WordPad an
> authoring tool, but perhaps the Windows XP version is different.
>
> People make use of named entities daily, and you've not provided any
> evidence that they're harmful;

I have said this a couple of times already, but I will try to make it clearer this time.

1) In XML, except for lt, gt, amp, apos and quot, character entities have to be declared in the formal part of the DTD in order to be referenceable in the places where character data may occur.
2) Document authors would find it inconvenient to paste the character entity declarations into the internal DTD subset of each document, so I've implicitly assumed that the character entities would have to be declared in the external DTD subset.
3) The XML spec defines the concepts of well-formedness, standalone document and non-validating XML processor in order to accommodate the needs of interactive document browsers and applications that are used in a network context.
4) Given 3), it would be harmful to make requirements elsewhere that would force Web browsers not to use non-validating XML processors. (Also, it would be harmful to make it impractical to use standalone documents.)
5) Non-validating XML processors are not required to process external entities. That is, they are not required to process the external DTD subset.
6) Processing external entities of the size of typical W3C DTDs along with the document entity would incur runtime performance penalties compared to only processing a tag soup document entity or an XML document entity.
7) Given 6), processing external entities is undesirable in interactive applications even though non-validating XML processors are allowed to process external entities. Hence, interactive user agents should not be expected to process external entities and, therefore, character entities declared in the external DTD subset should not be expected to be available.

Wiggling out of point 1) would mean descending into tag soupness by violating the rules of the language framework. Wiggling out of point 6) would mean either pretending to process a W3C DTD while actually processing something else (this is what Mozilla does, but doing so is dirty and risks compelling others to follow) or caching the data structures that get built when the DTD is parsed (which would be unduly difficult because the declarations made in the external subset may change depending on parameter entities in the internal subset).

> The original
> poster suggested they be made optional, which is something I'm okay
> with since market forces will compel the major UAs to support them
> anyway.

They are optional already in XHTML 1 given point 5) above.
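
To make points 1) and 5) concrete, here is a minimal sketch using the expat-based parser in Python's standard library. It is only one example of a non-validating processor that does not fetch the external DTD subset; it is not any of the browsers discussed in this thread. The helper name parse_text and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_text(doc: str) -> str:
    """Parse doc with the stdlib non-validating parser; return the root's text."""
    return ET.fromstring(doc).text


# The predefined entities (lt, gt, amp, apos, quot) need no declaration.
predefined = parse_text("<p>a &amp; b &lt; c</p>")

# An entity declared in the internal DTD subset is visible to the parser.
internal = parse_text('<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>')

# Here nbsp is declared only in the external subset, which this parser
# does not retrieve, so the reference cannot be resolved.
external_only = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>a&nbsp;b</p></body></html>"
)
try:
    ET.fromstring(external_only)
    external_result = "parsed"
except ET.ParseError as err:
    external_result = "ParseError"
    print("external-subset entity not available:", err)

print(repr(predefined), repr(internal), external_result)
```

With this particular parser, the third document is rejected even though it carries a proper XHTML doctype, which is the situation point 7) describes: the declarations exist, but only in a place the processor is not obliged to read.
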
I think avoiding the related issues of external entity processing by dismissing character entities as "optional" is bad in the Web context, because either some browsers don't support character entities (in which case they could just as well not exist and a lot of confusion could be avoided) or all browsers are forced to support them (which would have ugly implications [see above]).

> As a practical matter, the major UAs already support HTML,
> which means they support the named entities already, which means
> there's little reason to not support them for XHTML (unless they're
> removed).

That reasoning is flawed. Character entity support for HTML is implemented in tag soup processors--not in XML processors. Tag soup processors by their nature don't play by the rules of SGML or XML. Entity support in HTML parsed as tag soup and entity support in XHTML parsed as real XML have no implementation connection.

-- 
Henri Sivonen
hsivonen@iki.fi
http://www.iki.fi/hsivonen/
Received on Sunday, 18 May 2003 07:37:30 UTC