- From: Paul Prescod <papresco@itrc.uwaterloo.ca>
- Date: Wed, 20 Mar 1996 05:36:47 -0500 (EST)
- To: "C. M. Sperberg-McQueen" <cmsmcq@uic.edu>
- Cc: www-html@w3.org
On Tue, 19 Mar 1996, C. M. Sperberg-McQueen wrote:

> I thought I made it clear that expansion of the entity reference would
> be handled by the server, not by the client. In an ideal world, it
> might be nice to have it be done sometimes by the server, sometimes
> by the client. But that seems hard to work into http now.

I'll reinforce a point someone else made. In the world of HTML, many documents are being served by ancient servers. Perhaps if we require server-side entity references that situation would change, but history shows us that the browser side advances much more quickly than the server side. If we could get people to change their servers...geez, we could revolutionize the whole damn web. =)

Anyhow, there is nothing that precludes you from developing a server that expands text entities appropriately. The question that started the thread was about client-side mechanisms.

> But then, if we do ship the DTD around, what happens? Browsers which
> don't know what to do with it may not do the right thing with it.
>
> This sounds rather similar to what happens if we use an <INSERT>
> element or any other element: browsers which don't know what to do
> with them may not do the right thing with them.

Theoretically this is the case. But <INSERT> has already got vendor support (though I don't know if any <INSERT> enabled browsers are shipping) and I am convinced that part of the reason for that support is that <INSERT> is basically son-of-<IMG>.

> SGML entities are by no means restricted to SGML content. (See example
> just given.) They can as easily contain images or video. This does not
> seem to be a reason to prefer one notation over the other.

I agree with you that ENTITYs are flexible enough for multimedia content (I certainly use them for those). But we seem to be combining discussion of three different kinds of entities, and I think we should discuss them separately.

The first kind is the inclusion of arbitrary HTML markup inside another document.
My personal feeling is that if we can put off this discussion for a while, vendors will be closer to implementing full SGML-smart browsers, servers and editors, and many of today's problems around them will go away. That's why _my_ answer to the question that started this thread was "EMBED HTML as you would any other data type." All they wanted was a toolbar, after all. SGML text entities would be a big sledgehammer for a small fly.

The second kind of entity includes an arbitrary "other object". HTML did this through IMG first, and now through EMBED. Unfortunately, HTML's inclusion paradigm is now a few years old, and there is substantial author and tool support for it. Furthermore (this is where I get heretical), I don't know if HTML authors have anything to gain by moving to the SGML paradigm this late in the game. When browsers are "entity smart", HTML authors will have the _option_ of using entities for object inclusions, or using the short-and-sweet URL without a redirection through an entity.

Most of the HTML authors I know simply would not understand why they should scroll to the top of the document to include an entity reference. And "HotDog", "NotePad", "WebEdit" and "SimpleText" are not going to help them. Since so many links and embeds are "one shot", and there are already several levels of redirection available, many authors will wonder why they need another.

The third kind of entity we are discussing is a SUBDOCument entity, which I think is just a special case of "multimedia object." I don't know why we should treat it any differently.

> Yes. That might be a reason to prefer client-side expansion of
> entity references. On the other hand, any method one chooses of
> organizing data is apt to pessimize some caching scheme or other,
> under the right circumstances. If the client does the expansion
> of the entity reference or INSERT, and the material inserted changes
> frequently, the copy in the cache is apt to be out of date.
> My copy of Netscape does not detect this: it just happily shows
> me the outdated cached copy of changed documents until I force a
> reload, manually.

Either your Netscape is broken or your server is broken. This is clearly not how HTTP is supposed to work. Yes, I have observed this behaviour too.

> But either way, this is an argument for doing
> expansion on the server or the client side, not an argument for
> inventing a new notation for existing SGML functionality.
> Or am I wrong?

You are right. But from a browser vendor's point of view, the EMBED notation is not very new. It is just a cut and paste of IMG/EMBED/APPLET/FIG code with extensions.

> Not necessarily: a server does not have to be fully SGML compliant (or
> even fully SGML aware) to recognize and act appropriately on entity
> declarations and entity references.

Is that really the case? Aren't there RE/RS issues? Recursive entity replacement issues? Element recognition issues? Entity recognition issues? Can entity replacement and element recognition be done by two separate tools without any knowledge of each other? I'm really asking...I always use an SGML smart editor and parser before working with SGML documents. A newline in the wrong place can be too much of a headache. I ordered the SGML handbook a dog's age ago and am still waiting...

> (Although full SGML support would be a damn good thing for the Web: for
> further discussion, see the paper Bob Goldstein and I wrote, at
> http://www.uic.edu/~cmsmcq/htmlmax.html.)

I don't think anyone here will argue with that. It's the unwashed hordes outside that we have to try to convince. =)

> > The second is that HTML authors do not like "naming" things that
> > already have names. In other words, they do not like giving an SGML
> > entity name for something that already has a URL.
> > Part of the difference between the HTML community and other SGML DTD
> > user groups is that most HTML authors do not use "smart" authoring
> > tools, and SGML-smart authoring tools are especially rare.
>
> This seems to me rather a large generalization, but even taken at face
> value I'm not sure it's an argument for reinventing yet another wheel.

It's only a reinvention from our point of view. From a typical HTML author's point of view, SGML entities are a reinvention. They understand IMG, and they understand INSERT as IMG-on-steroids. I think we lost this battle years ago. Yes, if we could go back and change IMG, I would do it.

> Unless, of course, you count the time it takes to persuade people that
> (a) the wheel exists, (b) it meets the specifications, and (c) it
> doesn't have to be rejected just because it was invented somewhere else.

_EXACTLY_! And if you think there is resistance to the idea _here_, wait until you take it to the hundreds of thousands of HTML authors who have never heard of SGML and who are absolutely resistant to even keeping _formatting_ at the top of their document, much less content information.

SGML entities are part of HTML as an SGML application. They have been since HTML 2.0 at least. Nobody has implemented support for them. Meanwhile 4 or 5 different object embedding tags have come into existence. INSERT is a consolidation of them. It does not preclude or even duplicate the behaviour of SGML text entities. It does preclude non-SGML entities, but could be extended to support them, I suppose.

Paul Prescod
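
Illustrative sketch (not part of the original message): the question above, whether entity replacement can be done by a tool that knows nothing about element recognition, can be made concrete with a deliberately naive expander. This is a sketch under stated assumptions, not how any real server or SGML parser works. The helper names `parse_declarations` and `expand` are hypothetical, only internal text entities declared as `<!ENTITY name "value">` are handled, and RE/RS handling is ignored entirely.

```python
import re

# Hypothetical helpers for illustration only; not from any real server or SGML toolkit.
# Assumption: only internal text entities written as <!ENTITY name "value"> are handled.
ENTITY_DECL = re.compile(r'<!ENTITY\s+([A-Za-z][A-Za-z0-9.-]*)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r'&([A-Za-z][A-Za-z0-9.-]*);')


def parse_declarations(text):
    """Collect entity declarations into a name -> replacement text mapping."""
    return dict(ENTITY_DECL.findall(text))


def expand(text, entities, _active=()):
    """Textually replace &name; references, following nested references.

    Names that are not declared (for example &amp;) are left untouched.
    A reference to an entity that is already being expanded raises
    ValueError, because a naive tool would otherwise recurse without end.
    """
    def replace(match):
        name = match.group(1)
        if name not in entities:
            return match.group(0)
        if name in _active:
            raise ValueError("recursive entity reference: " + name)
        return expand(entities[name], entities, _active + (name,))

    return ENTITY_REF.sub(replace, text)


if __name__ == "__main__":
    declarations = (
        '<!ENTITY toolbar "Home | &footer;">\n'
        '<!ENTITY footer "Copyright 1996">'
    )
    entities = parse_declarations(declarations)
    print(expand("<BODY>&toolbar;<P>Fish &amp; chips</P></BODY>", entities))
```

Because the expander works on raw text, it also rewrites a reference that appears inside a comment such as `<!-- &toolbar; -->`, and it has no notion of record boundaries. That is the kind of recognition issue the questions above are pointing at: the recursion check is easy to add, but deciding where a reference should be recognized at all seems to need at least some knowledge of the surrounding markup.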
Received on Wednesday, 20 March 1996 05:36:52 UTC