- From: Murray Altheim <murray@spyglass.com>
- Date: Wed, 20 Mar 1996 01:03:17 -0400
- To: Abigail <abigail@tungsten.gn.iaf.nl>
- Cc: www-html@w3.org
>Murray Altheim wrote:
>++ This would really be the big change: not using HTML as the base language of
>++ the Web. We'd use SGML (MIME type "text/sgml; level=1|2|3|4"), allowing the
>++ DOCTYPE of the document to determine the DTD, just as in SGML. That DOCTYPE
>++ could simply specify a dialect of HTML for the current majority of web
>++ documents.
>
>I have heard this many times, yet I see problems noone has given
>me an answer to. HTML certainly is more than just a grammer.
>Search engines can index a document properly _because_ there
>is an implicit meaning to <TITLE>, that <H1> is more important
>than <H6>, that <STRONG> is used for something else than <B>, etc.

You may be making a sizeable assumption about the intelligence of search
engines. Beyond TITLE and HEAD information, body content is pretty much
indexed as full text. OpenText (as also Yahoo) advertises this as a feature.
Few engines are geared for keyword in META, etc. and I seriously doubt that
there's much index differentiation between content found in formal META
elements and in body content. But I don't disagree with the point of element
utility.

>But in the DTD, <H1> and <H6> have interchangeable roles;
><STRONG> and <B> have the same context and the same content;
><TITLE> is just something which appears in the <HEAD>.
>
><A>, <IMG>, <INPUT> have side effects which aren't set in the DTD.

As you correctly note, the full specification of HTML (or any SGML
application) as a language or of a conforming user agent goes beyond what is
contained in a DTD. The HTML DTD is simply the formal definition of the HTML
syntax.

An SGML application is not specified only by a DTD. Combine the formal
specification of the abstract syntax, character set, quantities, etc.
found in the SGML declaration, the application-specific syntax of the
language specification (DTD), the specified application conventions ("H1 is
bigger than H6"), and the element formatting information (whether hardwired
into a browser or handled externally via a stylesheet) and you have an SGML
application. The application conventions for the core of HTML are widely
known and deployed.

>If each document comes with its own DTD, then what? A user agent
>knows how to parse it, but how should it be displayed? Of course,
>authors could be required to deliver a style sheet as well, but
>they have to include everything, as there cannot be user agent
>defaults to fall back on. And what about user preferences? How
>is a user supposed to set preferences, if each document can have
>unknown elements?

I fail to see where this is really a problem. If a document author wished to
specify a presentation style, they'd simply specify it in a stylesheet. If
the element was designed for markup that didn't need presentational
differentiation from body text, no style specification would be needed.

We are currently limited by the "hardwired stylesheets" of popular browsers.
Had a simple default stylesheet been the chosen method for specifying element
presentation in the first version of Mosaic (rather than hardwiring it in
code), things might look very different today. Expanding upon what is already
conventional within HTML using a stylesheet is quite simple.

For example, there is no AUTHOR element in HTML 2.0. If a browser came upon

    "Gee, Frank, that <AUTHOR>Bill Gates</AUTHOR> sure is a swell writer!"

the user agent/browser would simply ignore the AUTHOR tags. If the document
author wished to somehow differentiate the element, style information could
be included, such as (using a stylesheet syntax):

    AUTHOR { font-style: italic;
             color     : purple }

Concerns over TITLE, ISINDEX, etc. are met with the application conventions
of whatever SGML application we're dealing with.

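To make the AUTHOR example concrete, here is a minimal sketch in Python of a
user agent whose presentation comes from a table of defaults rather than from
code. It is an illustration only, not how any actual browser works: the names
DEFAULT_STYLES, StyledTextCollector, render, SAMPLE and AUTHOR_RULE are all
hypothetical, the default style values are made up, and Python's html.parser
stands in for a real SGML parser (it does not read a DTD at all). An element
with no rule contributes no style, so its tags are effectively ignored; an
author-supplied rule extends the defaults.

```python
from html.parser import HTMLParser

# Hypothetical defaults, standing in for a browser's "hardwired stylesheet".
DEFAULT_STYLES = {
    "h1": {"font-size": "xx-large", "font-weight": "bold"},
    "strong": {"font-weight": "bold"},
}


class StyledTextCollector(HTMLParser):
    """Collect (text, style) runs; elements without a rule add no style."""

    def __init__(self, author_styles=None):
        super().__init__()
        # html.parser reports tag names in lower case, so normalise the keys.
        author = {k.lower(): v for k, v in (author_styles or {}).items()}
        # Author rules extend or override the defaults, element by element.
        self.styles = {**DEFAULT_STYLES, **author}
        self.stack = []  # currently open elements, innermost last
        self.runs = []   # (text, merged style dict) pairs

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open element, if any.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        # Merge the rules of every open element, outermost first.
        style = {}
        for tag in self.stack:
            style.update(self.styles.get(tag, {}))
        self.runs.append((data, style))


def render(markup, author_styles=None):
    """Return the styled text runs for a fragment of markup."""
    collector = StyledTextCollector(author_styles)
    collector.feed(markup)
    collector.close()
    return collector.runs


SAMPLE = "Gee, Frank, that <AUTHOR>Bill Gates</AUTHOR> sure is a swell writer!"
AUTHOR_RULE = {"AUTHOR": {"font-style": "italic", "color": "purple"}}

if __name__ == "__main__":
    for text, style in render(SAMPLE, AUTHOR_RULE):
        print(repr(text), style)
```

Calling render(SAMPLE) with no author rule yields the sentence as plain body
text, while passing AUTHOR_RULE attaches the italic/purple style to the text
inside AUTHOR only. User preferences could be layered in the same way, as a
third table merged over the other two.
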
I'm not of the mind that there will be a dozen different SGML applications
floating out on the web, unrelated to HTML. There will probably be very few,
with a multitude of variants upon a central referent. That referent
application will be some evolutionary descendant of current-day HTML, and
will inherit many of the application conventions of HTML.

If a particular community breaks off and starts using an entirely different
SGML application unrelated to HTML (say the chemical industry), they will
probably be using browsers designed to deal with the application conventions
and needs of that community, possibly using a combination of custom
applications, generic applications with stylesheets and/or plug-ins.

Murray

```````````````````````````````````````````````````````````````````````````````
     Murray Altheim, Program Manager
     Spyglass, Inc., Cambridge, Massachusetts
     email: <mailto:murray@spyglass.com>
     http:  <http://www.stonehand.com/murray/murray.htm>
Received on Wednesday, 20 March 1996 01:09:17 UTC