- From: Daniel W. Connolly <connolly@hal.com>
- Date: Mon, 13 Jun 1994 13:29:55 -0500
- To: murray@sco.com
- Cc: Multiple recipients of list <www-html@www0.cern.ch>
In message <9406131309.aa06913@dali.scocan.sco.COM>, Murray Maloney writes:
>
>However, now it seems that Tim thinks that it
>will be possible for a document instance
>to "encounter a RENDER tag for an undeclared element".
>
>It seems that things are not so clear again.
>At least to me?
>

From what I gather, you had a reasonable picture of how it might work in mind, and whoever thought a conforming document could contain tags that refer to undeclared elements was a little confused...

[note that undeclared entities are a different story... the current HTML DTD has a #DEFAULT entity declaration... more on that later]

>So, what is the story going to be? I think that
>we have to decide and commit right now. Either
>we are going to define HTML 2.0 and 3.0 as strictly
>conforming SGML DTDs and not provide trivial mechanisms
>for extending the language at the whim of information
>providers or browser developers, OR we are going to use
>SGML as a language of convenience for defining HTML 2.0
>and 3.0 and then provide simple but effective ways to
>formalize a mechanism for the extension of the language.

At this point in the game, it's important to phrase these things carefully -- I've never seen the term "strictly conforming SGML DTD" before. The term "conforming SGML document", on the other hand, is defined in ISO 8879, definition 4.51.

I suggest (for the Nth time... :-) that a requirement of the HTML language is:

    An HTML document shall be a conforming SGML document.

This does _NOT_ directly conflict with the ability to "provide trivial mechanisms for extending the language at the whim of information providers or browser developers."

For the purposes of the 2.0 spec, there will be no way to use tags that are not in the standard DTD in conforming documents. There just aren't any widely deployed mechanisms in place.
Browser implementors will simply be warned that it is quite common for servers to transmit invalid documents, and certain classes of errors should be tolerated in the interest of short-term interoperability with experimental systems.

But for future specifications, it is perfectly reasonable (and perhaps inevitable) to include "hooks" in the form of parameter entities like %cextra in the HTML DTD that allow information providers to extend the language on a per-document basis.

And this does _NOT_ necessarily imply full DTD parsing in every client. A browser could, for example, support a constrained subset of declarations like:

    <!DOCTYPE HTML [
    <!ENTITY % html PUBLIC "-//W3O//DTD WWW HTML 2.0//EN">
    <!ENTITY % cextra "|quark|lepton">
    %html;
    ]>
    ...<quark>...</quark>...

Even with these hooks, we only provide limited extensibility. There may be a need for folks to experiment with idioms that are completely irreconcilable with the DTD. We can model this in any number of ways:

-- The "ignore tags you don't recognize" convention. Experimental documents are "invalid" documents, and the "unknown" tag names are markup errors, and could be reported to the user as such.

This works ok for phrase-level markup, but not for elements that, for example, should cause paragraph breaks. Imagine a document that uses BLOCKQUOTE in a browser that doesn't support that element: the blockquotes would run into the neighboring paragraphs.

If all we need is various phrase tags on a per-document basis, the %cextra hook will do just fine.

-- Any document with experimental tags must include a prologue with declarations for those tags; i.e. if you want to mess around with experimental tags, you have to provide a corresponding DTD. We could support idioms such as:

    <!DOCTYPE HTML PUBLIC "-//experimentor//DTD WWW HTML//EN">

and a browser could look up the PUBLIC identifier in a table of supported (i.e. "precompiled") DTDs.
This leaves open the question of: what do you do with this arbitrary document that you've parsed? How do you display it? How do you find the links? Do we adopt a stylesheet mechanism? Architectural forms? Both?

-- Browsers could support arbitrary DTDs at runtime, and we could write:

    <!DOCTYPE FOO SYSTEM "http://myhost/mydtd">

and a browser could retrieve the DTD at runtime. At this point, we're talking about a beast that is clearly distinct from HTML.

There are a lot more issues related to "how do I express stuff that's not in the spec?" But for now, the answer is "you can't."

Dan
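
As a rough illustration of the "constrained subset of declarations" and "table of precompiled DTDs" ideas above, here is a minimal Python sketch. It is not a description of any real browser: the names KNOWN_DTDS, read_prologue and the "html-2.0" label are hypothetical, and the regular expressions only recognize the two prologue shapes quoted in this message, not general SGML.

```python
import re

# Hypothetical table of "precompiled" DTDs, keyed by PUBLIC identifier.
KNOWN_DTDS = {
    "-//W3O//DTD WWW HTML 2.0//EN": "html-2.0",
}

# Assumption: only these two declaration shapes are understood.
PUBLIC_RE = re.compile(r'PUBLIC\s+"([^"]*)"')
CEXTRA_RE = re.compile(r'<!ENTITY\s+%\s+cextra\s+"([^"]*)"\s*>')


def read_prologue(prologue):
    """Return (dtd_name, extra_tags) for a constrained DOCTYPE prologue.

    Raises ValueError when the PUBLIC identifier is missing or is not
    in the table of supported DTDs; no DTD is ever parsed.
    """
    public = PUBLIC_RE.search(prologue)
    if public is None or public.group(1) not in KNOWN_DTDS:
        raise ValueError("unsupported or missing PUBLIC identifier")

    # %cextra is written as "|name|name", so drop the empty first field.
    cextra = CEXTRA_RE.search(prologue)
    extra_tags = [t for t in cextra.group(1).split("|") if t] if cextra else []

    return KNOWN_DTDS[public.group(1)], extra_tags


example = """<!DOCTYPE HTML [
<!ENTITY % html PUBLIC "-//W3O//DTD WWW HTML 2.0//EN">
<!ENTITY % cextra "|quark|lepton">
%html;
]>"""

dtd_name, extra_tags = read_prologue(example)
# dtd_name is "html-2.0"; extra_tags is ["quark", "lepton"]
```

A client built this way would treat quark and lepton as extra phrase-level tags and reject any prologue whose PUBLIC identifier it does not already know, which is the trade-off the second option describes.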
Received on Monday, 13 June 1994 20:30:06 UTC