- From: William F. Hammond <hammond@csc.albany.edu>
- Date: Wed, 27 Jun 2001 12:43:22 -0400 (EDT)
- To: www-talk@w3.org
(Responding to Arjun Ray and Ian Hixie)

I think that the three of us agree on most of this. In particular, I do not see serious issues about markup here. But I believe that each of them, in opposite ways, sees no overlap between text/html and XML markup, although the W3C XHTML spec does, RFC 2854 does, and W3C's Amaya does.

Of course, it is correct that there is no overlap between XHTML and tag soup, where I understand "tag soup" to mean HTML without a document type declaration. And I perceive the classical mass-market user agents as ignorers of document type declarations even when one is provided, so these agents have always been tag soup handlers.

My understanding is that the overlap is provided to make it possible for content providers to bring up XHTML documents that are usable -- at least to some extent -- in old user agents. That means, necessarily, that old non-rigorous user agents must handle them as tag soup.

There is no suggestion in any spec I know of that a given FPI should have more than one method of construal in an SGML or XML parser. (Arjun: were you seriously suggesting that a given FPI used in a document type declaration to refer to an external document type definition, with, for the sake of discussion, no internal subset, can be used with more than one SGML declaration?)

Whatever notion of "compatibility" arises from the informative Appendix C of the XHTML spec lies simply in the observation that there are ways to prepare a sophisticated XHTML document -- for example, one under the Carlisle/Altheim "XHTML 1.1 plus MathML 2.0" document type -- so that old mass-market user agents can parse it as tag soup.

The two most widely distributed mass-market user agents, as I perceive them, are behaving in diametrically opposite ways on the mime type issue. This is a big problem for content providers.

If the web is to move beyond tag soup in a smooth way, I think it clear that text/html should be the primary content-type for all XML document types that are extensions of the historic HTML language and that have been prepared to degrade to tag soup. This is necessary to enable content providers to make a smooth, orderly transition. It would be outrageous for a new XHTML-capable user agent to deny content providers the reward for this effort during the rather long time that they remain concerned about reaching readers with old user agents. Such denial is simply NOT the W3C-specified model, and it is not the behavior of W3C's Amaya.

The writers of XHTML-capable user agents need to understand the not-very-complicated subtleties of document prolog construction that arise with XHTML in order to serve old and new smoothly. This is not a run-time performance hit. Notice how smoothly Amaya navigates from tag soup to Carlisle/Altheim documents. If Amaya can do it, then the big guys can do it, too.

-- Bill

-----------------------------------------------------------------------

Responses on finer points follow below.

Arjun wrote:

> Ian and I just went over the conformance requirements, with less than
> happy conclusions. Do you disagree with them?

I agree with Arjun, to the extent that I've looked.

> The difference has to do with basic syntax. LINK elements have EMPTY
> declared content, and are subelements of HEAD which does not allow
> mixed content. Thus, the form <LINK> will not validate as XML, and
> the form <LINK/> will not validate as RCS SGML.

[ Actually: <link/>. :-) ]

Yes, a given instance cannot be both valid HTML 4.01 and valid XML of any kind.
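For concreteness, here is a minimal sketch of the Appendix C style -- not from the original message, with an invented title and stylesheet name -- namely an XHTML 1.0 Strict document whose empty elements carry whitespace before the "/>":

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>Example</title>
        <!-- a well-formed XML empty element; an old tag-soup parser
             reads it as an ordinary <link ...> start-tag -->
        <link rel="stylesheet" type="text/css" href="style.css" />
      </head>
      <body><p>...</p></body>
    </html>

A new user agent can validate this against the DTD named by the FPI, while an old user agent, ignoring the document type declaration, handles it as tag soup.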
But the compatibility assertion is that "<link ... />" (with a positive amount of whitespace before the "/>") degrades as tag soup in old user agents. Example: the root URI at W3C.

> I'm sorry, I don't understand this. I know of no ratified notion of
> "correct validation" which predicates (the contents of) an SGML
> declaration on (the contents of) a document type declaration.

Realistically, a human never sees a fully assembled HTML 4.01 document. For example, with SP one uses a catalog. The catalog may be specified as an argument to SP. Each catalog points to an SGML declaration. Each FPI in the catalog points to a system identifier for the document type declaration (external subset only).

Ian wrote in reply to Arjun:

> > The idea that non-geeks should respect geeky niceties is Canutism at
> > its worst. "Zero tolerance" is one thing if end-users can be made to
> > expect it; it's another when precisely the opposite is the
> > expectation being sold to the public.
>
> I would tend to agree with this. I don't think we (the W3C and its
> community) should be bothering to promote "compatability" of XHTML and
> Tag Soup. Here is how I think it should work:

XHTML and tag soup are very different. The point, however, is that there is an easy way for most XHTML, strictly conforming or not, to be prepared so that it qualifies both as XHTML in a new user agent and as tag soup in an old user agent. That is the essence of the advertised compatibility. Nobody ever said that tag soup would get non-failing treatment in a new XHTML user agent.

Check out Amaya, which yells about XHTML but not about tag soup. For this purpose an example of classical tag soup might be Friday's HTML version of the "Scout Report", which does have a document type declaration but has some validation issues. Amaya will not yell about it, but Amaya will yell about problems in XHTML, regardless of mime type.

> 4. Document authors use XHTML (text/xml).
...
> Step 4 is in the future.

Step 4 is realized by Amaya, which handles XHTML either as text/html or as text/xml, though there is no justification in any XHTML-related specification for serving XHTML as text/xml. Still, it would appear to be justified by RFC 3023. Why not bring Mozilla up to speed?

> I fail to understand the point of that.

It's a service for content providers. It makes it possible for a huge web of documents to be moved slowly from the old world to the new world without worry about whether readers have old user agents or new user agents.

The Mozilla 0.9.1 behavior forces content providers either to keep dual archives or else, in serving their sleek new XHTML as text/html, to give up the benefit of new handling in Mozilla. Worse than that, if they do not keep dual archives and are not validating, they won't really know whether it "works" until they're in deep trouble. Or they might end up thinking that someone else's new browser is doing a much better job.

> All I see are many reason not to do it, the primary one being that it
> will cause XHTML UAs to have to be backwards compatible with a
> premature Step 4's supposedly-XHTML content which works in today's
> browsers... otherwise known as Tag Soup. Welcome back to Step 1.

No, the new user agent needs, like Amaya, to make a quick early decision about which way to go. As I've said before, the W3C HTML WG could give user agent writers a bit more help in deciding how to proceed here.

-- Bill
Received on Wednesday, 27 June 2001 12:44:09 UTC