RE: XHTML and MIME (was: IBM Position Statement on XForms and Web Forms 2.0)

From: T.V Raman <raman@google.com>
Date: Thu, 31 Aug 2006 16:17:00 -0700
Message-ID: <17655.28140.348319.1471@retriever.corp.google.com>
To: doug.schepers@vectoreal.com
Cc: public-appformats@w3.org, www-forms@w3.org, www-archive@w3.org


Doug, you make an interesting point.

Personally, I believe that the decision to mandate that XML
content types be served as application/xhtml+xml was the key
mistake that happened circa 2000 --- I still distinctly remember
feeling very unhappy about this at the Amsterdam WWW conference.


Background/Run-Up To The Above:


Back then, all the Web companies -- including browser vendors,
authoring tool vendors, and organizations representing Web
developers -- came together in 1998 at the "Future Of HTML"
Workshop:
http://www.w3.org/MarkUp/future/

Here is what the world looked like:

A)   The browser wars were all but over bar the shouting
B)   Everyone had suffered sufficient tag-soup pain to last
     several lifetimes
C)   Everyone in the Web community had suffered from the browsers
     *exclusively* dictating what Web pages worked --- I remember
     this as the "a new tag every Monday" phenomenon. No one
     could create, develop or deploy content reliably -- let
     alone deploy tools to create, manage or deploy Web content.

So the above was the background at the time we all
collectively resolved to close off HTML4 development and move to
a cleaner, well-formed world.
None of us was naive enough to assume that the transition would
be easy. As an example, here is a link to the position paper I
wrote with my colleagues at the time -- this is now 8 years and
two jobs ago, but I still believe most of what we wrote then:
http://www.w3.org/MarkUp/future/papers/adobe-19980427.html
(historical note: in that document PGML == SVG -- PGML was one of
the submissions that launched SVG)


Now, if you trace the actual execution of some of what was
envisioned at the workshop:

0) We did execute step 0 correctly, namely by coming up with an
   XML formulation of HTML (XHTML 1.0) that provided a stepping
   stone for moving to XML -- Jan 2000

1) Next, XHTML Modularization and XHTML 1.1 have been extremely
   successful at bringing together multiple vocabularies.


2) If you look at the position paper cited above and others from
   that workshop, we all viewed CSS and DOM programming as the
   vehicle that would allow us to transition from HTML4 to
   XHTML: during the transition, we'd be able to support
   legacy content while progressively mapping it over to a newer
   content model.

The reasoning was that at the time, the legacy tag-soup content
on the Web mostly relied on HTML3.x + vendor tag extensions and
bug-compatibility features --- and that newer technologies like
CSS would help us paper over that pig.

But here is where things started falling apart. After 2000, with
one browser all but owning the market, things stagnated. CSS
never got fully implemented, and scripting diverged across
browsers. Combined with the lack of participation in the DOM WG,
which led to its eventual demise, this meant that rather than
executing on the kind of transition roadmap that was envisioned,
the Web essentially stagnated, with sites writing content for the
dominant browser and the rest of the pack entering
bug-compatibility mode.

Sadly, this happened right around the time of the decision to
place newer XML-based content types under MIME types of the form
application/*+xml. The rest is sufficiently recent that I don't
need to repeat it for most on this list.

Doug Schepers writes:
 > 
 > Hi-
 > 
 > Having attempted to dispassionately describe the positions of both sides (if
 > there are only two sides), I want to state my own opinion.
 > 
 > I fall into the latter camp (obviously), and hold that XML in all its
 > flavors (XHTML, SVG, XForms, etc.) is preferable to being chained to legacy
 > content based on a single transitory format (HTML).  I am not alone in
 > thinking that HTML is not a suitable base for going forward, unless it is
 > the XML characterization of XHTML.
 > 
 > Taking into account that content is overwhelmingly being generated by
 > authoring tools, once those tools are updated to create conforming content
 > (which I think they will, as a result of market pressures), then this brief
 > hiccup of forward momentum (stalled by the bursting bubble of Web1.0) will
 > quickly be forgotten.  Old content will still work, since it will be served
 > with the older MIME Type, but new content will move on.
 > 
 > If necessary, perhaps an errata to XHTML, or a different approach to
 > identifying XHTML, could resolve this on the level of technicalities.
 > Modern browsers are evolving rapidly, and surely they can be adapted to best
 > serve the future.
 > 
 > Regards-
 > Doug
 > 
 > Doug Schepers wrote:
 > | 
 > | Hi-
 > |  
 > | There seems to be a major divide between people who believe that XHTML
 > | cannot or should not be used on the Web (largely because IE does not yet
 > | understand it), and those that believe it can and should.
 > | 
 > | It is this fundamental schism that must be resolved before we can all reach
 > | a solution to this current debate that we are happy with.  I'm going to
 > | outline the debate as I see it, but if I have missed the subtleties of some
 > | argument, it is through an error and not malice.
 > | 
 > | Ian Hickson, the champion of the first camp, has outlined his position [1]
 > | in a paper that seems to be the seminal claim for the notion that XHTML
 > | cannot be served with the "text/html" MIME Type.  This paper is often cited,
 > | but doesn't discuss content negotiation.  Ian has substantiated and expanded
 > | his claim with a study of existing Web content (which he performed at
 > | Google), that seems to indicate that even content which is meant to be XHTML
 > | (regardless of the MIME Type) is in the main not valid or well-formed (IIRC,
 > | though I don't know how that compares with the validity/well-formedness of
 > | existing HTML content).  The conclusion seems to be that XHTML, and by
 > | extension XML, is suboptimal, and that the path forward on the Web should be
 > | based on HTML, which is viewable in legacy browsers (i.e. IE).
 > | 
 > | The other camp believes that the pragmatic benefits of serving XHTML
 > | outweigh the technicalities described by Ian.  They believe that the
 > | proliferation of XML-based tools and UAs, and the more extensible nature of
 > | XML as regards namespaces and mixed content, as well as other benefits of
 > | XML, are more compelling than the current state of some browsers, which are
 > | subject to change.  The practicalities of this approach are described by the
 > | Web Standards Project [2], which explicitly resolves the MIME Type issue via
 > | content negotiation or a relaxed MIME Type.  Ian's larger claim that even if
 > | served with the correct MIME Type, much existing content will largely still
 > | not be viewable with existing browsers is not addressed by this argument,
 > | but is presumed to be a transitional phenomenon.
 > | 
 > | Thoughts?  Did I correctly frame the debate?
 > | 
 > | [1] http://www.hixie.ch/advocacy/xhtml 
 > | [2] http://www.webstandards.org/learn/articles/askw3c/sep2003/ 
 > | 
 > | Regards-
 > | Doug
 > | 
 > 

-- 
Best Regards,
--raman

Title:  Research Scientist      
Email:  raman@google.com
WWW:    http://emacspeak.sf.net/raman/
Google: tv+raman 
GTalk:  raman@google.com, tv.raman.tv@gmail.com
PGP:    http://emacspeak.sf.net/raman/raman-almaden.asc
Received on Thursday, 31 August 2006 23:17:33 GMT
