HTML variants and content negotiation

In the content negotiation subgroup we are attacking the problem of analysing
the con-neg proposals in HTTP/1.1.  We can probably all agree that they are
90% of the way toward accomplishing their mission, which is negotiation at
the granularity of the content-type.  We're still undecided about how far
that granularity goes - what do parameters on Accept: content-types mean,
for example?

However, there still looms the issue of HTML variants, "towards graceful 
deployment of new features".  We haven't talked much about it, simply 
because we don't have any proposals to give feedback on.  We agreed in 
the phone conference that it was an "interesting" topic.  
However, the solution set for the problem also crosses heavily into the 
HTML arena, so it seemed like bringing this solution set to a larger 
audience would be a Good Thing.


Here are the goals I've seen stated or implied:

1) The solution must be a reasonable replacement for user-agent 
negotiation in the majority of cases.  Specifically not required to 
be handled are bugs (e.g., the AOL browser's forms implementation is 
broken).

2) The solution must require a minimal amount of work for browser 
authors, a less-than-moderate amount of work for server authors 
(given that there are many more browser authors than server authors :)
and allow for a range of efforts on the document-author side - so that 
the base case is easy to handle, yet document authors can selectively 
apply effort to making their documents more "portable".

3) The solution must address caching, with the goal of reducing 
bandwidth consumption and the number of cached variants per resource.


There are basically three routes to take, as I see it:

I.  The client indicates to the server a list of HTML extensions it 
understands.  This vocabulary of extensions is registered by some 
impartial body, say IANA, in the same way SMTP EHLO extensions are.  The 
server is responsible for delivering content that the browser can 
understand.  The response must declare which extensions the page 
depended upon, so caches can know when a response can be served locally.
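A rough sketch of how a server might act under option I (the feature
keywords and the shape of the feature list here are invented for
illustration; nothing like this is registered today):

```python
# Hypothetical sketch of option I: the client sends a list of registered
# feature extensions, and the server picks the richest variant whose
# requirements the client meets, remembering which extensions the
# response depended upon (that list is what caches would key on).

def choose_variant(client_features, variants):
    """variants: list of (required_features, body) pairs, richest first.
    Returns (body, used_features) for the first variant whose required
    features the client claims to understand."""
    client = set(client_features)
    for required, body in variants:
        if set(required) <= client:
            return body, sorted(required)
    raise ValueError("no acceptable variant - a plain-HTML fallback should exist")

# Imaginary registered extension keywords, in the spirit of SMTP EHLO:
variants = [
    (["tables", "math"], "<TABLE>...fancy page with math...</TABLE>"),
    (["tables"],         "<TABLE>...page without math...</TABLE>"),
    ([],                 "...plain HTML 2.0 fallback..."),
]

body, used = choose_variant(["tables", "forms"], variants)
print(used)   # -> ['tables']: the extensions this response depended upon
```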


II.  Introduce conditional constructs to HTML.  Basically, create a new 
content-type - text/cond-html, say - and have it use either marked sections 
or PIs to implement an IF (feature | NOT feature), THEN (block), ELSE 
(block) construct.  The "feature" would again be a registered keyword, which 
browsers would be responsible for setting appropriately.  Browsers which 
supported cond-html would indicate so in their Accept headers, of course, 
so there's still a big role for content negotiation.
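To make option II concrete, here is a toy resolver for that kind of
conditional construct (the line-oriented IF/ELSE/ENDIF syntax below is
invented purely for illustration; the real proposal would use marked
sections or PIs):

```python
# Toy resolver for option II: a cond-html document is filtered down to
# plain HTML by the browser, according to the features it supports.

def resolve(lines, supported):
    out, stack = [], []          # stack of booleans: is this branch live?
    for line in lines:
        tok = line.strip().split()
        if tok[:1] == ["IF"]:
            stack.append(tok[1] in supported)
        elif tok[:1] == ["ELSE"]:
            stack[-1] = not stack[-1]
        elif tok[:1] == ["ENDIF"]:
            stack.pop()
        elif all(stack):         # keep content only if every enclosing branch is live
            out.append(line)
    return out

doc = ["IF tables",
       "<TABLE>two-column layout</TABLE>",
       "ELSE",
       "<PRE>preformatted fallback</PRE>",
       "ENDIF"]

print(resolve(doc, {"tables"}))   # -> ['<TABLE>two-column layout</TABLE>']
print(resolve(doc, set()))        # -> ['<PRE>preformatted fallback</PRE>']
```

Note that because the browser holds the whole document, a browser that
chokes on a feature can simply re-run the filter with that feature turned
off - no second round trip to the server.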


III.  Recommend that all (or as many as possible) non-backwards-compatible
extensions (such as math) be implemented not as new HTML extensions, but
as new unique data types which are INSERTed into documents, allowing 
content-type-granularity content negotiation to work.  For extensions 
which are essential to HTML, use the "level" or "version" parameter to 
distinguish, with the idea that this would be an infrequent occurrence.
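Under option III the negotiation happens per inserted object rather than
per page.  A server-side sketch (the media types below are illustrative,
not registered ones):

```python
# Sketch of option III: an INSERTed object (say, an equation) exists in
# several media types, and ordinary Accept-header content negotiation
# picks the one the client prefers.

def negotiate(accept, available):
    """accept: media types the client sent, best first.
    available: dict of media type -> representation of the object."""
    for mtype in accept:
        if mtype in available:
            return mtype, available[mtype]
    return "image/gif", available["image/gif"]   # lowest-common fallback

equation = {
    "text/x-math": "x = (-b +/- sqrt(b^2 - 4ac)) / 2a",
    "image/gif":   "<binary rendering of the same equation>",
}

mtype, body = negotiate(["text/x-math", "image/gif"], equation)
print(mtype)   # -> text/x-math: a math-capable browser gets the structured form
```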



Analysis of each:

I - Positives:
	Browsers don't have to deal with unknown constructs.
	A "smarter" version of user-agent negotiation, since caches have
	  a chance of being able to cache something correctly.
	Provides an easy way for browser authors to introduce new
	  features without being labelled as a destabilizing force, and
	  for "second to market" browsers to be able to implement new
	  features without having to lie about their user-agent.
    Negatives:
	Browsers *could* be incorrect - e.g., something which said it
	  could handle "tables" might not be able to handle
	  tables-within-tables, or inlined objects in tables (e.g., XMosaic 2.7).
	Document authors need to be able to express (easily) which feature
	  sets the document uses.  Either that, or servers need to parse
	  documents as they are served, staying aware of which tags map to
	  which IANA-registered feature sets.
	Header bloat is an issue - the feature-set vocabulary could be huge,
	  and bloat is already a problem with Accept: headers.

II - Positives:
	Reduces the processing requirements on servers - servers can
	  be "dumb", and thus more scalable.
	Proxy caches don't have to worry about feature-set negotiation
	  either - one document, not 2^(# feature-sets) possible documents.
	Relatively easy to implement in browsers. (conjecture)
	If a browser thinks it can handle a particular feature, but ends up
	  getting it wrong, it can easily fall back to the counterpart block.

     Negatives:
	File bloat - conceivably documents could be 2-3 times as large as
	  normal, since the client will be throwing away what it doesn't
	  understand.

III - Positives:
	Allows regular content-type negotiation to handle varying
	  capabilities in user-agents.
	Part psychological - gets users and developers to stop looking at
	  HTML as a "kitchen sink".
	No special requirements on caches.
	Bandwidth is minimized.
      Negatives:
	Requires browsers to implement EMBED/INSERT, which is not trivial.
	Compound-document authoring tools are not pervasive - instead of one
	  file, we will have many files combined into one, possibly making
	  management more difficult for the average user.

We've had a fair amount of discussion on www-talk and the various 
IETF lists about each of these three.  What benefits/drawbacks did I miss?
Obviously, since none of these is implemented on a wide scale, some of the 
benefits and drawbacks may be speculative, and I'd rather argue from 
experience than speculation...

One problem, of course, is that we're essentially trying to decide where 
a problem gets solved - at the document level or at the transmission 
level.  Thus it may seem odd for an HTTP subgroup to say "solve this in 
HTML", or an HTML subgroup to say "solve this in HTTP".  We need 
leadership from members of both communities, who understand the strengths 
and limitations of both systems, to decide where this problem gets 
solved.  

My personal analysis is this: #2 represents the best choice because it 
makes available to the user-agent *all* the information it needs to 
handle the rendering, giving it the power to decide which features to 
parse and which to skip.  This is even a solution to the problem of 
buggy browsers: XMosaic 2.7 right now can handle text in tables 
(barely), but not hyperlinks or inlined images in them.  It could have 
the TABLES variable set to INCLUDE, and if it ran into a construct it 
couldn't handle, it could go back, turn TABLES off, and render the 
non-tables branch of the negotiated module.  The most common way to 
implement #1 will probably be through server-parsed conditional HTML 
anyway, so doing conditional HTML in #2 does not represent more 
difficulty for the document author.  #2 has already been implemented 
piecemeal - notice the NOEMBED tag in the EMBED proposal.  There are 
also legal reasons why giving all the information to the clients is a 
Good Thing.  #2 is also much friendlier to caches, as it requires no 
changes to them and keeps the number of possible variants per resource low.

We can still recommend that #3 be pursued, but requiring it in the absence of
#2 has sounded politically impossible.  I'm hopeful about the progress made
on stylesheets and the INSERT proposal to enable that, but I don't want to
chill enhancements to HTML either.  Setting up the infrastructure to support
#1 may be too costly to simply say "sure, let's do that too".  #1 and #2 
aren't strictly exclusive, but politically/psychologically they might be.


I hope this post ties together some previously disparate threads, and 
lets us compare.  Which do you prefer?  Which do you find more elegant?
I'd specifically like to hear from browser authors and people who author 
large collections of documents.

	Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com  brian@hyperreal.com  http://www.[hyperreal,organic].com/

Received on Monday, 8 January 1996 02:42:41 UTC