Re: Hyperlinks and content negotiation from Smylers on 2009-10-17 (public-html@w3.org from October 2009)

From: Smylers <Smylers@stripey.com>
Date: Sat, 17 Oct 2009 15:41:53 +0100
To: Mike Kelly <mike@mykanjo.co.uk>, public-html@w3.org
Message-ID: <20091017144153.GH13561@stripey.com>

Mike Kelly writes:

> Smylers wrote:
> 
> > Mike Kelly writes:
> >
> > Content negotiation could succeed if only those who know what they are
> > doing touch it, and typical authors aren't somehow tempted to start
> > playing with it.  That's possible, but not certain.  I don't know how
> > we'd gather data either way.
> 
> Content negotiation exists as a standardized feature in the HTTP spec.

HTML5 is designed on what is useful or expedient in practice, not what
is in other specs (though often there is a big overlap, because the
other specs also match reality).

That HTTP has this feature doesn't matter; what matters is whether
authors would use it, and whether its existence would cause harm.

> If there are aspects of HTTP that you think are unnecessary or wrong,
> and need addressing - this should be taken up with the relevant bodies
> controlling the HTTP spec - any other approach to 'solving' these
> /perceived/ problems, regardless of intention, is bad (and
> potentially damaging) governance practice.

HTML5 covers features which are useful to web authors.  In some cases
that involves interacting with or specifying features in other specs,
such as HTTP.  But just because HTML5 relies on HTTP, it doesn't follow
that HTML5 has to enable a way of using every feature that HTTP
provides.

Nor does it follow that not providing for a particular HTTP
feature in HTML5 is labelling it "unnecessary or wrong"; it's simply one
which is deemed not to be relevant enough for HTML5.

> Meanwhile; I think it would be most productive for HTML to recognize
> its important role in driving HTTP applications, and look to provide
> (where possible) standardized mechanisms by which developers can
> leverage all relevant features of HTTP.

Well obviously you think that would be productive, since you want the
features!

I can counter that by pointing out I think it would be unproductive.

That doesn't really get us anywhere -- what you think and what I think
are equally valid.

That's why it's better to have data for this sort of thing: if a feature
would be useful to many HTML authors, safe, backwards-compatible, etc.,
then it can be added on its own merits, without needing to be treated as a
special case for being in some other spec.

> I think conneg is a relevant, valuable feature of HTTP that HTML5 is
> capable of provisioning for, at relatively little risk/cost.

In that case try to think of ways to show how valuable it would be, and
how low the risk.

> > > It is not that using separate URIs "doesn't work", just that it
> > > may be sub-optimal for a particular system that would benefit
> > > more from a strictly standardized distinction between resources
> > > and representations.  A clear distinction between the two allows
> > > intermediaries to make valuable, automated assumptions about the
> > > significance of a request.
> >
> > Please could you be more specific about these assumptions and their
> > value.  HTML5 is designed by finding problems that need to be solved
> > first, and then looking for solutions to those problems.
> >
> > (In this case it sounds like content negotiation may be the only
> > solution to the particular problem, but for the rigor of the spec we
> > don't want to add features without being sure what they are for and
> > that they are the best way of solving the problem.)
> > 
> > > > In what way does it help for a cache to cache a blog's homepage
> > > > and feed labelled with the same URL compared with caching them
> > > > with separate URLs?
> > > 
> > > The benefits are realized in terms of automated cache
> > > invalidation.  Modifying a resource should automatically
> > > invalidate all of its representations.
> >
> > Thanks -- that makes sense.  You mention "assumptions" in plural
> > above, so I presume there are others?
> 
> Plural for caches in the sense that various HTTP request methods could
> cause invalidation (i.e. POST/PUT/DELETE)

Surely that's true for all URLs -- that one method can cause
invalidation for other methods applies just as much to a single page?
It isn't specifically an advantage of two different formats of the same
page sharing a URL.

So the problem you actually want to solve is to invalidate all formats
of some content when any of them change.

Content negotiation would solve that for content whose formats have
different media types.  It doesn't solve it for content available in
multiple formats, all of which are HTML (for example a long article
which is available either paginated or as a single page, or content with
a 'printer friendly' version, or content in multiple human languages).
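
To make that distinction concrete, here is a minimal sketch in Python of
a toy cache.  The ToyCache class, its method names and the URLs are all
hypothetical, made up for illustration; the only behaviour it assumes is
that an unsafe request (POST/PUT/DELETE) to a URL drops everything the
cache holds for that URL.  It is not a model of any real cache.

```python
from collections import defaultdict


class ToyCache:
    """Toy cache: entries are keyed by URL, then by negotiated media type."""

    def __init__(self):
        # url -> {media_type: body}
        self._store = defaultdict(dict)

    def put(self, url, media_type, body):
        self._store[url][media_type] = body

    def get(self, url, media_type):
        return self._store.get(url, {}).get(media_type)

    def unsafe_request(self, url):
        """Model a POST/PUT/DELETE passing through: drop every variant of url."""
        self._store.pop(url, None)


cache = ToyCache()

# One URL, two media types (content negotiation).
cache.put("/blog", "text/html", "<old homepage>")
cache.put("/blog", "application/atom+xml", "<old feed>")

# Two URLs, both HTML (paginated article and single-page article).
cache.put("/article?page=1", "text/html", "<old page 1>")
cache.put("/article/all", "text/html", "<old full article>")

cache.unsafe_request("/blog")
cache.unsafe_request("/article/all")
```

After the two unsafe requests both variants of /blog are gone, but the
stale /article?page=1 entry survives, because nothing ties it to
/article/all.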

> And also in the sense that other types of intermediary mechanism could
> leverage a standardized distinction - e.g. proxy routing rules, etc. I
> guess these could be explained in more detail if necessary.. but is
> automated cache invalidation not a valuable enough example on its own?

Possibly not; it depends to some extent on how many authors actually
want to solve that problem.

Providing more problems that you wish HTML to solve may result in HTML
being changed to solve them.  Keeping them to yourself is unlikely to!

> > > It's not a perfect solution to all problems - it's a trade-off.
> > > If highly-efficient automated caching is more valuable to your
> > > system than being able to avoid the highly risky world of plain
> > > text URIs and grumpy twitter users, then there is an obvious
> > > choice to be made.
> >
> > That sounds fair enough.  Do you have any evidence of the numbers of
> > developers who would choose the cache-invalidation advantage over
> > the plain-text URL advantage?
> 
> No - but that is not at all surprising given that it isn't a viable
> option right now!

Evidence would include things like this being a common problem which web
developers encounter and ask about on fora for suggestions of what to do
about it.  Or that developers have resorted to tracking these sorts of
dependencies on the server side -- perhaps evidenced by the existence of
libraries for doing this.  Or a list of sites which publish the same
content in multiple formats, with separate MIME types, and where caches
often have an up-to-date version of one format but incorrectly are still
caching an old version of another.
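
As a rough idea of what such server-side tracking might look like, here
is a minimal sketch in Python.  The FormatRegistry class, the content
IDs and the URLs are hypothetical and not taken from any existing
library; it only records which URLs serve the same content, so that all
of them can be purged together when the content changes.

```python
from collections import defaultdict


class FormatRegistry:
    """Hypothetical server-side record of which URLs serve the same content."""

    def __init__(self):
        # content_id -> set of URLs serving that content
        self._urls_by_content = defaultdict(set)

    def register(self, content_id, url):
        self._urls_by_content[content_id].add(url)

    def urls_to_purge(self, content_id):
        """Return every URL whose cached copy is stale once content_id changes."""
        return sorted(self._urls_by_content.get(content_id, ()))


registry = FormatRegistry()
registry.register("post-42", "/posts/42.html")
registry.register("post-42", "/posts/42.atom")
registry.register("post-42", "/posts/42/print")
registry.register("post-7", "/posts/7.html")

# Editing post-42 means all three of its URLs need purging from caches.
stale = registry.urls_to_purge("post-42")
```

How the purge is then delivered to caches is outside this sketch.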

> Is this even necessary if we are in agreement that the caching use
> case makes sense, and has significant value?

There are many problems which it would be nice for HTML to solve.  It
can't solve all of them.  Bigger problems are more worth solving.

"Significant value" is contentious; you can claim something is
significant and somebody else can claim it's insignificant.  Whereas if
you provide some data to back your claim, its significance is more a
matter of fact than opinion.

> > Unfortunately HTML5 can't cater for every valid requirement, so
> > generally doesn't add features that would be useful to only a very small
> > number of authors (for example HTML5 doesn't add a <ship> element,
> > despite some authors having a very valid requirement to distinguish
> > names of ships on their pages; mentioning ships simply isn't common
> > enough).
> 
> I understand the point you are making

Good.

> but don't feel that is a sensible or helpful comparison for this case.

Yeah, I don't think it's a particularly good example either, but since
you understood me anyway (thanks!) I don't have to think of a better
one!

Cheers.

Smylers
-- 
http://twitter.com/Smylers2
Received on Saturday, 17 October 2009 14:42:24 UTC