Re: I-D ACTION:draft-nottingham-http-link-header-01.txt from Mark Nottingham on 2008-04-29 (ietf-http-wg@w3.org from April to June 2008)

From: Mark Nottingham <mnot@mnot.net>
Date: Tue, 29 Apr 2008 10:23:00 -0700
To: Brian Smith <brian@briansmith.org>
Cc: "'atom-syntax Syntax'" <atom-syntax@imc.org>, "'HTTP Working Group'" <ietf-http-wg@w3.org>
Message-Id: <D5B113A3-F955-415B-9A0F-4B6A9DA794C7@mnot.net>
Reply-To set to ietf-http-wg, to avoid crossposting.

On 27/04/2008, at 8:39 PM, Brian Smith wrote:

> [CC'd to ietf-http-wg. I think it is better to continue the conversation
> over there as most of my comments are not specific to Atom.]
>
> Mark Nottingham wrote:
>> The draft does not advocate removing links from Atom
>> documents to put them in headers; rather, the common
>> use case is repeating them in headers, so that they
>> can be easily discovered and processed.
>
> With HTML, I've never seen the Link header used that way; it has always
> been used to add new links to the document (usually style sheets that vary
> depending on the UA).

My experience is the opposite; it is rarer to see people put a link in
headers but omit it from the content. The only common exception is when
it's not possible to put the link in content (e.g., it's an image).


> When processing Atom documents, we are usually more interested in
> atom:title, atom:content/@src (if any), and atom:content/@type. In AtomPub
> we also often want atom:link/@rel='edit' and atom:link/@rel='edit-media'.
> Since the Link header can only store the atom:link elements, we are almost
> always going to have to parse the document anyway. If parsing Atom (or HTML)
> is problematic for the software application then the application shouldn't
> have chosen to store everything in Atom (or HTML) documents. :)

Are you saying that there are no use cases for a link header in Atom?
I find that hard to believe (e.g., link with a relation of 'edit-media'
on a media response seems obvious and useful), but if it's so,
perhaps the simple answer is to use a separate registry for the
header, with the option of registering Atom relations later.


>>> For all those reasons, I actually think it makes a lot more
>>> sense for the Link header registry to be mutually exclusive
>>> with the HTML and Atom registry, instead of attempting to
>>> merge them all together.
>>
>> You're the first person to suggest that. I think we can get
>> to a place where there's alignment between the specs without
>> abusing the semantics of existing relations. It's certainly
>> worth trying...
>
> It seems like a lot of effort just to (re-)define all the link relations in
> a format-agnostic way without being overly vague. It is probably even more
> work to convince everybody (especially the HTML WG) to agree to the result.

I don't disagree, but it is worth a try.


> I think it would be nice if the same link relation identifiers meant the
> same thing in Atom as they do in HTML. However, for most of the existing
> registrations, I don't see the advantage to also making them available in
> the HTTP message header.

The current HTML registrations are grandfathered in, so that's to be  
expected; many of them are there more to prevent future collisions  
than they are to provide anything new.

[...]
> 1. There is too much flexibility in the syntax of the "rel" parameter. For
> example, the following all mean the same thing:
>      rel=edit
>      rel="edit"
>      rel="\e\d\i\t"
>      rel="http://www.iana.org/assignments/link-relations.html#edit"
>      ....
> If you want to be able to catch all variations, then you have to write a
> pretty nasty regular expression.

What leads you to believe that "\d" is the same as "d" in a URI  
reference? Regardless, regex is often the wrong tool for parsing.
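
To illustrate parsing without a regular expression, here is a minimal
sketch of a hand-written scanner for a Link header value. It assumes the
RFC 2616 quoted-string rule, under which a backslash quotes the next
character; the function name parse_link_header is hypothetical and does not
come from the draft. Whether the unescaped value should then be treated as
the same URI reference is the question raised above, and the sketch does
not settle it.

```python
def parse_link_header(value):
    """Split a Link header value into (target, params) pairs.

    Hypothetical helper for illustration; error handling is minimal.
    """
    links = []
    i, n = 0, len(value)
    while i < n:
        # Skip separators and whitespace between link-values.
        if value[i] in ", \t":
            i += 1
            continue
        if value[i] != "<":
            raise ValueError("expected '<' at offset %d" % i)
        end = value.find(">", i)
        if end == -1:
            raise ValueError("unterminated URI reference")
        target = value[i + 1:end]
        i = end + 1
        params = {}
        # Read ';'-separated parameters until the next top-level comma.
        while i < n and value[i] != ",":
            if value[i] in "; \t":
                i += 1
                continue
            start = i
            while i < n and value[i] not in "=;,":
                i += 1
            name = value[start:i].strip().lower()
            val = ""
            if i < n and value[i] == "=":
                i += 1
                if i < n and value[i] == '"':
                    # quoted-string: a backslash quotes the next character.
                    i += 1
                    chars = []
                    while i < n and value[i] != '"':
                        if value[i] == "\\" and i + 1 < n:
                            i += 1
                        chars.append(value[i])
                        i += 1
                    if i >= n:
                        raise ValueError("unterminated quoted-string")
                    i += 1  # step over the closing quote
                    val = "".join(chars)
                else:
                    # Unquoted token: runs up to the next ';' or ','.
                    start = i
                    while i < n and value[i] not in ";,":
                        i += 1
                    val = value[start:i].strip()
            params[name] = val
        links.append((target, params))
    return links
```

With this sketch, rel=edit, rel="edit" and rel="\e\d\i\t" all produce the
parameter value "edit" at the header-syntax level, and a comma inside a
quoted title does not split the link.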


> 2. The Link header mixes unrelated information into the same header field.
> Consequently, in order to process specific types of links, you have to parse
> the Link header field into parts, process the parts that you are interested
> in, and put it all back together.

That's true of just about any delimited text format.
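
As a rough sketch of that split, process and rejoin cycle, the following
Python drops every link with a given relation from a Link header value and
reassembles the rest. The helper names (split_top_level, unquote,
drop_links) are hypothetical, and the sketch assumes that "rel" may carry
a space-separated list of relation types and that a link is dropped when
any of them matches.

```python
def split_top_level(value, sep):
    """Split on sep, ignoring separators inside <...> or "..."."""
    parts, current = [], []
    in_quotes = in_angle = False
    i = 0
    while i < len(value):
        ch = value[i]
        if in_quotes:
            if ch == "\\" and i + 1 < len(value):
                # Keep the backslash and the character it quotes together.
                current.append(ch)
                i += 1
                ch = value[i]
            elif ch == '"':
                in_quotes = False
        elif in_angle:
            if ch == ">":
                in_angle = False
        elif ch == '"':
            in_quotes = True
        elif ch == "<":
            in_angle = True
        elif ch == sep:
            piece = "".join(current).strip()
            if piece:
                parts.append(piece)
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    piece = "".join(current).strip()
    if piece:
        parts.append(piece)
    return parts


def unquote(token):
    """Strip surrounding quotes and resolve backslash-quoted characters."""
    token = token.strip()
    if len(token) >= 2 and token[0] == token[-1] == '"':
        out, i = [], 1
        while i < len(token) - 1:
            if token[i] == "\\" and i + 1 < len(token) - 1:
                i += 1
            out.append(token[i])
            i += 1
        return "".join(out)
    return token


def drop_links(value, rel):
    """Return the header value without links whose rel includes rel."""
    kept = []
    for link in split_top_level(value, ","):
        rels = set()
        # The first piece is the <URI-Reference>; the rest are parameters.
        for param in split_top_level(link, ";")[1:]:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "rel":
                rels.update(unquote(raw).split())
        if rel not in rels:
            kept.append(link)
    return ", ".join(kept)
```

The surviving link-values are passed through unchanged, so quoting and
parameter order are preserved in the rebuilt header.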


> 3. The "rev" mechanism makes processing unnecessarily difficult. You have to
> be careful to note whenever rev=A means the same thing as rel=B when you are
> attempting to process the header.

There seems to be pretty strong support for dropping 'rev', so this  
isn't a concern.


> I think a better alternative to a single "Link" header is to define a
> standard for multiple Link-like headers:
>
> [Relation]-Links: #(URI-Reference LWS *(; param=value LWS))
>
> For example, an "edit" link would be:
>
> Edit-Links: http://foo.org
>
> This could be done by changing the registration rules for HTTP headers so
> that header fields with a "-Links" suffix must have the above syntax, with
> the definitions of the "media", "type", and "title" parameters fixed to be
> the same as in HTML 4 (or 5) and Atom 1.0. Each link header
> would have to define the processing rules for when multiple links are
> provided, and applications must be prepared to handle multiple links of the
> same type, even when they are not expected (that is why I chose "-Links"
> instead of "-Link").
>
> Try to write a mod_headers rule or Python WSGI middleware that filters out
> all the links with a particular type. Using the "-Links" header syntax, it
> is just "del environ[HTTP_RELATION_NAME_LINKS]" in Python and "unset
> RELATION-NAME-LINKS" in mod_headers. The Link header version requires some
> tricky parsing in Python. I think it is actually impossible to process the
> Link header correctly using Apache's mod_headers.
>
> I think the "-Links" header idea allows for uniform syntax (like the Link
> header) while still being extremely easy to process.


This approach didn't work terribly well for entity headers, and would  
further clog up the header registry. It would also require people to  
register their links in at least two places, and coordinate them over  
time.

Changing the registration procedures is also not a small thing, since
they already have IETF consensus. Proposals similar to the ones being
put forward here (e.g., com.foo.link:) have been made for media
types, and have been shot down quite firmly.

Cheers,

--
Mark Nottingham     http://www.mnot.net/
Received on Tuesday, 29 April 2008 17:23:40 UTC