Re: I-D ACTION:draft-nottingham-http-link-header-01.txt from Phil Archer on 2008-04-29 (ietf-http-wg@w3.org from April to June 2008)

From: Phil Archer <parcher@icra.org>
Date: Tue, 29 Apr 2008 16:03:00 +0100
To: Brian Smith <brian@briansmith.org>
CC: 'Mark Nottingham' <mnot@mnot.net>, 'HTTP Working Group' <ietf-http-wg@w3.org>
Message-ID: <481738A4.4080109@icra.org>
Brian Smith wrote:
[..]
> 
> The example in the Link header proposal was: 
> 
>     Link: </foo>; rel="http://example.com/profile1/foo"
> 
> How would we convert this to a "foo-Link"? How about just dropping the
> "http://" as noise, and then replacing all the characters that are illegal
> in HTTP header field names with "-"?:
> 
> example.com-profile1-foo: /foo
> 
> I think that this is a nice compromise between ease of processing and
> readability, while making collisions extremely unlikely. The worldwide Java
> programming community has used a very similar naming convention and it has
> worked out very well over the last ten years.

I can see that that would work. Would IANA-registered links therefore be 
as simple as:

stylesheet: /styles.css  ?

This has the obvious attraction of simplicity, but I have a nagging fear 
that if it's simple to get right, it is all the more simple to (perhaps 
deliberately) get wrong, and we end up with meaningless headers. Then 
again, mod_headers makes this perfectly possible now, so maybe this isn't 
an issue.

Specifying something with the full IANA namespace would become

www.iana.org-assignments-link-relations.html-stylesheet: /styles.css

Hmmm...
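
As a rough illustration of the conversion Brian describes, here is a
minimal Python sketch. The function name rel_to_header_name is
hypothetical, and the set of characters it keeps (ASCII letters, digits,
"." and "-") is my assumption: it is narrower than what HTTP allows in
field names, and was picked only so that the output matches the examples
in this thread.

```python
import re


def rel_to_header_name(rel_uri: str) -> str:
    """Turn a link relation URI into a candidate header field name."""
    # Drop the scheme ("http://" or "https://") as noise.
    name = re.sub(r"^https?://", "", rel_uri)
    # Assumption: keep only letters, digits, "." and "-";
    # every other character becomes "-".
    return re.sub(r"[^A-Za-z0-9.-]", "-", name)


profile_name = rel_to_header_name("http://example.com/profile1/foo")
iana_name = rel_to_header_name(
    "http://www.iana.org/assignments/link-relations.html#stylesheet"
)
```

Under those assumptions the second call reproduces the long IANA-based
name above, which shows how unwieldy the fully qualified form becomes.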

Cheers

Phil.

>>
>> Brian Smith wrote:
>>> [CC'd to ietf-http-wg. I think it is better to continue the conversation
>>> over there as most of my comments are not specific to Atom.]
>>>
>>> Mark Nottingham wrote:
>>>> The draft does not advocate removing links from Atom 
>>>> documents to put them in headers; rather, the common
>>>> use case is repeating them in headers, so that they
>>>> can be easily discovered and processed.
>>> With HTML, I've never seen the Link header used that way; it has always been
>>> used to add new links to the document (usually style sheets that vary
>>> depending on the UA).
>>>
>>> When processing Atom documents, we are usually more interested in
>>> atom:title, atom:content/@src (if any), and atom:content/@type. In AtomPub
>>> we also often want atom:link/@rel='edit' and atom:link/@rel='edit-media'.
>>> Since the Link header can only store the atom:link elements, we are almost
>>> always going to have to parse the document anyway. If parsing Atom (or HTML)
>>> is problematic for the software application then the application shouldn't
>>> have chosen to store everything in Atom (or HTML) documents. :)
>>>
>>>>> For all those reasons, I actually think it makes a lot more 
>>>>> sense for the Link header registry to be mutually exclusive
>>>>> with the HTML and Atom registry, instead of attempting to
>>>>> merge them all together.
>>>> You're the first person to suggest that. I think we can get 
>>>> to a place where there's alignment between the specs without
>>>> abusing the semantics of existing relations. It's certainly
>>>> worth trying...
>>> It seems like a lot of effort just to (re-)define all the link relations in
>>> a format-agnostic way without being overly vague. It is probably even more
>>> work to convince everybody (especially the HTML WG) to agree to the result.
>>> I think it would be nice if the same link relation identifiers meant the
>>> same thing in Atom as they do in HTML. However, for most of the existing
>>> registrations, I don't see the advantage to also making them available in
>>> the HTTP message header.
>>>
>>> Last Friday I implemented support for the Link header in a simple AtomPub
>>> application. Now I will take an even stronger stance: its use should not be
>>> encouraged at all. It is much simpler to process hyperlinks that use the
>>> "Relation: URI" syntax like Location and Content-Location than it is to
>>> process hyperlinks that use the Link header. For example, writing Python
>>> middleware or Apache mod_rewrite/mod_headers rules to filter/add/remove
>>> links is much harder using the Link header than when using the
>>> Location-header approach:
>>>
>>> 1. There is too much flexibility in the syntax of the "rel" parameter. For
>>> example, the following all mean the same thing:
>>>       rel=edit
>>>       rel="edit"
>>>       rel="\e\d\i\t"
>>>       rel="http://www.iana.org/assignments/link-relations.html#edit"
>>>       ....
>>> If you want to be able to catch all variations, then you have to write a
>>> pretty nasty regular expression.
>>>
>>> 2. The Link header mixes unrelated information into the same header field.
>>> Consequently, in order to process specific types of links, you have to parse
>>> the Link header field into parts, process the parts that you are interested
>>> in, and put it all back together.
>>>
>>> 3. The "rev" mechanism makes processing unnecessarily difficult. You have to
>>> be careful to note whenever rev=A means the same thing as rel=B when you are
>>> attempting to process the header.
>>>
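
To make point 1 concrete, here is a minimal Python sketch of what
normalising a single "rel" value involves. The function name
normalize_rel is hypothetical, and the sketch assumes the value has
already been extracted from the Link header and names exactly one
relation; it covers only the variations listed above.

```python
import re

# Prefix used for the fully qualified form quoted in the list above.
IANA_PREFIX = "http://www.iana.org/assignments/link-relations.html#"


def normalize_rel(raw: str) -> str:
    """Reduce one rel parameter value to a bare relation name."""
    value = raw.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        # quoted-string: strip the quotes, then undo backslash escapes
        value = re.sub(r"\\(.)", r"\1", value[1:-1])
    if value.startswith(IANA_PREFIX):
        # Fully qualified IANA form: keep only the fragment.
        value = value[len(IANA_PREFIX):]
    return value


variants = [
    'edit',
    '"edit"',
    '"\\e\\d\\i\\t"',
    '"http://www.iana.org/assignments/link-relations.html#edit"',
]
normalized = [normalize_rel(v) for v in variants]
```

All four variants come out as the same bare name, but only after the
parameter has been located and split out of the header, which is where
the real difficulty lies.
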
>>> I think a better alternative to a single "Link" header is to define a
>>> standard for multiple Link-like headers:
>>>
>>> [Relation]-Links: #(URI-Reference LWS *(; param=value LWS))
>>>
>>> For example, an "edit" link would be:
>>>
>>> Edit-Links: http://foo.org
>>>
>>> This could be done by changing the registration rules for HTTP headers so
>>> that header fields with a "-Links" suffix must have the above syntax, with
>>> the definitions of the "media", "type", and "title" parameters fixed to be
>>> the same as in HTML 4 (or 5) and Atom 1.0. Each link header would have to
>>> define the processing rules for when multiple links are provided, and
>>> applications must be prepared to handle multiple links of the same type,
>>> even when they are not expected (that is why I chose "-Links" instead of
>>> "-Link").
>>>
>>> Try to write a mod_headers rule or Python WSGI middleware that filters out
>>> all the links with a particular type. Using the "-Links" header syntax, it
>>> is just "del environ[HTTP_RELATION_NAME_LINKS]" in Python and "unset
>>> RELATION-NAME-LINKS" in mod_headers. The Link header version requires some
>>> tricky parsing in Python. I think it is actually impossible to process the
>>> Link header correctly using Apache's mod_headers.
>>>
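
For comparison, a minimal sketch of the "-Links" case as WSGI middleware.
The names drop_header and demo_app are hypothetical, and the sketch
assumes the links are response headers, so it filters the header list
passed to start_response rather than touching environ as in Brian's
one-liner.

```python
def drop_header(app, header_name):
    """Wrap a WSGI app so that every response header named
    header_name is removed (field names compare case-insensitively)."""
    target = header_name.lower()

    def middleware(environ, start_response):
        def filtered_start_response(status, headers, exc_info=None):
            kept = [(k, v) for (k, v) in headers if k.lower() != target]
            return start_response(status, kept, exc_info)

        return app(environ, filtered_start_response)

    return middleware


def demo_app(environ, start_response):
    # Hypothetical application emitting the "Edit-Links" example header.
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Edit-Links", "http://foo.org")],
    )
    return [b"ok"]


captured = {}


def fake_start_response(status, headers, exc_info=None):
    # Stand-in for the server: record what the middleware passes on.
    captured["status"] = status
    captured["headers"] = headers


body = drop_header(demo_app, "Edit-Links")({}, fake_start_response)
```

No parsing of the field value is needed here. The equivalent for a
single Link header would have to split the value into individual links,
inspect each "rel", and reassemble what remains.
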
>>> I think the "-Links" header idea allows for uniform syntax (like the Link
>>> header) while still being extremely easy to process.
>>>
>>> Thoughts?
>>>
>>> - Brian
>>>
>>>
>>>
>>>
>>
>>
> 
> 
> 

-- 
Phil Archer
Chief Technical Officer,
Family Online Safety Institute
w. http://www.fosi.org/people/philarcher/
Received on Tuesday, 29 April 2008 15:03:51 UTC