Re: Last Call: draft-nottingham-http-link-header (Web Linking) to Proposed Standard

On Thu, 17 Sep 2009, Mark Nottingham wrote:
> On 16/09/2009, at 5:44 PM, Ian Hickson wrote:
> > 
> > > > Given that HTML and Atom only allow one of each attribute, I think 
> > > > we're far less likely to have bugs if we say that all but the 
> > > > first occurrence of each attribute are just ignored.
> > > 
> > > That's an artefact of their syntax (SGML and XML-based, 
> > > respectively); do you expect people to use the same parsers?
> > 
> > I expect implementations to use the same internal APIs, and those APIs 
> > are designed with the expectation of a single value for each 
> > attribute. For example, look for nsContentSink::ProcessLink() in the 
> > Mozilla codebase.
> You're proving my point; Mozilla has nsContentSink::ProcessLinkHeader, 
> which calls nsStyleLinkElement::ParseLinkTypes() (as does 
> ProcessLink()).
> That code does happen to only allow one rel value, but that's a bug, as 
> the current specification of the Link header in RFC2068 and earlier 
> clearly allows multiple rel attributes, and fixing it is a matter of a 
> line or two.
> Nevertheless -- how do other people feel about restricting to one rel 
> attribute in the Link header? Looking back at 2068, the BNF implicitly 
> allows multiple rel values, but it isn't explicitly addressed in the 
> prose, so Ian does have a point -- you could see it either way.

I'm not talking about one rel value, I'm talking about one attribute, of 
any of the attributes. One "hreflang", one "media", one "type", etc.
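
To make the "first occurrence wins" rule concrete, here is a minimal sketch of a parser for a single link-value that keeps the first occurrence of every parameter and ignores later duplicates. The function name and the regular expressions are my own assumptions for illustration; this is not the grammar from the draft and not Mozilla's code.

```python
import re

# One ";name=value" parameter; the value is a quoted-string or a bare token.
# (Simplified pattern, assumed for illustration; not the draft's ABNF.)
_PARAM = re.compile(r';\s*([^=;\s]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^;,\s]*)')


def parse_link_value(link_value):
    """Return (target, params) for one link-value.

    Only the first occurrence of each parameter name is kept.
    """
    match = re.match(r'\s*<([^>]*)>', link_value)
    if match is None:
        raise ValueError("link-value must start with <URI-Reference>")
    params = {}
    for name, raw in _PARAM.findall(link_value[match.end():]):
        name = name.lower()
        if name in params:
            continue  # duplicate attribute: ignore all but the first
        if raw.startswith('"'):
            # Strip the surrounding quotes and undo backslash escapes.
            raw = re.sub(r'\\(.)', r'\1', raw[1:-1])
        params[name] = raw
    return match.group(1), params
```

With this, a header value such as `</style.css>; rel="stylesheet"; media="screen"; media="print"` yields a single "media" of "screen", which is the behaviour the single-valued internal APIs expect.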

> > > As far as how it works in UI -- present the most appropriate title 
> > > based upon the user's preferences.
> > 
> > Could you give a concrete example of a Link: header that uses multiple 
> > titles in this way and describe what you think the ideal UI for that 
> > example would be?
>    Link: </TheBook/chapter6>; rel="last"; title*=UTF-8'de'letztes%20Kapitel;
> title*=UTF-8'en'final%20chapter
> One example of a UI -- if it chooses to use the title instead of the URL 
> for displaying the link (e.g., in a toolbar dedicated to showing links) 
> would use the first title* if the user's configured language was German, 
> and the second if it were English.

I am skeptical that this will ever be implemented in widely deployed 
consumer software, but ok.
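
For reference, the selection described in the quoted example could look roughly like the sketch below. It assumes the title* values have already been pulled out of the header as raw charset'language'text strings; the function names and the caller-supplied fallback are hypothetical, and only exact language matches are considered.

```python
from urllib.parse import unquote


def decode_ext_value(ext_value):
    """Split charset'language'percent-encoded-text and decode the text."""
    charset, language, encoded = ext_value.split("'", 2)
    return language.lower(), unquote(encoded, encoding=charset)


def pick_title(star_values, preferred_language, fallback=None):
    """Return the title* text for the preferred language, else the fallback."""
    titles = {}
    for value in star_values:
        language, text = decode_ext_value(value)
        # Keep only the first title* seen for each language.
        titles.setdefault(language, text)
    return titles.get(preferred_language.lower(), fallback)
```

Given the two title* values from the example, a German-configured client would get "letztes Kapitel" and an English-configured one "final chapter"; anything else falls through to whatever default the caller passes in.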

> > [snip various answers saying that various things shouldn't be 
> > registered in the registry but that each registered link type should 
> > have a corresponding specification that defines how the link type 
> > works in all the situations that might support it]
> > 
> > I think this basically makes the registry worthless. At least for 
> > HTML5, there are several aspects that we need to have in a 
> > machine-readable fashion for each link type, including:
> > 
> > - whether the link type is allowed on <link>
> > - whether the link type is allowed on <a> and <area>
> > - whether the link type is a hyperlink or references an external resource
> > 
> > If the link registry isn't going to be providing this, then it's not 
> > really solving the problems for which HTML5 needs a registry.
> Just so I understand, a straw-man use case: if I wanted to define an 
> HTML5 extension that modified the page by fetching chunks, implemented 
> in JS (e.g., something functionally similar to 
> <>), I could register a relation 
> type for those links that included the fact that they're external 
> resources, so that browsers would automagically download them as well 
> when a user does "save as"?


> If that's in-scope, I guess it's nice-to-have, but it seems like a lot 
> of trouble. Won't most such extensions actually require browser changes, 
> thus reducing the value of having this information machine-readable? Do 
> you intend for this machine-readable information to be used at compile 
> time, or runtime?

A tool like 'wget' could automatically support things like <link 
rel=stylesheet>, <link rel=icon>, and whatever other such extensions 
browser vendors come up with, without having to ship new releases, if it 
could update its behaviour based on the registry.

> Also, what's the value of having the information about where a link is 
> allowed machine-readable? Won't it simply be ignored if it's not 
> understood and in a valid place?

An HTTP and HTML validator could automatically keep up to date with what 
keywords it should allow on <link>, <a>, <area>, and in Link: headers, if 
the registry included information about where such keywords were allowed.
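
As a rough sketch of what I mean, suppose the registry exposed the three facts listed earlier for each link type. The record layout, the entries, and the function names below are all made up for illustration; a real tool would load the data from the registry instead of hard-coding it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkType:
    on_link: bool            # allowed on <link>
    on_a_and_area: bool      # allowed on <a> and <area>
    external_resource: bool  # False means the type is a hyperlink


# Illustrative entries only (assumed values, not taken from any registry).
REGISTRY = {
    "stylesheet": LinkType(True, False, True),
    "icon": LinkType(True, False, True),
    "alternate": LinkType(True, True, False),
}


def is_allowed(rel, element):
    """Validator check: may this keyword appear on this element?"""
    entry = REGISTRY.get(rel.lower())
    if entry is None:
        return False
    if element == "link":
        return entry.on_link
    if element in ("a", "area"):
        return entry.on_a_and_area
    return False


def resources_to_fetch(links):
    """wget-like check: which (rel, href) pairs point at external resources?"""
    return [href for rel, href in links
            if (entry := REGISTRY.get(rel.lower())) and entry.external_resource]
```

Both tools then pick up new link types by refreshing the data, with no new release needed.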

> > > > Why should HTML5 define how rel=stylesheet of a CSS file applies 
> > > > to an XML document? That doesn't sound right at all.
> > > 
> > > My understanding was that HTML5 is not only defining the HTML 
> > > syntax, but also the model for what a "web browser" is and does; if 
> > > that's the case, it's one place to put it.
> > 
> > That is not the case. HTML5 defines HTML, and how all classes of HTML 
> > processors, whether search engines, editors, or anything else, are to 
> > process HTML.
> > 
> > It does not define how an XML+CSS processor is to handle an HTTP 
> > header.
> OK. The point is that some XML+CSS spec needs to define this behaviour.

I don't see why an XML spec or a CSS spec or even an XML+CSS spec would 
define the behaviour of an HTTP header. Surely the spec for the HTTP 
header would be the one to define the behaviour of the HTTP header.

> > > > Regardless of who manages the registry, I would like to request 
> > > > that the registration mechanism be made significantly simpler than 
> > > > the one described in the spec.
> > > 
> > > Do you have concrete suggestions for the IANA process to be used?
> > 
> > I would suggest that the IANA host a wiki that anyone can edit.
> The point of having a registry is to act as a gateway, to assure 
> appropriate review and coordination. Allowing anyone to edit the 
> contents removes these benefits; people wanting to mint new relations 
> without coordination or even review can do so by using extension 
> relations.

People will and do mint keywords without coordinating with us or asking 
for our review.

Our choice is just whether we want them to register these keywords first, 
or whether we want them to not register them first.

Whether they register them will depend on whether we make it easy or not.

I'm not making any judgements about whether this behaviour is desirable, 
good, bad, or the right way to do things. It's just how things are. 
Ignoring it is just going to make the registry irrelevant in practice.

Also, it's worth noting that the HTML5 RelExtensions registry today is a 
wiki. That registry allows people to register arbitrary keywords. If we 
want that registry to be a subset of the registry you are defining, then 
it needs to be as easy (or easier) to register things in the registry you 
are defining as in the RelExtensions wiki. Otherwise, people will register 
keywords for HTML's rel="" attribute, since doing so is trivial, but will 
not bother to register them in the "higher level" registry, because doing 
so will be more work than they want to do.

If this happens, then the goal of preventing clashes between HTML and 
Atom won't have been met (at least, not by the registry with the more 
onerous registration requirements).

> > > BTW, I was wandering through 
> > > <> and found 
> > > "sidebar." Why use a link relation type here when the target 
> > > attribute is already available?
> > 
> > The target="" attribute would create a new window instead of 
> > gracefully falling back to using the current window like rel="".
> Given that the intent of a sidebar is to open it in a different pane 
> without replacing the current content, is that a bad thing?

Apparently. A number of UAs tried using target="", and all gave up.

On Mon, 21 Sep 2009, Mark Nottingham wrote:
> I've just added:
> The "rel" parameter MUST NOT appear more than once in a given 
> link-value; occurrences after the first MUST be ignored by parsers.
> and adjusted the examples to suit. Does that address your concern?

I couldn't find the updated I-D on; is this the version 
with this fix?:

In general, if the attributes can each only be listed once, and the spec 
says how to handle (e.g. ignore) duplicates, then my concern is addressed.

> I've just changed the 'title'-related text to:
> The "title" parameter, when present, is used to label the destination of 
> a link such that it can be used as a human-readable identifier (e.g. a 
> menu entry). The "title" parameter MUST NOT appear more than once in a 
> given link-value; occurrences after the first MUST be ignored by 
> parsers.
> The "title*" parameter MAY be used to encode this label in a different 
> character set, and/or contain language information as per <xref 
> target="RFC2231"/>. When using the enc2231-string syntax, producers MUST 
> NOT use a charset value other than 'ISO-8859-1' or 'UTF-8'. The "title*" 
> parameter MAY appear more than once in a given link-value, but each 
> occurrence MUST indicate a different language; occurrences after the 
> first for a given language MUST be ignored by parsers.
> When presenting links to users, agents SHOULD use the most appropriate 
> "title*" value, according to user preferences. If an appropriate 
> "title*" value cannot be found, the "title" parameter's value, if 
> available, can be used.
> Does this work?

Seems reasonable, though I am still skeptical as to the use of the title* 
feature in practice. It seems better to me to just have one title 
attribute, in one language, and to upgrade HTTP to support UTF-8 in 

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
                          U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Thursday, 24 September 2009 09:27:33 UTC