Re: link test suite

On Monday 11 February 2008, olivier Thereaux wrote:
> Hi Ville,
>
> On Feb 9, 2008, at 21:39 , Ville Skyttä wrote:
> > I have always thought base/@href as something that does not need to be
> > dereferenceable.
>
> That's a good point, and thanks for the example - it is convincing.
> I basically took all the %URI attributes in the HTML 4.01 DTD and
> tested for them, but indeed, maybe base href="" should not be checked
> directly.

I think there are a few more of that kind, such as codebase for applet and
object.

Some other %URI attributes that need at least some thought:

Form actions: may require some values submitted from the form's inputs, and 
may not be safe to invoke anyway, at least not when method="post".

Usemap for img, object and input: URI in HTML and XHTML 1.0 but IDREF in XHTML 
1.1.
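
To make the usemap difference concrete, here is a minimal sketch in Python
(not checklink's Perl).  The function name usemap_target and the xhtml11 flag
are hypothetical, chosen only for illustration; how a checker would detect
the document type is left out.

```python
from urllib.parse import urlsplit

def usemap_target(value, xhtml11=False):
    """Return (document_reference, map_id) for a usemap attribute value.

    Hypothetical helper: xhtml11=True means the value is read as an IDREF,
    otherwise it is read as a URI whose fragment names the map.
    """
    if xhtml11:
        # IDREF: the whole value is an ID in the current document.
        return ("", value)
    parts = urlsplit(value)
    # URI: everything before '#' is the document, the fragment is the map.
    return (parts._replace(fragment="").geturl(), parts.fragment)

print(usemap_target("#map1"))               # URI reading
print(usemap_target("map1", xhtml11=True))  # IDREF reading
```

Both calls end up naming the same map in the same document, but only if the
checker knows which reading applies before it tries to resolve the value.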

> That said I wonder if we could somewhat have a UI stating which Base
> URI we are using for each document, depending on whether the checker
> found Content-Location headers, or <base> element ?

Ouch ;).  I just educated myself over the weekend and found out how terribly
convenient $response->base() is (see the HTTP::Response documentation), and I
was thinking of dropping our own <base> handling in checklink altogether in
favour of it later.  If we did that, I don't know whether we could still
sanely figure out where the base came from.  But I agree the info would be
useful.
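
For what it's worth, keeping track of where the base came from could look
roughly like the sketch below.  It is written in Python rather than Perl and
does not use LWP; the function pick_base, its arguments and the source
labels are all hypothetical.  The lookup order (embedded <base>, then
Content-Base or Content-Location, then the request URI) follows my reading
of the HTTP::Response documentation and should be checked against it.

```python
from urllib.parse import urljoin

def pick_base(request_url, headers, embedded_base=None):
    """Return (base_uri, source_label) for a fetched document.

    Hypothetical helper.  'headers' is a plain dict with exact-case keys,
    which is a simplification of real header handling.
    """
    if embedded_base:
        # A <base href="..."> found in the document wins.
        return urljoin(request_url, embedded_base), "<base> element"
    for name in ("Content-Base", "Content-Location"):
        value = headers.get(name)
        if value:
            # Header values may be relative, so absolutize them.
            return urljoin(request_url, value), name + " header"
    # Fall back to the URI that was actually requested.
    return request_url, "request URI"

base, source = pick_base(
    "http://example.org/a/b.html",          # hypothetical URL
    {"Content-Location": "/c/d.html"},
)
print(base, "from", source)
```

The second return value is what a UI could show next to each document.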

BTW, $response->base() parses <base> tags only for text/html at the moment, 
but I already have a patch queued for submission upstream at 
http://scop.fedorapeople.org/patches/lwp5/headparser.patch which makes it do 
the same for application/xhtml+xml and application/vnd.wap.xhtml+xml docs.


A bit off topic: regarding codebase for applets and objects, I implemented
support for it in checklink over the weekend.  However, I find the docs and
the implementations somewhat mismatched.  Say the document is at
http://.../foo/baz.html and contains <object codebase="bar" data="quux">
(note no trailing slash in "bar").  IMO the object's data URL should resolve
to http://.../foo/quux, because codebase is a "base URI":

1) bar relative to http://.../foo/baz.html: http://.../foo/bar
2) quux relative to http://.../foo/bar: http://.../foo/quux

However, browsers seem to always treat codebase as a directory (i.e. as if it
had a trailing slash), resulting in http://.../foo/bar/quux.  Thoughts?
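
The two readings can be compared with a short sketch.  It uses Python's
urllib.parse.urljoin rather than checklink's Perl, and example.org stands in
for the host elided above; the function names are hypothetical as well.

```python
from urllib.parse import urljoin

# Hypothetical host standing in for the elided one.
DOC = "http://example.org/foo/baz.html"

def resolve_strict(doc, codebase, data):
    """Treat codebase as a plain base URI: two ordinary resolutions."""
    base = urljoin(doc, codebase)   # step 1: codebase relative to document
    return urljoin(base, data)      # step 2: data relative to that base

def resolve_as_directory(doc, codebase, data):
    """Treat codebase as a directory, as browsers appear to do."""
    if not codebase.endswith("/"):
        codebase += "/"
    return urljoin(urljoin(doc, codebase), data)

print(resolve_strict(DOC, "bar", "quux"))
# http://example.org/foo/quux
print(resolve_as_directory(DOC, "bar", "quux"))
# http://example.org/foo/bar/quux
```

With a trailing slash in codebase ("bar/") both functions agree, so the
question only matters for values written without one.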

Received on Monday, 11 February 2008 00:32:31 UTC