
Re: [CT] Using robots.txt to flag an adapting site

From: Phil Archer <parcher@icra.org>
Date: Sun, 30 Sep 2007 15:10:51 +0100
Message-ID: <46FFAE6B.2050808@icra.org>
To: public-bpwg-ct@w3.org, public-powderwg@w3.org, BPWG <member-bpwg@w3.org>
CC: Arun Ranganathan <arunranga@aol.com>

As with many things in the last week or so, I must apologise for a slow
response (blame lots of travel and all day meetings). Anyway...

As Jo noted, POWDER has just moved a couple of docs to first public 
working draft. With this done, draft versions of both our key tech 
documents are in the public domain and available for comment. We're 
trying to get the Grouping of Resources doc [1] to Last Call this week 
although next week (week commencing 8 Oct) is perhaps more likely. The 
target is to take comments up until the Tech Plenary. The main 
Description Resources doc [2] has only just become a public WD and will, 
obviously, need more work until it can go to LC. It has a section that 
is particularly relevant to the issues of link/rel headers and HTTP 
headers. Here's the state of play as I understand it:

1. The HTML 5 draft removes HTML 4's 'profile' - which allows you to 
specify a metadata profile for a doc. N.B. This does NOT allow you to 
specify a relationship type (although there seems to be a working 
assumption that it does). With Arun's help - and some internal contact 
within W3C - we're hopeful of making sure that the relationship type 
will be officially extensible. In practice, it seems that even now, if 
enough people use the same relationship type, it becomes a de facto 
standard. Not ideal - but rel="meta" is well understood without, I 
believe, being standardised anywhere.

2. The status of using the Link HTTP Response header is unclear. It was 
included in an earlier RFC but is not in the current one. There is 
active discussion within the W3C's HTTP WG over this issue and it seems 
that the draft they're working on now [3] is likely to re-instate the 
header formally. There's more detail on this in the relevant section of 
the DR doc [4].

So, from a BPWG perspective, linking to an alternative version shouldn't 
be restricted to HTML docs.
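
To illustrate the point about non-HTML content, here is a minimal Python
sketch of how a proxy might look for a handheld alternate in a Link
response header rather than in an HTML <link> element. The function
names, the example URLs and the simplified parsing are all assumptions
made for illustration, and the header's formal status is still unsettled
as noted in point 2.

```python
import re


def parse_link_header(value):
    """Split a Link header value into (target, params) pairs.

    Simplified on purpose: assumes no semicolons inside quoted
    parameter values and no '<' inside them either.
    """
    links = []
    # Split only on commas that are followed by the next "<target>".
    for part in re.split(r",\s*(?=<)", value.strip()):
        match = re.match(r"<([^>]*)>(.*)", part.strip())
        if not match:
            continue
        target, rest = match.groups()
        params = {}
        for param in rest.split(";"):
            if "=" in param:
                name, _, raw = param.partition("=")
                params[name.strip().lower()] = raw.strip().strip('"')
        links.append((target, params))
    return links


def handheld_alternate(value):
    """Return the first rel="alternate" target with media handheld, or None."""
    for target, params in parse_link_header(value):
        rels = params.get("rel", "").lower().split()
        media = [m.strip().lower() for m in params.get("media", "").split(",")]
        if "alternate" in rels and "handheld" in media:
            return target
    return None


# Hypothetical header as it might accompany an image response.
header = (
    '<http://example.org/m/logo.png>; rel="alternate"; media="handheld", '
    '<http://example.org/meta.rdf>; rel="meta"'
)
print(handheld_alternate(header))
```

Because the check reads only the HTTP response header, it applies equally
to an image or any other resource, not just to HTML.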

Robots.txt is mentioned in the POWDER charter. The essential aim is to 
eliminate the idea of well known locations - P3P and robots.txt have 
theirs but the idea is to obviate the need for any more. POWDER can 
handle this and we'll explain how in the Primer document that will soon 
be started. Essentially, a content provider will be able to say 'my 
robots.txt information is at <location1>, my p3p file is at <location2>' 
etc. This is an application of POWDER, not a specific feature. Even so, 
we'll look for any hooks we need to add to make it easier.

This thread raises the issue of contextual access which *is* a specific 
feature - e.g. the ability to react differently dependent on whether a 
particular UA string is present (one might posit other conditions, such 
as user's location, time of day etc.) This should be, but is not yet, 
included in one of our docs. We'll have a little discussion about where 
people think it should go - I can think of arguments for both cases but 
in general, I reckon the Grouping doc is probably the best place for it 
- which means you can expect to see this resolved within the next 6 days 
or so as we do our best to stick to the schedule.

As a result of all this, you will be able to declare, in a 
machine-processable way, that 'all resources on example.org are 
mobileOK. Those on example.org/foo/* are self-adapting and should not 
be further adapted; those on example.org/bar/* are mobileOK Pro if 
accessed using any of the following devices [List] and mobileOK Basic if 
accessed on any other.'
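
As a rough sketch of the kind of evaluation this implies, the Python
below matches a URL against grouping patterns and applies a condition on
the UA string. This is NOT POWDER syntax: the rule table, the device
names (ExamplePhone, SamplePDA) and the describe() function are all
hypothetical, and glob-style matching is just one assumed way of
expressing the groups.

```python
from fnmatch import fnmatchcase

# Hypothetical device list standing in for [List] in the example above.
LISTED_DEVICES = ("ExamplePhone", "SamplePDA")


def is_listed_device(user_agent):
    """True if the UA string mentions any of the listed devices."""
    return any(device in user_agent for device in LISTED_DEVICES)


def always(user_agent):
    """Condition for descriptions that apply whatever the context."""
    return True


# (URL pattern, contextual condition, description). All illustrative.
RULES = [
    ("example.org/*", always, "mobileOK"),
    ("example.org/foo/*", always, "self-adapting"),
    ("example.org/bar/*", is_listed_device, "mobileOK Pro"),
    ("example.org/bar/*", lambda ua: not is_listed_device(ua), "mobileOK Basic"),
]


def describe(url, user_agent=""):
    """Collect every description whose pattern and condition both match."""
    return [
        label
        for pattern, condition, label in RULES
        if fnmatchcase(url, pattern) and condition(user_agent)
    ]


print(describe("example.org/foo/page.html"))
print(describe("example.org/bar/page.html", "Mozilla/4.0 ExamplePhone/1.0"))
print(describe("example.org/bar/page.html", "SomeOtherBrowser/2.0"))
```

A transcoder consulting such descriptions would leave anything marked
self-adapting alone.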

Whether POWDER actually has any impact on the widely used and understood 
robots.txt remains to be seen. It has the potential to do so but only 
if POWDER as a whole becomes widespread.

Phil.
[1] http://www.w3.org/2007/powder/Group/powder-grouping/20070919.html
[2] http://www.w3.org/TR/powder-dr/
[3] http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/draft-lafon-rfc2616bis-03.html
[4] http://www.w3.org/TR/powder-dr/#assoc


Jo Rabin wrote:
> Again, "I don't disagree" though I prefer HTTP only mechanisms as they
> can be used on content that is not HTML, such as images. I think the
> mechanisms need to work equally well on images as on HTML. Consequently
> anything that focuses on features of HTML such as the DOCTYPE or link
> element is not as general a solution. Though I agree that they provide
> "contributory evidence".
> 
> Jo
> 
> 
> ---
> Jo Rabin
> mTLD (http://dotmobi.mobi)
> 
> mTLD Top Level Domain Limited is a private limited company incorporated
> and registered in the Republic of Ireland with registered number 398040
> and registered office at Arthur Cox Building, Earlsfort Terrace, Dublin
> 2.
> 
> 
>> -----Original Message-----
>> From: Sean Owen [mailto:srowen@google.com]
>> Sent: 29 September 2007 15:06
>> To: Jo Rabin
>> Cc: public-bpwg-ct@w3.org; public-powderwg@w3.org; BPWG; Rotan Hanrahan
>> Subject: Re: [CT] Using robots.txt to flag an adapting site
>>
>> On 9/29/07, Jo Rabin <jrabin@mtld.mobi> wrote:
>>> I "don't disagree" with use of the link header. However I am not
>>> sure it has as much flexibility as one would like. The semantics are
>>> also a little cloudy in my view - is it really appropriate to infer
>>> from the fact that there is link specifying an alternate with media
>>> handheld, that this version is not itself suitable for handheld? I
>>> might conceivably have versions for desktop, iPhone, series 60 and
>>> DDC, for instance.
>> It gets better -- really, the suggestion is to put the link to the
>> handheld version *in the handheld version.* The idea is that if a
>> transcoder is about to transcode a page talking about a handheld
>> alternate, it should merely get out of the way and redirect to that
>> target. Hence it becomes a means for a mobile page to say "hands off."
>>
>> An HTTP header is possible though the advantage of <link> is that it
>> can be authored into the page. Also it does have the desired effect on
>> GWT.
>>
>>
>>> A plea not to ignore the work of the POWDER group here who as I
>>> understand it have been chartered to replace robots.txt and who I
>>> know have a way of describing parts of site by URI matching. Various
>>> POWDER documents have been elevated recently, and I expect you'll
>>> find them a jolly good read.
>> I had also missed Rotan's point entirely about robots.txt being
>> site-wide mechanism, which is valuable. And now I agree with this
>> point, that this becomes close to a special case of POWDER, and a
>> lovely first application?
> 
> 
> 

-- 
Phil Archer
Chief Technical Officer,
Family Online Safety Institute
w. http://www.fosi.org/people/philarcher/

Register now for the first, annual Family Online Safety Institute
Conference and Exhibition, December 6th, 2007, Washington, DC.

Go to: http://www.fosi.org/conference2007/ today!
Received on Sunday, 30 September 2007 14:11:10 GMT
