Re: siteData-36: strawman from Sandro Hawke on 2003-03-01 (www-tag@w3.org from March 2003)

From: Sandro Hawke <sandro@w3.org>
Date: Sat, 01 Mar 2003 07:29:14 -0500
To: Tim Bray <tbray@textuality.com>
cc: Karl Dubost <karl@w3.org>, www-tag@w3.org
Message-Id: <200303011229.h21CTEH03685@wadimousa.hawke.org>

> > 3. Distributed Web site.
> >     Imagine now that the notion of Web sites is a collection of resource 
> > on a topic that people has agreed to put together.
> 
> Exactly.
> 
> >     Abuse. Exactly like the keywords system in meta.
> >     People will start to declare that they belong to a Web site to be 
> > automatically indexed in the system which put the resources together.
> 
> Good point.  So crawlers would probably require that the site resource 
> point back to the member resources...
> > 
> >     So if you have an URI which says we are all part of this Web site, 
> > Web site in this case meaning the 3rd option. What will be at the end of 
> > this URI ? A mechanism which says: "Hey yes, all this list of URIs 
> > belong to the same site."  ?
> 
> ... as you suggest.

Karl expresses concern that people will add themselves to websites; as
stated, that can be addressed by having a GET of the website URI return
a list (or, I hope, a pattern) of claimed URIs.
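
A minimal sketch of that check, under stated assumptions: the site URI
is assumed to yield a list of shell-style glob patterns, and the names
site_confirms and claimed are hypothetical, as are the example URIs.
Nothing in this thread fixes a format for the list, so this only
illustrates the "site must point back at the member" idea.

```python
from fnmatch import fnmatchcase


def site_confirms(member_uri, claimed_patterns):
    """Return True if any pattern published by the site covers member_uri.

    claimed_patterns stands in for whatever a GET of the site URI would
    return; glob syntax is an assumption, not part of any proposal.
    """
    return any(fnmatchcase(member_uri, pattern) for pattern in claimed_patterns)


# Hypothetical result of a GET on the site URI.
claimed = ["http://example.org/reports/*", "http://example.net/about"]

# A resource the site points back to is accepted as a member.
print(site_confirms("http://example.org/reports/2003/q1", claimed))

# A resource that merely declares itself a member is not.
print(site_confirms("http://spam.example.com/buy-now", claimed))
```

A crawler would run this only after a resource has declared its site, so
a self-declared member that the site does not list is simply ignored.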

But this opens the door to another kind of abuse, where the website
claims URIs whose owners don't want them claimed.  Someone visiting
those URIs would see the claim refuted, but part of the idea of site
metadata is that visitors shouldn't have to visit every resource in
the site.

I think what's needed is a rule like: "if a GET/HEAD on FOO says its
site is BAR, then BAR is authoritative about URIs starting with
'FOO'."   This still has a touch of the robots.txt hack in it, but I
haven't been able to think of anything better.
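
The rule can be sketched roughly as follows.  This is an illustration
only: the declarations mapping stands in for what GET/HEAD responses
on various FOO URIs would report, the function names and example URIs
are hypothetical, and picking the longest matching FOO when several
apply is an assumption the rule itself does not state.

```python
def authoritative_site(uri, declarations):
    """Return the site (BAR) that is authoritative for uri, or None.

    declarations maps a resource URI (FOO) to the site URI (BAR) that a
    GET/HEAD on FOO reported.  BAR is treated as authoritative for every
    URI that starts with FOO.  Longest-prefix tie-breaking is assumed.
    """
    matches = [foo for foo in declarations if uri.startswith(foo)]
    if not matches:
        return None
    return declarations[max(matches, key=len)]


def claim_is_honored(site, uri, declarations):
    """A site's claim over uri counts only if that site is authoritative."""
    return authoritative_site(uri, declarations) == site


# Hypothetical: a GET/HEAD on FOO said its site is BAR.
declarations = {
    "http://example.org/~alice/": "http://example.org/sites/alice",
}

# BAR may speak for anything under FOO without a visit to each resource.
print(claim_is_honored("http://example.org/sites/alice",
                       "http://example.org/~alice/photos/1", declarations))

# Some other site claiming the same URI is ignored.
print(claim_is_honored("http://evil.example.com/site",
                       "http://example.org/~alice/photos/1", declarations))
```

A URI with no matching declaration has no authoritative site here, so
any claim over it is rejected until some FOO above it names a site.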

    -- sandro

Received on Saturday, 1 March 2003 07:29:49 UTC