Re: Proposed issue: site metadata hook from Chris Lilley on 2003-02-19 (www-tag@w3.org from February 2003)

From: Chris Lilley <chris@w3.org>
Date: Wed, 19 Feb 2003 13:12:38 +0100
To: Patrick.Stickler@nokia.com
CC: www-tag@w3.org, timbl@w3.org
Message-ID: <81169010609.20030219131238@w3.org>

On Tuesday, February 18, 2003, 7:19:20 AM, Patrick wrote:

PSnc> It seems we are talking past each other.

It would seem so.

PSnc> I'm going to suggest that we both are in favor of the architecture
PSnc> *allowing* all users to be able to control their own personal web
PSnc> spaces, even when they do not own the server.

I can agree to that.

PSnc> But that the architecture itself does not mandate specific rights
PSnc> of control for all users against the wishes of the server owner.

PSnc> Thus, if the server owner wishes to allow user-specific control,
PSnc> the architecture should take that into consideration, and support
PSnc> that level of resolution.

PSnc> But the architecture should not permit users to circumvent the
PSnc> explicit wishes of the server owner.

PSnc> Yes?

That seems a good summary. I particularly liked 'explicit wishes' as
opposed to 'assumed wishes'. In other words, if the server owner wants
to deny user-specific control, they need to say so explicitly.

>> Let's consider an architecture where the corporation owns /
>> and accounting owns /corporate/accounting and marketing owns
>> /comm/pr
>> 
>> Let's assume that the corporation decides that it does not want /
>> crawled. Let's assume that marketing wants /comm/pr crawled.

PSnc> Then I would say too bad for /comm/pr. If the owner says "this
PSnc> server will not be crawled" then it shouldn't, no matter what
PSnc> any user says.

My point was that the corporation as a whole was expressing wishes
about a pruned tree (minus leaves or leaf subtrees that are delegated
elsewhere).

Currently it's not really possible to express the difference between
'this whole (sub)tree' and 'this whole (sub)tree up to the point where
the rules change'.
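
As a rough illustration of that distinction, here is a minimal Python
sketch. Nothing in it comes from an existing specification: the rule
table, the "subtree" and "pruned" scope labels, and the function names
are all hypothetical, chosen only to show how the two readings of a
rule on / could give different answers for /comm/pr.

```python
from typing import Dict, Optional, Tuple

# Hypothetical rule table: path prefix -> (crawl allowed?, scope).
# "subtree": the rule covers everything below the prefix, delegated or not.
# "pruned":  the rule covers the prefix only until a more specific rule
#            takes over (the tree minus delegated subtrees).
Rules = Dict[str, Tuple[bool, str]]

RULES: Rules = {
    "/": (False, "pruned"),         # corporation: do not crawl /
    "/comm/pr": (True, "subtree"),  # marketing: do crawl /comm/pr
}


def _covers(prefix: str, path: str) -> bool:
    """True if path is the prefix itself or lies below it."""
    if prefix == "/":
        return path.startswith("/")
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def may_crawl(path: str, rules: Rules = RULES) -> Optional[bool]:
    """Return the crawl decision for path, or None if no rule applies."""
    # Applicable prefixes, ordered from the root outwards.
    matches = sorted((p for p in rules if _covers(p, path)), key=len)
    if not matches:
        return None
    # The "subtree" rule closest to the root overrides anything below it.
    for prefix in matches:
        allowed, scope = rules[prefix]
        if scope == "subtree":
            return allowed
    # Only "pruned" rules apply: the most specific one decides.
    return rules[matches[-1]][0]
```

With the table above, /comm/pr/news.html may be crawled while
/index.html may not. Changing the scope of the rule on / from "pruned"
to "subtree" turns the first answer into a refusal, which is the
"no matter what any user says" reading.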


PSnc> HOWEVER, if the corporation is saying "only areas explicitly 
PSnc> specified to be crawled, by the users responsible for those
PSnc> areas, may be crawled" then that is something different.

Right.

PSnc> I understand (now a bit better) that what you are asking for
PSnc> is the architecture to be able to allow users to express their
PSnc> wishes over their own content, and for robots to take that
PSnc> information into account *IFF* the server owner permits it to.
PSnc> (it's the IFF I thought you were leaving out...)

Exactly.

PSnc> But that the present architecture is too coarse to allow for
PSnc> efficient management of user-specific wishes in that regard
PSnc> and thus needs to be refined.

PSnc> Right?

Correct.

PSnc> A specific question to help me determine that: If the server owner
PSnc> says "no crawlers at all on this server" and a tenant says "all my
PSnc> own content can be crawled", should the tenant's content be crawled?

See above regarding current poverty of expressiveness. The same
language has to do double duty: keeping crawlers off the rootward
areas only, and keeping them off all areas.

>> And you would do that how?

PSnc> By obtaining and inspecting the RDF description of the site,
PSnc> examining those properties that describe robot behavior and
PSnc> recursively obtaining and inspecting the RDF descriptions of
PSnc> whatever resources are relevant to answering the question,
PSnc> including the description about the particular user space, etc.

OK, so this now brings us on to the area of efficiency. See Apache
.htaccess files and CERN .meta directories for similar problems.

It is undesirable if, to find the metadata for an area n steps from
the root, I need to make n requests (in either direction, root towards
area or area towards root).
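
To make the cost concrete, here is a small Python sketch that lists the
locations a client would have to consult if metadata were attached to
every level of the path. The per-level scheme and the function name are
assumptions for illustration only; no such lookup protocol is defined
in this thread.

```python
from typing import List


def metadata_lookups(path: str) -> List[str]:
    """List the locations to consult for path, from the root towards the area."""
    segments = [s for s in path.split("/") if s]
    lookups = ["/"]
    for depth in range(1, len(segments) + 1):
        # One extra lookup for each step away from the root.
        lookups.append("/" + "/".join(segments[:depth]) + "/")
    return lookups
```

For an area such as /comm/pr/2003/ this yields one lookup per step plus
one for the root itself, and walking from the area towards the root is
the same list reversed, so neither direction saves any requests.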

>> PSnc> Why do we need anything more than the semantic web extensions
>> PSnc> to the present web architecture
>> 
>> If I knew clearly what those were then I might be able to answer you.
>> But at present there does not seem to be a list of them.

PSnc> They have been mentioned repeatedly in this very thread:

OK, those are proposals. They don't exist yet, and there is no clear
specification of them, so they are difficult to discuss.

PSnc> MGET {URI}     returns an RDF description of the resource denoted by the URI
PSnc> MPUT {URI}     adds statements to the knowledge base describing the resource
PSnc> MDELETE {URI}  removes statements from the knowledge base describing the resource
PSnc> MUPDATE {URI}  replaces/adds statements to the knowledge base describing the resource
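
Since there is no specification to point at, the following is only a
toy in-memory model of the four operations as they are described in the
list above, written in Python. Everything in it is an assumption: the
class and method names, the representation of a statement as a
(predicate, value) pair, and in particular the reading of
"replaces/adds" as "replace statements with the same predicate, add the
rest". It is not an HTTP implementation.

```python
from typing import Dict, Iterable, Set, Tuple

Statement = Tuple[str, str]  # hypothetical (predicate, value) pair


class KnowledgeBase:
    """Toy store of statements keyed by the URI of the resource described."""

    def __init__(self) -> None:
        self._statements: Dict[str, Set[Statement]] = {}

    def mget(self, uri: str) -> Set[Statement]:
        """Return a copy of the statements describing the resource."""
        return set(self._statements.get(uri, set()))

    def mput(self, uri: str, statements: Iterable[Statement]) -> None:
        """Add statements describing the resource."""
        self._statements.setdefault(uri, set()).update(statements)

    def mdelete(self, uri: str, statements: Iterable[Statement]) -> None:
        """Remove the given statements, if present."""
        self._statements.setdefault(uri, set()).difference_update(statements)

    def mupdate(self, uri: str, statements: Iterable[Statement]) -> None:
        """Replace statements that share a predicate; add the others (assumed)."""
        new = set(statements)
        replaced = {predicate for predicate, _ in new}
        kept = {s for s in self.mget(uri) if s[0] not in replaced}
        self._statements[uri] = kept | new
```

Even in this toy form the open questions show up quickly: what counts
as "the same statement" for MDELETE, and what MUPDATE replaces, both
had to be guessed.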



-- 
 Chris                            mailto:chris@w3.org
Received on Wednesday, 19 February 2003 08:48:30 UTC