RE: robots.rdf from Massimo Marchiori on 2002-01-10 (www-rdf-interest@w3.org from January 2002)

From: Massimo Marchiori <massimo@w3.org>
Date: Wed, 9 Jan 2002 22:08:13 -0500
To: www-rdf-interest@w3.org
Message-Id: <200201100308.WAA20215@tux.w3.org>
<quote>
> Incidentally, since you brought the P3P thingy in, the smart way would
> be instead to stick any RDF you want in its well-known location (that
> has been designed just to allow this), so browsers like IE6 etc
> will just munch the metadata with a single GET.

(Oh, fun, let's have an argument!)

The smart thing is to *not* use well-known locations, but to follow an age
old tradition: if you want to know about a web site, *read its homepage*.

It works for machines as well as for people. The WKL location hack may be
a justifiable hint in some contexts, but in general it's a bad thing. It is
not for W3C, the IETF or anyone to tell me what my URIs mean. I've paid
money for domain names in exchange for the ability to deploy URIs with
those names in the Web. I don't want to find out, perhaps years later,
that some WG have decided they know what http://danbri.example.com/p3p/ or
http://danbri.example.com/rdf/ are to be used for. I'm wary of a trend
towards WKRs because they encourage a view that says Working Groups can
set URI naming conventions.
</quote>


Dan, this is an argument that (fortunately for web architecture ;) doesn't
hold, because it starts from the hurried assumption/impression

"P3P's WKL" == "robots.txt location"

Instead, P3P's WKL works very differently, as it is part of a more
general framework for associating web objects with P3P policy metadata.
Without going too techie, the following simplified argument should explain
things better: P3P's WKL is one of the possible ways to attach P3P metadata.
It is an optional feature, which means that IF you put the (appropriately
namespaced) P3P information in P3P's WKL, THEN a user agent can get the info there.

Now on to the paradox: 
<paradox>
"if you want to know about a web site, *read its homepage*" is just a kind
of P3P WKL, where the WKL is the home page (!). 
</paradox>

Back to your email:
<quote>
> Note this also relates to the sitemap thread (in fact, that's been one
> of the possible applications we had in mind).
Indeed it does. 
Being able to find a manifest or overview page for a site,
w/ pointers to associated web services, rss feeds, data dumps, site map
file(s), privacy statements etc etc is a worthy goal. But I'm having
trouble understanding the value of inventing WKRs beyond the published
home page URIs for these sites. Metadata could be embedded in the XHTML,
available by content negotiation, or linked to from home page. Or all three...
</quote>

In fact, there are deep reasons (mostly in the 80/20 category) why the
home page WKL is not enough:

1) CODOMAIN INDEPENDENCE 
Every embedding is dependent on the target object type (<math>codomain</math>),
which can be problematic, as the codomain might not be seamlessly extensible:
<historical-precedent>
Embedding RDF in HTML
</historical-precedent>

This is even more crucial for the home page WKL, whose visual layout is usually too
critical to tamper with. The complementary way is to fix the codomain,
using your own format and, necessarily, your own path. By the "no free lunch" principle,
what you gain (fixed format) you lose somewhere else (fixed path), so to minimize
the price to pay it's better to pick a path that is seldom used in practice: this is what
P3P's WKL does (choosing the /w3c/p3p.xml path), which therefore makes it
independent of the codomain.
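
To make the fixed-path idea concrete, here is a minimal Python sketch, not taken
from the P3P spec: given any URI on a site, the metadata location is derived from
the host alone, regardless of what kind of object the URI names. Only the
/w3c/p3p.xml path comes from the text above; the function name wkl_for and the
sample URI are made up for illustration.

```python
from urllib.parse import urljoin

# Path mentioned above for P3P's WKL; everything else here is illustrative.
P3P_WKL_PATH = "/w3c/p3p.xml"


def wkl_for(url: str) -> str:
    """Return the fixed-path metadata URI on the same host as `url`.

    The result depends only on the scheme and host, not on the type or
    location of the object that `url` identifies.
    """
    # An absolute path replaces the whole path (and query) of the base URL.
    return urljoin(url, P3P_WKL_PATH)


if __name__ == "__main__":
    print(wkl_for("http://danbri.example.com/some/deep/page.html"))
```

An HTML page, an image and a data dump on the same host all map to the same
metadata URI, which is the sense in which the fixed path does not depend on
the codomain.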


2) INDIRECT EMBEDDING
A full *direct* embedding might not be possible, or might be computationally too
inefficient. So indirect embedding is usually used (e.g. HTML's link tag). Now consider
the efficiency of requesting a web object /foo and its associated metadata:

When WKL==home page embedding:
 GET /foo
 GET /
 GET metadata linked in /
 ==> 3 GETs

When WKL==fixed_path:
 GET /foo
 GET fixed_path
 ==> 2 GETs
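
The two request sequences above can be written down as a small Python sketch.
It does no network I/O; it only lists the GETs each strategy implies, so the
counts can be compared. The function names and the /meta.rdf link are
hypothetical, chosen only for this illustration.

```python
from typing import List


def gets_via_home_page(obj_path: str, metadata_link: str) -> List[str]:
    """GETs needed when the metadata is only linked from the home page.

    `metadata_link` stands for whatever URI the home page points at; it is
    not known until the home page itself has been fetched.
    """
    return [obj_path, "/", metadata_link]


def gets_via_fixed_path(obj_path: str, fixed_path: str = "/w3c/p3p.xml") -> List[str]:
    """GETs needed when the metadata lives at a path known in advance."""
    return [obj_path, fixed_path]


if __name__ == "__main__":
    home = gets_via_home_page("/foo", "/meta.rdf")  # /meta.rdf is made up
    fixed = gets_via_fixed_path("/foo")
    print(len(home), "GETs via home page:", home)
    print(len(fixed), "GETs via fixed path:", fixed)
```

The extra request in the first case is the indirection itself: the home page
has to be read before the metadata URI is known.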


Now, to close with the final:
<disclaimer>
To avoid a lengthy thread, note that nothing in the above says this is *the* smart way
to embed metadata: other smart ways are possible, each with its own
reasons and scope (which is why, e.g., P3P doesn't limit itself to the WKL... ;).
</disclaimer>

Related to what was just said:
<fun-homework>
Nobody, AFAIK, has tried to write up a doc/note with a rigorous high-level analysis of
the metadata association problem. Take the above and extend the analysis to
complete the picture as far as possible (partial hint: look at the P3P spec, hah ;)
</fun-homework>

Hope this email wasn't too dense (late bed time here!).
-M
Received on Wednesday, 9 January 2002 22:08:13 UTC