From: Benja Fallenstein <b.fallenstein@gmx.de>
Date: Wed, 10 Mar 2004 13:52:00 +0200
Message-ID: <404F0160.7030709@gmx.de>
To: Patrick Stickler <patrick.stickler@nokia.com>
Cc: ext Phil Dawes <pdawes@users.sourceforge.net>, www-rdf-interest@w3.org

Patrick Stickler wrote:
>>> (2) it violates the principle of URI opacity
>> Is this a real-world problem? robots.txt violates the principle of
>> URI opacity, but still adds lots of value to the web.
> And it is frequently faulted, and alternatives actively discussed.
> In fact, now that you mention it, I see URIQA as an ideal replacement
> for robots.txt in that one can request a description of the root
> web authority base URI, e.g. 'http://example.com', and receive a
> description of that site, which can define crawler policies in
> terms of RDF in a much more effective manner.

That would carry over one of the reasons why we need a replacement for
robots.txt in the first place: its notion of 'web site' is too coarse,
since it equates a site with a whole host. If somebody maintains a
website for some project at http://someuniversity/~name/projectname/,
that site should be able to publish e.g. robot exclusion information
without convincing the university's web server admins or purchasing a
domain name. See


The above proposes a Website: header containing an RDF URI. With URIQA,
you could do an MGET on a page to discover its site, then do an MGET on
that site URI to find out about its robots policy. But doing an MGET on
the root URI of the domain would repeat the flaw of robots.txt: it
assumes one site per domain.
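
To make the two-step lookup concrete, here is a rough sketch in Python. It
is only an illustration under stated assumptions: the property names
(ex:website, ex:robotsPolicy), the policy value, and the mget callable are
all hypothetical, and descriptions are modelled as plain dicts instead of
RDF. A real client would issue actual MGET requests in place of fake_mget.

```python
from typing import Callable, Dict, Optional

# A description is modelled as property -> value; real URIQA returns RDF.
Description = Dict[str, str]

# Hypothetical property names, not taken from any published vocabulary.
SITE_PROP = "ex:website"          # page -> URI of the site it belongs to
POLICY_PROP = "ex:robotsPolicy"   # site -> crawler policy


def robots_policy_for(page_uri: str,
                      mget: Callable[[str], Description]) -> Optional[str]:
    """MGET the page to discover its site, then MGET the site for its policy."""
    site_uri = mget(page_uri).get(SITE_PROP)
    if site_uri is None:
        return None  # the page does not declare a site
    return mget(site_uri).get(POLICY_PROP)


# Stand-in for the network: canned descriptions keyed by URI.
FAKE_DESCRIPTIONS: Dict[str, Description] = {
    "http://someuniversity/~name/projectname/index.html": {
        SITE_PROP: "http://someuniversity/~name/projectname/",
    },
    "http://someuniversity/~name/projectname/": {
        POLICY_PROP: "noindex",
    },
}


def fake_mget(uri: str) -> Description:
    """Return the canned description for uri, or an empty one."""
    return FAKE_DESCRIPTIONS.get(uri, {})
```

Note that the domain root is never consulted: the project site carries its
own policy even though it lives under somebody else's host.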

- Benja
Received on Wednesday, 10 March 2004 06:52:41 UTC