robots.rdf

With respect to the various approaches for embedding or linking RDF data
from pages on the web, I was wondering whether the robots exclusion protocol
could be leveraged to make life easier for RDF-aware agents, in a way that
would be a lot less effort than going for something like full-blown P3P.
There are two ways I'm aware of the protocol being used at present -
either in a meta tag (e.g. <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">)
or in a robots.txt file in the root directory of the server. The former
couldn't really add much of value to the existing situation, but the latter
might have a lot of potential.

If robots.txt contained information specifically aimed at RDF agents, then
a lot of the ad hoc link following/meta tag crunching that these agents
might otherwise have to employ wouldn't be necessary. One approach that
springs to mind would be simply to provide a reference in robots.txt to an
RDF file containing site-wide information about the way in which metadata
has been deployed (or just provide robots.rdf alongside robots.txt and let
the agent find it - though I think it would be more appropriate to use the
more standard file to provide at least a hint), something along the lines
of:

User-agent: rdf-xml
Allow: /robots.rdf
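
Just to make that concrete, the robots.rdf itself might look something like
the sketch below - the ex: terms are purely made up for illustration (no
such vocabulary exists as far as I know), only the Dublin Core namespace is
real:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/site-metadata#">
  <rdf:Description rdf:about="http://www.example.org/">
    <dc:title>Example site</dc:title>
    <!-- hypothetical property naming the metadata scheme used in pages -->
    <ex:metadataScheme>Dublin Core in META tags</ex:metadataScheme>
    <!-- hypothetical pointer to a fuller site-wide metadata description -->
    <ex:siteDescription rdf:resource="http://www.example.org/meta/site.rdf"/>
  </rdf:Description>
</rdf:RDF>

An agent hitting the site for the first time would then only need two extra
fetches (robots.txt plus robots.rdf) to know how the per-page metadata is
meant to be interpreted.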

Though the primary gain of a system like this would be letting the agent
know what kind of metadata system(s) is in use on the site, site-wide
metadata specified in this manner could bring plenty of other benefits.
For example, it could contain (subject to a suitable vocabulary being
available) a list of namespace abbreviation mappings, which could be used
to systematically overload the use of meta tags until HTML 6.0 becomes
available.
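
To illustrate (the ex: prefix-mapping terms are again invented, though
DC.creator-style META tags are an established convention), a fragment
inside the robots.rdf above could carry a declaration like:

  <rdf:Description rdf:about="http://www.example.org/">
    <!-- hypothetical prefix-mapping vocabulary -->
    <ex:prefixMapping>
      <ex:Mapping ex:prefix="dc"
                  ex:namespace="http://purl.org/dc/elements/1.1/"/>
    </ex:prefixMapping>
  </rdf:Description>

so that a page-level

<META NAME="dc.creator" CONTENT="Danny Ayers">

could be expanded by the agent to the full Dublin Core creator property,
with no changes needed to the HTML itself.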

Cheers,
Danny.


---

Danny Ayers
<stuff> http://www.isacat.net </stuff>

Alternate email (2001):
danny666@virgilio.it
danny_ayers@yahoo.co.uk
