- From: Michael Brunnbauer <brunni@netestate.de>
- Date: Sat, 26 Jul 2014 00:31:19 +0200
- To: Kingsley Idehen <kidehen@openlinksw.com>
- Cc: public-lod@w3.org
- Message-ID: <20140725223119.GA31673@netestate.de>
Hello Kingsley,

On Fri, Jul 25, 2014 at 05:47:58PM -0400, Kingsley Idehen wrote:
> When you have a sense of the identity of an Agent and on behalf of whom it
> is operating, you can use RDF based Linked Data to construct and enforce
> usage policies.

<sarcasm>
Yes. Every "Agent" that does not use WebID-TLS supporting every possible RDF
serialization and every access ontology that comes to mind does not deserve
that name.
</sarcasm>

Seriously: It's funny that Charlie Stross - one of my favorite Science Fiction
authors - was involved in the creation of the robots exclusion standard. But
the "standard" is really a proprietary mess. Even basic things like
"Crawl-Delay" are extensions introduced and supported by some vendors. Many
current robots.txt libraries only check for allowed/forbidden and do not
support parsing/returning such options.

For starters, we need:

-Current extensions made official
-A means to exclude fragments of a [HTML] document from indexing
-A Noindex HTTP Header to selectively exclude content from indexing without
 bloating the robots.txt (there is an unofficial X-Robots-Tag header supported
 by Google and Bing)

The former two would alleviate problems with the "right to be forgotten".

And possibly something to distinguish occasional Agents from recursively
crawling bots. My current interpretation of robots.txt is that it forbids
every access not directly caused/mediated by a human.

Regards,

Michael Brunnbauer

-- 
++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89
++  E-Mail brunni@netestate.de
++  http://www.netestate.de/
++
++  Registered office: München, HRB Nr.142452 (Handelsregister B München)
++  VAT ID (USt-IdNr.) DE221033342
++  Managing directors: Michael Brunnbauer, Franz Brunnbauer
++  Authorized signatory: Dipl. Kfm. (Univ.) Markus Hendel
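
To illustrate the point about Crawl-delay and the X-Robots-Tag header, here is a rough Python sketch, not a definitive implementation. It assumes Python 3.6 or later, where the standard library's urllib.robotparser exposes crawl_delay(). The robots.txt content, the "ExampleBot" user agent, the example.org URLs and the helper names load_rules() and may_index() are all made up for illustration. The header check is deliberately simplified: it ignores the optional "botname:" prefix that X-Robots-Tag values may carry.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, made up for this illustration.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def load_rules(text):
    """Parse robots.txt text into a RobotFileParser without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def may_index(headers):
    """Return False if an X-Robots-Tag header carries 'noindex' or 'none'.

    Simplified: header names are matched case-insensitively, but a
    user-agent prefix such as 'googlebot: noindex' is not handled.
    """
    for name, value in headers.items():
        if name.lower() != "x-robots-tag":
            continue
        directives = {part.strip().lower() for part in value.split(",")}
        if directives & {"noindex", "none"}:
            return False
    return True


rules = load_rules(ROBOTS_TXT)

# The allowed/forbidden check that most libraries offer.
allowed = rules.can_fetch("ExampleBot", "http://example.org/private/report.html")

# The vendor extension: the delay in seconds, or None if it is not set.
delay = rules.crawl_delay("ExampleBot")

# The per-response opt-out via the unofficial header.
indexable = may_index({"X-Robots-Tag": "noindex, nofollow"})
```

A crawler built along these lines would wait for the reported delay between requests to the same host and would drop any response for which may_index() returns False, instead of relying on robots.txt alone.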
Received on Friday, 25 July 2014 22:31:46 UTC