- From: Leigh Dodds <leigh.dodds@talis.com>
- Date: Fri, 5 Nov 2010 10:03:43 +0000
- To: William Waites <ww@styx.org>
- Cc: nathan@webr3.org, Harry Halpin <hhalpin@ibiblio.org>, Ian Davis <me@iandavis.com>, public-lod@w3.org, Doug Schepers <schepers@w3.org>
Hi,

On 5 November 2010 09:54, William Waites <ww@styx.org> wrote:
> On Fri, Nov 05, 2010 at 09:34:43AM +0000, Leigh Dodds wrote:
>>
>> Are you suggesting that Linked Data crawlers could/should look at the
>> status code and use that to infer new statements about the resources
>> returned? If so, I think that's the first time I've seen that
>> mentioned, and am curious as to why someone would do it. Surely all of
>> the useful information is in the data itself.
>
> Provenance and debugging. It would be quite possible to
> record the fact that this set of triples, G, were obtained
> by dereferencing this uri N, at a certain time, from a
> certain place, with a request that looked like this and a
> response that had these headers and response code. The
> class of information that is kept for [0]. If N appeared
> in G, that could lead directly to inferences involving the
> provenance information. If later reasoning is concerned at
> all with the trustworthiness or up-to-dateness of the
> data it could look at this as well.

Yes, I've done something similar to that in the past when I added
support for the ScutterVocab [1] to my crawler.

It was the suggestion of inferring information directly from 200/303
that I was most curious about. I've argued for inferring data from 301
in the past [2], but wasn't sure of the merit of introducing data based
on the other interactions.

> Keeping this quantity of information around might quickly
> turn out to be too data-intensive to be practical, but
> that's more of an engineering question. I think it does
> make some sense to do this in principle at least.

That's what I found when crawling the BBC pages. Huge amounts of data
and overhead in storing it. Capturing just enough to gather statistics
on the crawl was sufficient.

Cheers,

L.

[1]. http://wiki.foaf-project.org/w/ScutterVocab
[2]. http://www.ldodds.com/blog/2007/03/the-semantics-of-301-moved-permanently/

--
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.dodds@talis.com
http://www.talis.com
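
To make the trade-off in the exchange above concrete, here is a minimal Python sketch of the two options: keeping a full provenance record per dereference, versus reducing those records to per-status counts for crawl statistics. The `FetchRecord` structure, the `summarise` helper, and the example.org URIs are all hypothetical and for illustration only; this is not the ScutterVocab model and not the crawler mentioned in the message.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FetchRecord:
    """Provenance for one dereference (hypothetical structure)."""
    uri: str                      # the URI N that was dereferenced
    status: int                   # HTTP response code, e.g. 200, 301, 303
    fetched_at: datetime          # when the request was made
    headers: dict = field(default_factory=dict)  # selected response headers
    triple_count: int = 0         # size of the graph G that came back


def summarise(records):
    """Reduce full records to per-status counts (the cheap option)."""
    return Counter(r.status for r in records)


# Illustrative data only; no network requests are made here.
when = datetime(2010, 11, 5, 10, 0, tzinfo=timezone.utc)
records = [
    FetchRecord("http://example.org/id/a", 303, when,
                {"Location": "http://example.org/doc/a"}),
    FetchRecord("http://example.org/doc/a", 200, when,
                {"Content-Type": "application/rdf+xml"}, triple_count=12),
    FetchRecord("http://example.org/old/b", 301, when,
                {"Location": "http://example.org/id/b"}),
]

stats = summarise(records)
for status, count in sorted(stats.items()):
    print(status, count)
```

The full records grow with every request and every header kept, while the summary stays a handful of integers however large the crawl gets.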
Received on Friday, 5 November 2010 10:04:19 UTC