- From: Nathan <nathan@webr3.org>
- Date: Fri, 05 Nov 2010 10:05:36 +0000
- To: Leigh Dodds <leigh.dodds@talis.com>
- CC: Harry Halpin <hhalpin@ibiblio.org>, Ian Davis <me@iandavis.com>, public-lod@w3.org, Doug Schepers <schepers@w3.org>
Leigh Dodds wrote:
> Hi Nathan,
>
> On 4 November 2010 18:08, Nathan <nathan@webr3.org> wrote:
>> You see it's not about what we say, it's about what others say, and
>> if 10 huge corps analyse the web and spit out billions of triples
>> saying that anything 200 OK'd is a document, then at the end, when
>> we consider the RDF graph of triples, all we're going to see is one
>> statement saying something is a "nonInformationResource" and a
>> hundred others saying it's a document and describing what it's
>> about, together with its format and so on.
>
> Are you suggesting that Linked Data crawlers could/should look at the
> status code and use that to infer new statements about the resources
> returned? If so, I think that's the first time I've seen that
> mentioned, and am curious as to why someone would do it. Surely all of
> the useful information is in the data itself.

Not at all. I'm saying that if big-corp makes a /web crawler/ that
describes what documents are about and publishes RDF triples, then if
you use 200 OK, throughout the web you'll get (statements similar to)
the following asserted:

  </toucan> :primaryTopic dbpedia:Toucan ;
            a :Document .

Now move down the line a couple of years, reason over a triple dump of
the web-of-data, and you'll find the problem. The way to solve it is to
first strip out everything that's a :Document, so all the slash URIs
will be stripped, including </toucan>.

I'm also saying that 303 doesn't solve this half the time either,
because most HTTP clients black-box the process, so their process is:

  uri = "/toucan";
  doc = get( uri );
  makeStatements( uri , doc );

Again, same problem.

Best,

Nathan
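To make the black-box point concrete, here is a minimal sketch (Python,
standard library only) of the two client behaviours described above. The
makeStatements-style helper and the :Document typing are illustrative
placeholders, not any particular crawler's API:

  import urllib.request

  def naive_describe(uri):
      # urllib follows a 303 redirect transparently, so a client that
      # never looks at the final URL ends up describing the request
      # URI rather than the document it was redirected to.
      response = urllib.request.urlopen(uri)
      body = response.read()
      # Statements are (wrongly) made about the original URI:
      return make_statements(uri, body)

  def redirect_aware_describe(uri):
      response = urllib.request.urlopen(uri)
      body = response.read()
      final_url = response.geturl()
      if final_url != uri:
          # A redirect (e.g. 303 See Other) happened: the document
          # lives at final_url; uri names the thing it describes.
          return make_statements(final_url, body)
      return make_statements(uri, body)

  def make_statements(subject_uri, body):
      # Placeholder for whatever RDF extraction the crawler performs.
      return [(subject_uri, "rdf:type", ":Document")]

The naive version is the black-box process above: it collapses the
redirect and asserts the :Document triples against /toucan itself, so
the 303 buys nothing unless the client explicitly checks the final URL.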
Received on Friday, 5 November 2010 10:06:48 UTC