Re: Is 303 really necessary? from Nathan on 2010-11-04 (public-lod@w3.org from November 2010)

From: Nathan <nathan@webr3.org>
Date: Thu, 04 Nov 2010 18:46:58 +0000
To: Ian Davis <me@iandavis.com>
CC: Harry Halpin <hhalpin@ibiblio.org>, public-lod@w3.org, Doug Schepers <schepers@w3.org>
Message-ID: <4CD2FFA2.9080700@webr3.org>

Ian Davis wrote:
> On Thu, Nov 4, 2010 at 6:08 PM, Nathan <nathan@webr3.org> wrote:
>> You see it's not about what we say, it's about what others say, and if 10
>>  huge corps analyse the web and spit out billions of triples saying that
>> anything 200 OK'd is a document, then at the end when we consider the RDF
>> graph of triples, all we're going to see is one statement saying something
>> is a "nonInformationResource" and a hundred others saying it's a document
>> and describing what it's about together with its format and so on.
>>
>> I honestly can't see how anything could reason over a graph that looked like
>> that.
> 
> I honestly believe that's the least of our worries. How often do you
> need to determine whether something in the universe of discourse is an
> electronic document or not compared with all the other questions you
> might be asking of your data. I might conceivably ask "show me all the
> documents about this toucan" but I'd much rather ask "show me all the
> data about this toucan"

I think we all would, but we'd also like to see the data about this 
toucan rather than about this toucan and the document that describes it.

To be clear, the issue is not </toucan> ex:isDescribedBy </doc>

The issue is </toucan> ex:isDescribedBy </toucan>

And when you answer a request for /toucan with 200 OK, that's what you'll
get in your graph. TBH, with any slash URI it's probably what you'll end up
getting.
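
A rough sketch of what I mean, in Python. This is only an illustration: the
function name, the fake response table and the redirect handling are all
made up for the example, and no real crawler is being described. It shows
which URI a naive agent would name as the describing document, depending on
whether /toucan answers 303 or 200.

```python
# Hypothetical sketch: responses are faked as a dict of
# URI -> (status, Location header or None); nothing touches the network.

def naive_description(requested_uri, responses, max_hops=5):
    """Return the (thing, ex:isDescribedBy, document) triple an agent
    would record after following any 303 See Other redirects."""
    uri = requested_uri
    for _ in range(max_hops):
        status, location = responses[uri]
        if status == 303 and location is not None:
            uri = location  # the description lives at a different URI
            continue
        break
    # Whatever URI finally answered is treated as the document.
    return (requested_uri, "ex:isDescribedBy", uri)


# 303 pattern: the thing and its description keep different names.
with_303 = {"/toucan": (303, "/doc"), "/doc": (200, None)}

# Plain 200 OK: the thing's URI is also the document's URI.
with_200 = {"/toucan": (200, None)}

print(naive_description("/toucan", with_303))
# ('/toucan', 'ex:isDescribedBy', '/doc')
print(naive_description("/toucan", with_200))
# ('/toucan', 'ex:isDescribedBy', '/toucan')
```

An agent that throws away the redirect chain would behave like the second
case even when a 303 was sent.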

>> However, I'm also very aware that this all may be moot any ways, because
>> many crawlers and HTTP agents just treat HTTP like a big black box, they
>> don't know there ever was a 303 and don't know what the end URI is (even
>> major browser vendors like Chrome do this, setting the base wrong and
>> everything) - so even the current 303 pattern doesn't keep different things
>> with different names for /slash URIs in all cases.
>>
> 
> That's true. I don't suppose any of the big crawlers care about the
> semantics of 303 because none of them care about the difference
> between a thing and its description. For example the Google OpenSocial
> doesn't give a hoot about the difference and yet seems to still
> function. As I say above, this document/thing distinction is actually
> quite a small area to focus on compared with the real problems of
> analysing the web of data as a whole.

Well yeah, one could take the entire graph, stick it in a triple store, 
and then strip all triples about anything that can be inferred to be of 
class Document, to be left with just the data :) [ which obviously won't 
include your /toucan, /doc or /anna ]
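
For what it's worth, that stripping could look something like the sketch
below. It is a toy under stated assumptions: triples are plain Python tuples
rather than a real triple store, foaf:Document stands in for "Document",
only asserted rdf:type statements are checked (no actual inference), and the
hash URI /birds#toucan is a hypothetical example added for contrast.

```python
RDF_TYPE = "rdf:type"
DOCUMENT = "foaf:Document"  # assumed stand-in for the Document class


def strip_documents(triples):
    """Drop every triple whose subject has been typed as a Document."""
    documents = {s for s, p, o in triples if p == RDF_TYPE and o == DOCUMENT}
    return [t for t in triples if t[0] not in documents]


graph = [
    ("/toucan", RDF_TYPE, "ex:Toucan"),
    ("/toucan", "ex:colour", "black"),
    # Asserted by someone else's crawler because /toucan answered 200 OK.
    ("/toucan", RDF_TYPE, DOCUMENT),
    ("/doc", RDF_TYPE, DOCUMENT),
    # Hypothetical hash URI that nobody has called a document.
    ("/birds#toucan", RDF_TYPE, "ex:Toucan"),
]

print(strip_documents(graph))
# [('/birds#toucan', 'rdf:type', 'ex:Toucan')]
```

Everything said about /toucan goes out along with /doc, which is the point.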

Best,

Nathan

Received on Thursday, 4 November 2010 18:48:06 UTC