Re: Inferring data from network interactions (was Re: Is 303 really necessary?)

On Fri, 2010-11-05 at 10:03 +0000, Leigh Dodds wrote: 
> Hi,
> 
> On 5 November 2010 09:54, William Waites <ww@styx.org> wrote:
> > On Fri, Nov 05, 2010 at 09:34:43AM +0000, Leigh Dodds wrote:
> 
> > Keeping this quantity of information around might quickly
> > turn out to be too data-intensive to be practical, but
> > that's more of an engineering question. I think it does
> > make some sense to do this in principle at least.
> 
> That's what I found when crawling the BBC pages. Huge amounts of data
> and overhead in storing it. Capturing just enough to gather statistics
> on the crawl was sufficient.
> 

Hi all,

Another point in this 303 vs 200 debate (with respect to the volume of
triples) that I haven't yet seen raised is the distinction between
different kinds of RDF data. Not all RDF crawlers want to absorb all RDF
data. Just because a server returns RDF does not mean the client must
absorb it (or even read it). The 303/200 distinction can help the client
distinguish between interesting RDF data and uninteresting RDF data.
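A minimal sketch of the client-side decision described above, assuming a
crawler that records the status the original URI answered with (before
following any redirect). The function name and return labels are
illustrative, not from any particular crawler:

```python
# Hypothetical helper for a selective RDF crawler: treat a 303 response
# as pointing at description data worth absorbing, and a direct 200 as
# a document the crawler may choose to skip.

RDF_TYPES = {"text/turtle", "application/rdf+xml", "application/ld+json"}

def classify(status, content_type):
    """Classify a response for a selective RDF crawler.

    status: HTTP status the original URI answered with, before any
            redirect was followed (303 means "see a description elsewhere").
    content_type: media type of the (eventual) response body.
    """
    if content_type not in RDF_TYPES:
        return "ignore"
    if status == 303:
        return "description"   # RDF describing a thing: absorb it
    return "document"          # RDF that *is* the resource: maybe skip
```

For example, `classify(303, "text/turtle")` yields `"description"` while
`classify(200, "text/turtle")` yields `"document"`, so the crawler can
drop the latter without parsing it.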

In some of the work I do, I create templates in RDF. Although these
templates are RDF-encoded, they contain no meaningful information on
their own. These templates might be XHTML+RDFa, for example, containing
blank nodes that will be filled in later with actual RDF IRIs. To
distinguish these templates (in RDF) from general metadata (in RDF), the
200 vs 303 distinction works well. It allows me to tell a graph pattern
document apart from a description graph of something, even though they
are both in RDF.

Phil Archer's proposed 210 status code (http://philarcher.org/diary/303/)
would also work in this case. I will add that the 210 response would be
even better if the client could ask for it using the Expect header.

Regards,
James
-- 
James Leigh Services Inc.
http://www.leighnet.ca/
http://jamesrdf.blogspot.com/

Received on Friday, 5 November 2010 16:39:38 UTC