- From: James Leigh <james-nospam@leighnet.ca>
- Date: Fri, 05 Nov 2010 12:38:31 -0400
- To: Leigh Dodds <leigh.dodds@talis.com>
- Cc: William Waites <ww@styx.org>, Ian Davis <me@iandavis.com>, public-lod@w3.org
On Fri, 2010-11-05 at 10:03 +0000, Leigh Dodds wrote:
> Hi,
>
> On 5 November 2010 09:54, William Waites <ww@styx.org> wrote:
> > On Fri, Nov 05, 2010 at 09:34:43AM +0000, Leigh Dodds wrote:
>
> > Keeping this quantity of information around might quickly
> > turn out to be too data-intensive to be practical, but
> > that's more of an engineering question. I think it does
> > make some sense to do this in principle at least.
>
> That's what I found when crawling the BBC pages. Huge amounts of data
> and overhead in storing it. Capturing just enough to gather statistics
> on the crawl was sufficient.
>

Hi all,

Another point in this 303 vs 200 debate (with respect to too many triples) that I haven't yet seen raised is the distinction between different kinds of RDF data. Not all RDF crawlers want to absorb all RDF data. Just because a server returns RDF does not mean the client must absorb it (or even read it). The 303/200 distinction can help the client distinguish between interesting RDF data and uninteresting RDF data.

In some of the work I do, I create templates in RDF. Although these templates are RDF encoded, they contain no meaningful information on their own. These templates might be XHTML+RDFa, for example, containing blank nodes that will be filled in later with actual RDF IRIs.

To distinguish these templates (in RDF) from general metadata (in RDF), the 200 vs 303 distinction works well. It allows me to tell a graph pattern document apart from a description graph of something, even though they are both in RDF.

Phil Archer's 210 status code (http://philarcher.org/diary/303/) would also work in this case. I will add that the 210 response would be even better if the client could ask for it using the Expect header.

Regards,
James
-- 
James Leigh Services Inc.
http://www.leighnet.ca/
http://jamesrdf.blogspot.com/
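
A minimal sketch of how a client might act on the 200 vs 303 distinction described above. Nothing here comes from an actual crawler: the `Response` type, the `classify` and `should_absorb` names, the labels, and the list of media types are all hypothetical. It also assumes that a 303 on the requested URI leads to a description graph, while a direct 200 means the RDF document is itself the resource (a template, for instance); the message does not spell out that mapping.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: media types this hypothetical client treats as RDF-bearing.
RDF_TYPES = {"application/rdf+xml", "text/turtle", "application/xhtml+xml"}


@dataclass
class Response:
    """First response received for the requested URI (hypothetical type)."""
    status: int
    content_type: str = ""
    location: Optional[str] = None


def classify(first: Response) -> str:
    """Label the RDF by how the server answered the original request."""
    media_type = first.content_type.split(";")[0].strip().lower()
    if first.status == 303 and first.location:
        # Assumption: the redirect target describes the requested thing.
        return "description"
    if first.status == 200 and media_type in RDF_TYPES:
        # Assumption: the RDF document is the resource itself (e.g. a template).
        return "document"
    return "other"


def should_absorb(label: str, wanted=("description",)) -> bool:
    """Policy hook: a crawler only absorbs the kinds of RDF it asked for."""
    return label in wanted
```

A crawler interested only in description graphs would call `should_absorb(classify(resp))` and skip everything else without reading the body; a template processor could pass `wanted=("document",)` instead.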
Received on Friday, 5 November 2010 16:39:38 UTC