- From: Eric Jain <Eric.Jain@isb-sib.ch>
- Date: Mon, 16 Jul 2007 11:16:57 +0200
- To: Alan Ruttenberg <alanruttenberg@gmail.com>
- CC: wangxiao@musc.edu, Michel_Dumontier <Michel_Dumontier@carleton.ca>, public-semweb-lifesci <public-semweb-lifesci@w3.org>, Mark Wilkinson <markw@illuminae.com>, Benjamin Good <goodb@interchange.ubc.ca>, Natalia Villanueva Rosales <naty.vr@gmail.com>
Alan Ruttenberg wrote:

> Except this isn't an issue. A link in the html suffices to let them know where the RDF is, and the extra retrieval isn't going to kill them.

There are something like 30M RDF documents on http://beta.uniprot.org/ alone. If for each document you have to retrieve and parse a web page first, that more than doubles the number of requests (and the data volume)!

> There are plenty of alternatives for optimization (google's site map file comes to mind, or the LINK: http header) that are not prone to unnecessarily introducing avoidable ambiguity on the semantic web.

The people working on http://www.sindice.com/ have proposed a site map extension for optimizing crawling, see http://purl.uniprot.org/sitemap.xml.

The Link header sounds like a good idea (I hadn't heard of it before), but at the moment it seems simpler for someone who wants only RDF documents to set an Accept header. This also ensures that you are not redirected (wasting a request) for a resource that doesn't even have RDF.
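For illustration, a minimal client-side sketch of the Accept-header approach (the accession URL below is hypothetical, and the exact media type the server honours is an assumption):

```python
# Sketch: request the RDF representation directly via content negotiation,
# so the server can answer with RDF (or fail fast) instead of first serving
# an HTML page that merely links to it. URL and media type are assumptions.
import urllib.request

url = "http://beta.uniprot.org/uniprot/P12345"  # hypothetical accession
req = urllib.request.Request(url, headers={"Accept": "application/rdf+xml"})
with urllib.request.urlopen(req) as response:
    print(response.headers.get("Content-Type"))
    rdf = response.read()  # the RDF document, with no extra HTML round trip
```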
Received on Monday, 16 July 2007 09:17:21 UTC