Crawlers need content negotiation, not! was: Re: URL +1, LSID -1

On Jul 15, 2007, at 1:53 PM, Eric Jain wrote:

>> IMO, the first goal of our design ought to be to ensure that
>> automated semantic web agents (idiots as they will be) have a
>> fighting chance of avoiding the difficult (even impossible)
>> sorts of disambiguation that people are faced with all the
>> time. That bar hasn't yet been met. Once we've ensured that we
>> can meet that goal, then we can talk about optimization.
>> (Incidentally, we do discuss various optimization techniques,
>> from predictability of the form of the name to PURL servers
>> sending back the rewrite rules they use, so that they can be
>> implemented on the client side.)
>
> There are people running Semantic Web crawlers now; for them, I
> gather, being able to get the RDF representation directly isn't a
> premature optimization!

Except this isn't an issue. A link element in the HTML suffices to
let them know where the RDF is, and the extra retrieval isn't going
to kill them. There are plenty of alternatives for optimization
(Google's Sitemaps file comes to mind, or the HTTP Link header)
that aren't prone to introducing avoidable ambiguity on the
semantic web.
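
For what it's worth, here's a minimal sketch of that discovery step
from the crawler's side. The URL is hypothetical, and the link
element is assumed to follow the usual RDF autodiscovery pattern
(rel="alternate" with type="application/rdf+xml"); the same
information could just as easily come from an HTTP Link header.

    import urllib.request
    from html.parser import HTMLParser

    class RDFLinkFinder(HTMLParser):
        """Collect hrefs from <link rel="alternate"
        type="application/rdf+xml"> elements."""
        def __init__(self):
            super().__init__()
            self.rdf_urls = []

        def handle_starttag(self, tag, attrs):
            a = dict(attrs)
            # rel may be multi-valued, e.g. "alternate meta"
            if (tag == "link"
                    and "alternate" in (a.get("rel") or "").lower()
                    and a.get("type") == "application/rdf+xml"):
                self.rdf_urls.append(a.get("href"))

    page = "http://example.org/record/123"  # hypothetical record page
    with urllib.request.urlopen(page) as resp:
        finder = RDFLinkFinder()
        finder.feed(resp.read().decode("utf-8", "replace"))

    # One extra retrieval per document -- hardly fatal for a crawler,
    # and no guessing about where the RDF lives.
    for url in finder.rdf_urls:
        print("RDF representation at:", url)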

-Alan

Received on Monday, 16 July 2007 04:53:23 UTC