Re: New Calais proxy could grow Linked Data Cloud

Thomas Tague wrote:
> LOD Group:
Tom,

First off, welcome!

Really happy to see you've taken the time to engage the LOD community re. 
your team's efforts and long-term objectives.

Questions follow inline below.
>
> First, a philosophical point and then a few facts. 
>
> When your child first learns to read you don't discard that because 
> they haven't yet graduated from college. You know college is coming, 
> you're already thinking about college, you may actually be actively 
> working on college - but the first words are still important.
>
> Calais is learning to read. We firmly believe in releasing building 
> blocks when they become available rather than waiting (and waiting and 
> waiting) for the entire solution to be ready.
>
> A few specific facts to make it clearer where SemanticProxy fits in:
>
> 1) We will have de-referenceable URIs for every entity extracted by 
> Calais by the end of this year. The engineering is done and we're in 
> active design and build mode. We haven't finished the analysis yet - 
> but this will be millions of endpoints on the day we go live.
Please clarify what you mean by "endpoints". Over here it might refer 
to SPARQL endpoints or dereferenceable URIs. I suspect you mean URIs, but 
clarification from you will aid others.
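
To make the distinction concrete, here is a minimal Python sketch of the 
two kinds of request. It only builds the requests and sends nothing over 
the network. The example.org addresses and the function names are 
placeholders of mine, not actual Calais URIs or APIs.

```python
from urllib.parse import urlencode
from urllib.request import Request


def dereference_request(entity_uri):
    """Build a GET for an entity URI, asking for RDF via content negotiation."""
    return Request(entity_uri, headers={"Accept": "application/rdf+xml"})


def sparql_request(endpoint_url, query):
    """Build a GET for a SPARQL endpoint, passing the query as a URL parameter."""
    return Request(
        endpoint_url + "?" + urlencode({"query": query}),
        headers={"Accept": "application/sparql-results+xml"},
    )


# Hypothetical addresses, for illustration only.
ENTITY_URI = "http://example.org/entity/thomson-reuters"

deref = dereference_request(ENTITY_URI)
sparql = sparql_request(
    "http://example.org/sparql",
    "SELECT ?p ?o WHERE { <%s> ?p ?o }" % ENTITY_URI,
)
```

In the first case the entity's own URI is the thing you fetch; in the 
second, one service address answers queries about many entities.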
>
> 2) A *subset* of those entity types will absolutely have links to 
> other linked data sources when we go live. Right now we know there 
> will be substantive links for companies, geographies and a few of the 
> easy ones like music, books, etc. We'll expand on that set over time 
> and have a goal of setting up a community-based mechanism for 
> enhancing the links over time.

Will linkage apply to both "instance data" and the associated "definitions 
data" (ontology / schema / data dictionary) for the Thomson Reuters linked 
data spaces?
Will you be using shared ontologies where such exist, or at the very 
least put out your ontology in RDFS or OWL? Even doing this would open 
the doors for community participation in the data definitions linkage 
effort (e.g. what's happened re. UMBEL, OpenCyc, Yago, and WordNet).
>
> 3) At the end of this month (September) as part of Release 3.1 we'll 
> be releasing company and geography disambiguation as a component of 
> the metadata generation process. The company disambiguation is based 
> on a lexicon of over 16M company aliases + additional hinting and we 
> have a similar approach with geography. 
>
Great news, but the real utility of such work will always be easier for 
this community in particular to absorb if the resulting output is an 
RDF Linked Data Space rather than an RDF Data Island :-)
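
As a rough illustration of the difference, the Python sketch below emits 
one N-Triples statement linking a local entity URI to an external one with 
owl:sameAs. Outbound links of this kind are what keep a data set from 
being an island. Both entity URIs and the function name are assumptions 
for illustration; I don't know what URIs Calais will actually mint.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(local_uri, external_uri):
    """Return an N-Triples line stating that the two URIs name the same thing."""
    return "<%s> <%s> <%s> ." % (local_uri, OWL_SAME_AS, external_uri)


# Hypothetical URIs, for illustration only.
triple = same_as_triple(
    "http://example.org/entity/thomson-reuters",
    "http://dbpedia.org/resource/Thomson_Reuters",
)
print(triple)
```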

Again, great to have you outline your development road-map here. I 
certainly believe this will ultimately be a great contribution to the 
burgeoning Linked Data Web.


Kingsley

> Question? Ideas? Fire away.
>
> Tom
>
>
> On Tue, Sep 23, 2008 at 8:21 AM, Paul Miller <Paul.Miller@talis.com 
> <mailto:Paul.Miller@talis.com>> wrote:
>
>     From the post...
>
>     "SemanticProxy will return dereferenceable Linked Data URIs by the
>     end of this quarter."
>
>     Paul
>
>     --
>     Paul Miller
>     Technology Evangelist, Talis
>     w: www.talis.com/ <http://www.talis.com/>  skype: napm1971
>     mobile/cell: +44 7769 740083
>
>     http://blogs.zdnet.com/semantic-web/
>
>     _www.linkedin.com/in/pau1mi11er
>     <http://www.linkedin.com/in/pau1mi11er>_
>
>
>
>
>     On 23 Sep 2008, at 13:02, Kingsley Idehen wrote:
>
>>     Paul Miller wrote:
>>>     Members of this list might be interested in my write-up of
>>>     ThomsonReuters' latest beta service... which I think will prove
>>>     pretty useful in growing the Linked Data cloud... especially for
>>>     news content from the BBC et al...
>>>
>>>     http://blogs.zdnet.com/semantic-web/?p=194
>>>
>>>     Paul
>>>
>>>     --
>>>     Paul Miller
>>>     Technology Evangelist, Talis
>>>     w: www.talis.com/ <http://www.talis.com/>  skype: napm1971
>>>     mobile/cell: +44 7769 740083
>>>
>>>     http://blogs.zdnet.com/semantic-web/
>>>
>>>     _www.linkedin.com/in/pau1mi11er
>>>     <http://www.linkedin.com/in/pau1mi11er>_
>>>
>>>
>>>
>>>
>>     Paul,
>>
>>     How does this actually benefit or contribute to the Linked Data
>>     Cloud? I ask specifically because URIs (of the dereferenceable
>>     variety) are missing in action. Hopefully, I am completely
>>     overlooking something here :-)
>>
>>     We tend to use the term "Proxy" or "Wrapper" to describe
>>     solutions in the Linked Data realm that generate RDF graphs
>>     based on dereferenceable URIs (aka Linked Data Spaces) "on the
>>     fly" via RDF-ization middleware.
>>
>>     If possible, please encourage the OpenCalais folks (Tom et al.)
>>     to respond to my comments above via a response to this post.
>>
>>     Example Proxy / Wrapper URIs in the wild:
>>
>>     1.
>>     http://demo.openlinksw.com/proxy/html/http://www.freebase.com/view/en/abraham_lincoln
>>     - Document about Abraham Lincoln
>>     2.
>>     http://demo.openlinksw.com/about/html/http://demo.openlinksw.com/about/rdf/http://www.freebase.com/view/en/abraham_lincoln%23this
>>      - Abraham Lincoln the Entity of type foaf:Person that is also a
>>     sioc:Item
>>     3.
>>     http://demo.openlinksw.com/about/html/http://demo.openlinksw.com/about/rdf/http://www.crunchbase.com/company/thomson-reuters%23this
>>     - Thomson Reuters the Entity of type foaf:Organization that is
>>     also a sioc:Item
>>     4.
>>     http://www4.wiwiss.fu-berlin.de/flickrwrappr/photos/Thomson_Reuters
>>     - Thomson Reuters photos from Flickr
>>     5.
>>     http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fflickrwrappr%2Fphotos%2FThomson_Reuters
>>     - Browser view of the data space exposed by the Flickr wrapper URI
>>
>>
>>
>>     -- 
>>
>>
>>     Regards,
>>
>>     Kingsley Idehen      Weblog: http://www.openlinksw.com/blog/~kidehen
>>     President & CEO OpenLink Software     Web: http://www.openlinksw.com
>>
>>
>>
>>
>
>


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com

Received on Tuesday, 23 September 2008 14:22:49 UTC