Re: New Calais proxy could grow Linked Data Cloud

Hi Tom,

That's really great news. This will make a huge difference to the web
of data; am even more impressed than before. Apologies for ever
doubting you! ;-)

Tom.

On 23/09/2008, Kingsley Idehen <kidehen@openlinksw.com> wrote:
> Thomas Tague wrote:
>> LOD Group:
> Tom,
>
> First off, welcome!
>
> Really happy to see you've taken the time to engage the LOD community re.
> your team's efforts and long-term objectives.
>
> Questions follow inline below.
>>
>> First, a philosophical point and then a few facts.
>>
>> When your child first learns to read you don't discard that because
>> they haven't yet graduated from college. You know college is coming,
>> you're already thinking about college, you may actually be actively
>> working on college - but the first words are still important.
>>
>> Calais is learning to read. We firmly believe in releasing building
>> blocks when they become available rather than waiting (and waiting and
>> waiting) for the entire solution to be ready.
>>
>> A few specific facts to make it clearer where SemanticProxy fits in:
>>
>> 1) We will have de-referenceable URIs for every entity extracted by
>> Calais by the end of this year. The engineering is done and we're in
>> active design and build mode. We haven't finished the analysis yet -
>> but this will be millions of endpoints on the day we go live.
> Please clarify what you mean by "endpoints". Over here it might refer
> to SPARQL endpoints or dereferenceable URIs. I suspect you mean URIs, but
> clarification from you will aid others.
>>
>> 2) A *subset* of those entity types will absolutely have links to
>> other linked data sources when we go live. Right now we know there
>> will be substantive links for companies, geographies and a few of the
>> easy ones like music, books, etc. We'll expand on that set over time
>> and have a goal of setting up a community-based mechanism for
>> enhancing the links over time.
>
> Will linkage apply to "instance data" and associated "definitions data"
> (ontology / schema / data dictionary) for the Thomson Reuters linked
> data spaces?
> Will you be using shared ontologies where such exist, or at the very
> least put out your ontology in RDFS or OWL? Even doing this would open up
> the doors for community participation in the data definitions linkage
> effort (e.g. what's happened re. UMBEL, OpenCyc, Yago, and Wordnet).
>>
>> 3) At the end of this month (September) as part of Release 3.1 we'll
>> be releasing company and geography disambiguation as a component of
>> the metadata generation process. The company disambiguation is based
>> on a lexicon of over 16M company aliases + additional hinting and we
>> have a similar approach with geography.
>>
> Great news, but the real utility of such work will always be easier to
> absorb, by this community in particular, if the resulting output is an
> RDF Linked Data Space rather than an RDF Data Island :-)
>
> Again, great to have you outline your development road-map here, I
> certainly believe this will ultimately be a great contribution to the
> burgeoning Linked Data Web.
>
>
> Kingsley
>
>> Question? Ideas? Fire away.
>>
>> Tom
>>
>>
>> On Tue, Sep 23, 2008 at 8:21 AM, Paul Miller <Paul.Miller@talis.com> wrote:
>>
>>     From the post...
>>
>>     "SemanticProxy will return dereferenceable Linked Data URIs by the
>>     end of this quarter."
>>
>>     Paul
>>
>>     --
>>     Paul Miller
>>     Technology Evangelist, Talis
>>     w: www.talis.com/ <http://www.talis.com/>  skype: napm1971
>>     mobile/cell: +44 7769 740083
>>
>>     http://blogs.zdnet.com/semantic-web/
>>
>>     _www.linkedin.com/in/pau1mi11er
>>     <http://www.linkedin.com/in/pau1mi11er>_
>>
>>
>>
>>
>>     On 23 Sep 2008, at 13:02, Kingsley Idehen wrote:
>>
>>>     Paul Miller wrote:
>>>>     Members of this list might be interested in my write-up of
>>>>     ThomsonReuters' latest beta service... which I think will prove
>>>>     pretty useful in growing the Linked Data cloud... especially for
>>>>     news content from the BBC et al...
>>>>
>>>>     http://blogs.zdnet.com/semantic-web/?p=194
>>>>
>>>>     Paul
>>>>
>>>>     --
>>>>     Paul Miller
>>>>     Technology Evangelist, Talis
>>>>     w: www.talis.com/ <http://www.talis.com/>  skype: napm1971
>>>>     mobile/cell: +44 7769 740083
>>>>
>>>>     http://blogs.zdnet.com/semantic-web/
>>>>
>>>>     _www.linkedin.com/in/pau1mi11er
>>>>     <http://www.linkedin.com/in/pau1mi11er>_
>>>>
>>>>
>>>>
>>>>
>>>     Paul,
>>>
>>>     How does this actually benefit or contribute to the Linked Data
>>>     Cloud? I ask specifically because URIs (of the dereferenceable
>>>     variety) are missing in action. Hopefully, I am completely
>>>     overlooking something here :-)
>>>
>>>     We tend to use the term "Proxy" or "Wrapper" to describe
>>>     solutions in the Linked Data realm that generate RDF graphs based
>>>     on dereferenceable URIs (a.k.a. Linked Data Spaces) "on the fly"
>>>     via RDF-ization middleware.
>>>
>>>     If possible, please encourage the OpenCalais folks (Tom et al.)
>>>     to respond to my comments above via a response to this post.
>>>
>>>     Example Proxy / Wrapper URIs in the wild:
>>>
>>>     1.
>>>
>>> http://demo.openlinksw.com/proxy/html/http://www.freebase.com/view/en/abraham_lincoln
>>>     - Document about Abraham Lincoln
>>>     2.
>>>
>>> http://demo.openlinksw.com/about/html/http://demo.openlinksw.com/about/rdf/http://www.freebase.com/view/en/abraham_lincoln%23this
>>>      - Abraham Lincoln the Entity of type foaf:Person that is also a
>>>     sioc:Item
>>>     3.
>>>
>>> http://demo.openlinksw.com/about/html/http://demo.openlinksw.com/about/rdf/http://www.crunchbase.com/company/thomson-reuters%23this
>>>     - Thomson Reuters the Entity of type foaf:Organization that is
>>>     also a sioc:Item
>>>     4.
>>>     http://www4.wiwiss.fu-berlin.de/flickrwrappr/photos/Thomson_Reuters
>>>     - Thomson Reuters photos from Flickr
>>>     5.
>>>
>>> http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fflickrwrappr%2Fphotos%2FThomson_Reuters
>>>     - Browser view of the data space exposed by the Flickr wrapper URI
>>>
>>>
>>>
>>>     --
>>>
>>>
>>>     Regards,
>>>
>>>     Kingsley Idehen      Weblog: http://www.openlinksw.com/blog/~kidehen
>>>     President & CEO OpenLink Software     Web: http://www.openlinksw.com
>>>
>>>
>>>
>>>
>>
>>
>
>
> --
>
>
> Regards,
>
> Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO
> OpenLink Software     Web: http://www.openlinksw.com
>
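
For anyone following the proxy / wrapper discussion quoted above, here is a
minimal Python sketch of the two URI shapes that appear in Kingsley's example
list (a prefix form that appends the source URL as-is, and a query form that
percent-encodes it), plus a request that asks for RDF via the HTTP Accept
header. The function names are hypothetical, the path prefixes are copied
from the quoted examples only, and nothing here assumes that the
demo.openlinksw.com service still answers at those paths or that
application/rdf+xml is the media type any given server returns.

```python
from urllib.parse import quote
from urllib.request import Request

# Prefixes copied from the example URIs quoted in this thread (assumption:
# they are shown for illustration; the live service may differ or be gone).
PROXY_HTML_PREFIX = "http://demo.openlinksw.com/proxy/html/"
RDF_BROWSER_PREFIX = "http://demo.openlinksw.com/rdfbrowser2/?uri="


def proxy_document_uri(source_url: str) -> str:
    """Prefix-style wrapper URI (example 1): source URL appended unchanged."""
    return PROXY_HTML_PREFIX + source_url


def browser_view_uri(source_url: str) -> str:
    """Query-style wrapper URI (example 5): source URL fully percent-encoded."""
    return RDF_BROWSER_PREFIX + quote(source_url, safe="")


def rdf_request(uri: str) -> Request:
    """Build, but do not send, a GET request that asks for an RDF representation."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})
```

Passing the Flickr wrapper URI from example 4 to browser_view_uri() reproduces
the URI in example 5. The request object is only constructed, so the sketch
runs without network access; handing it to urllib.request.urlopen() would
perform the actual dereference.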

Received on Tuesday, 23 September 2008 17:30:10 UTC