Re: New Calais proxy could grow Linked Data Cloud

My comments in-line.
TT


On Tue, Sep 23, 2008 at 10:22 AM, Kingsley Idehen <kidehen@openlinksw.com> wrote:

> Thomas Tague wrote:
>
>> LOD Group:
>>
> Tom,
>
> First off, welcome!
>
> Really happy to see you've taken the time to engage the LOD community re. your
> team's efforts and long-term objectives.
>
> Questions follow inline below.
>
>>
>> First, a philosophical point and then a few facts.
>> When your child first learns to read you don't discard that because they
>> haven't yet graduated from college. You know college is coming, you're
>> already thinking about college, you may actually be actively working on
>> college - but the first words are still important.
>>
>> Calais is learning to read. We firmly believe in releasing building blocks
>> when they become available rather than waiting (and waiting and waiting) for
>> the entire solution to be ready.
>>
>> A few specific facts to make it clearer where SemanticProxy fits in:
>>
>> 1) We will have de-referenceable URIs for every entity extracted by Calais
>> by the end of this year. The engineering is done and we're in active design
>> and build mode. We haven't finished the analysis yet - but this will be
>> millions of endpoints on the day we go live.
>>
> Please clarify what you mean by "endpoints".  Over here it might refer to
> SPARQL endpoints or dereferenceable URIs. I suspect you mean URIs, but
> clarification from you will aid others.


>> By endpoints I do mean de-referenceable URIs for the entities. We're not
quite ready to host SPARQL endpoints right now.


>
>> 2) A *subset* of those entity types will absolutely have links to other
>> linked data sources when we go live. Right now we know there will be
>> substantive links for companies, geographies and a few of the easy ones like
>> music, books, etc. We'll expand on that set over time and have a goal of
>> setting up a community-based mechanism for enhancing the links over time.
>>
>
> Will linkage apply to "instance data" and associated "definitions data"
> (ontology / schema / data dictionary) for the Thomson Reuters linked data
> spaces?
> Will you be using shared ontologies where such exist, or at the very least
> put out your ontology in RDFS or OWL?  Even doing this opens up the doors
> for community participation in the data definitions linkage effort (e.g.
> what's happened re. UMBEL, OpenCyc, Yago, and Wordnet).


>> We'll be publishing our ontology in the near future as RDFS. In its
initial release it will have some modest linking to other ontologies - again
we hope this can be a community-based effort in the future.


>
>
>> 3) At the end of this month (September) as part of Release 3.1 we'll be
>> releasing company and geography disambiguation as a component of the
>> metadata generation process. The company disambiguation is based on a
>> lexicon of over 16M company aliases + additional hinting and we have a
>> similar approach with geography.
>
>
>> The output is a URI, which is de-referenceable and, for a growing number
of entities, will point to other linked data resources.


>
>>  Great news, but the real utility of such work will always be easier to
> imbibe, by this community in particular, if the resulting output is an RDF
> Linked Data Space rather than an RDF Data Island :-)
>
> Again, great to have you outline your development road-map here, I
> certainly believe this will ultimately be a great contribution to the
> burgeoning Linked Data Web.
>
>
> Kingsley
>
>  Question? Ideas? Fire away.
>>
>> Tom
>>
>>
>> On Tue, Sep 23, 2008 at 8:21 AM, Paul Miller <Paul.Miller@talis.com> wrote:
>>
>>    From the post...
>>
>>    "SemanticProxy will return dereferenceable Linked Data URIs by the
>>    end of this quarter."
>>
>>    Paul
>>
>>    --
>>    Paul Miller
>>    Technology Evangelist, Talis
>>    w: www.talis.com/  skype: napm1971
>>    mobile/cell: +44 7769 740083
>>
>>    http://blogs.zdnet.com/semantic-web/
>>
>>    www.linkedin.com/in/pau1mi11er
>>
>>
>>
>>
>>    On 23 Sep 2008, at 13:02, Kingsley Idehen wrote:
>>
>>     Paul Miller wrote:
>>>
>>>>    Members of this list might be interested in my write-up of
>>>>    ThomsonReuters' latest beta service... which I think will prove
>>>>    pretty useful in growing the Linked Data cloud... especially for
>>>>    news content from the BBC et al...
>>>>
>>>>    http://blogs.zdnet.com/semantic-web/?p=194
>>>>
>>>>    Paul
>>>>
>>>>    --
>>>>    Paul Miller
>>>>    Technology Evangelist, Talis
>>>>    w: www.talis.com/  skype: napm1971
>>>>    mobile/cell: +44 7769 740083
>>>>
>>>>    http://blogs.zdnet.com/semantic-web/
>>>>
>>>>    www.linkedin.com/in/pau1mi11er
>>>>
>>>>
>>>>
>>>>
>>>>     Paul,
>>>
>>>    How does this actually benefit or contribute to the Linked Data
>>>    Cloud? I ask specifically because URIs (of the dereferenceable
>>>    variety) are missing in action. Hopefully, I am completely
>>>    overlooking something here :-)
>>>
>>>    We tend to use the term "Proxy" or "Wrapper" to describe
>>>    solutions in the Linked Data realm that generate RDF graphs based
>>>    on dereferenceable URIs (aka. Linked Data Spaces) "on the fly" via
>>>    RDF-ization middleware.
>>>
>>>    If possible, please encourage the OpenCalais folks (Tom et al.)
>>>    to respond to my comments above via a response to this post.
>>>
>>>    Example Proxy / Wrapper URIs in the wild:
>>>
>>>    1. http://demo.openlinksw.com/proxy/html/http://www.freebase.com/view/en/abraham_lincoln
>>>       - Document about Abraham Lincoln
>>>    2. http://demo.openlinksw.com/about/html/http://demo.openlinksw.com/about/rdf/http://www.freebase.com/view/en/abraham_lincoln%23this
>>>       - Abraham Lincoln the Entity of type foaf:Person that is also a
>>>         sioc:Item
>>>    3. http://demo.openlinksw.com/about/html/http://demo.openlinksw.com/about/rdf/http://www.crunchbase.com/company/thomson-reuters%23this
>>>       - Thomson Reuters the Entity of type foaf:Organization that is
>>>         also a sioc:Item
>>>    4. http://www4.wiwiss.fu-berlin.de/flickrwrappr/photos/Thomson_Reuters
>>>       - Thomson Reuters photos from Flickr
>>>    5. http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fflickrwrappr%2Fphotos%2FThomson_Reuters
>>>       - Browser view of the data space exposed by the Flickr wrapper URI
>>>
>>>
>>>
>>>    --
>>>
>>>    Regards,
>>>
>>>    Kingsley Idehen      Weblog: http://www.openlinksw.com/blog/~kidehen
>>>    President & CEO OpenLink Software     Web: http://www.openlinksw.com
>>>
>>>
>>>
>>>
>>>
>>
>>
>
> --
>
>
> Regards,
>
> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO OpenLink Software     Web: http://www.openlinksw.com
>
>
>
>
>

Received on Wednesday, 24 September 2008 10:03:11 UTC