Re: New Calais proxy could grow Linked Data Cloud

LOD Group:
First, a philosophical point and then a few facts.

When your child first learns to read, you don't dismiss the achievement
because they haven't yet graduated from college. You know college is coming,
you're already thinking about college, you may even be actively working on
college - but the first words are still important.

Calais is learning to read. We firmly believe in releasing building blocks
when they become available rather than waiting (and waiting and waiting) for
the entire solution to be ready.

A few specific facts to make it clearer where SemanticProxy fits in:

1) We will have de-referenceable URIs for every entity extracted by Calais
by the end of this year. The engineering is done and we're in active design
and build mode. We haven't finished the analysis yet - but this will be
millions of endpoints on the day we go live.

2) A *subset* of those entity types will absolutely have links to other
linked data sources when we go live. Right now we know there will be
substantive links for companies, geographies and a few of the easy ones like
music, books, etc. We'll expand on that set over time and have a goal of
setting up a community-based mechanism for enhancing the links over time.

3) At the end of this month (September) as part of Release 3.1 we'll be
releasing company and geography disambiguation as a component of the
metadata generation process. The company disambiguation is based on a
lexicon of over 16M company aliases + additional hinting and we have a
similar approach with geography.

Questions? Ideas? Fire away.

Tom


On Tue, Sep 23, 2008 at 8:21 AM, Paul Miller <Paul.Miller@talis.com> wrote:

> From the post...
>
> "SemanticProxy will return dereferenceable Linked Data URIs by the end of
> this quarter."
> Paul
>
> --
> Paul Miller
> Technology Evangelist, Talis
> w: www.talis.com/  skype: napm1971
> mobile/cell: +44 7769 740083
>
> http://blogs.zdnet.com/semantic-web/
> *www.linkedin.com/in/pau1mi11er*
>
>
>
>
> On 23 Sep 2008, at 13:02, Kingsley Idehen wrote:
>
> Paul Miller wrote:
>
> Members of this list might be interested in my write-up of ThomsonReuters'
> latest beta service... which I think will prove pretty useful in growing the
> Linked Data cloud... especially for news content from the BBC et al...
>
>
> http://blogs.zdnet.com/semantic-web/?p=194
>
>
> Paul
>
>
>
> Paul,
>
> How does this actually benefit or contribute to the Linked Data Cloud? I
> ask specifically because URIs (of the dereferenceable variety) are missing
> in action. Hopefully, I am completely overlooking something here :-)
>
> We tend to use the term "Proxy" or "Wrapper" to describe solutions in the
> Linked Data realm that generate dereferenceable-URI-based RDF graphs (a.k.a.
> Linked Data Spaces) "on the fly" via RDF-ization middleware.
>
> If possible, please encourage the OpenCalais folks (Tom et al.) to respond
> to my comments above via a response to this post.
>
> Example Proxy / Wrapper URIs in the wild:
>
> 1.
> http://demo.openlinksw.com/proxy/html/http://www.freebase.com/view/en/abraham_lincoln - Document about Abraham Lincoln
> 2.
> http://demo.openlinksw.com/about/html/http://demo.openlinksw.com/about/rdf/http://www.freebase.com/view/en/abraham_lincoln%23this - Abraham Lincoln the Entity of type foaf:Person that is also a sioc:Item
> 3.
> http://demo.openlinksw.com/about/html/http://demo.openlinksw.com/about/rdf/http://www.crunchbase.com/company/thomson-reuters%23this - Thomson Reuters the Entity of type foaf:Organization that is also a sioc:Item
> 4. http://www4.wiwiss.fu-berlin.de/flickrwrappr/photos/Thomson_Reuters -
> Thomson Reuters photos from Flickr
> 5.
> http://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fflickrwrappr%2Fphotos%2FThomson_Reuters - Browser view of the data space exposed by the Flickr wrapper URI
>
>
>
> --
>
>
> Regards,
>
> Kingsley Idehen      Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO OpenLink Software     Web: http://www.openlinksw.com
>
>
>
>
>
>
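
Editorial sketch: the proxy/wrapper URIs quoted in Kingsley's list appear to follow a simple composition pattern, which the Python below tries to reproduce. The prefixes are copied from those example URIs; the function names are hypothetical, and this is only a guess at how the URIs are assembled, not OpenLink's actual implementation. Nothing here contacts the network or checks that the endpoints still resolve.

```python
from urllib.parse import quote

# Prefixes taken from the example URIs quoted in the thread (assumed patterns).
ABOUT_RDF = "http://demo.openlinksw.com/about/rdf/"
ABOUT_HTML = "http://demo.openlinksw.com/about/html/"
RDF_BROWSER = "http://demo.openlinksw.com/rdfbrowser2/?uri="


def entity_uri(source_url: str, fragment: str = "this") -> str:
    """Hypothetical helper: proxy URI naming the entity described by source_url."""
    return f"{ABOUT_RDF}{source_url}#{fragment}"


def html_description_uri(uri: str) -> str:
    """Hypothetical helper: HTML view of a URI.

    The '#' is escaped as %23 so the fragment is sent to the server
    instead of being stripped by the client.
    """
    return ABOUT_HTML + uri.replace("#", "%23")


def browser_uri(uri: str) -> str:
    """Hypothetical helper: browser link with the target URI fully percent-encoded."""
    return RDF_BROWSER + quote(uri, safe="")


if __name__ == "__main__":
    lincoln = entity_uri("http://www.freebase.com/view/en/abraham_lincoln")
    print(html_description_uri(lincoln))
    print(browser_uri(
        "http://www4.wiwiss.fu-berlin.de/flickrwrappr/photos/Thomson_Reuters"
    ))
```

Run as a script, this prints strings matching examples 2 and 5 in the quoted list. The point of the pattern is that any Web document URL can be turned into an entity URI that a client can dereference through the middleware.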

Received on Wednesday, 24 September 2008 10:03:10 UTC