Re: Chronicling America and Linked Data pt. 2

From: Ed Summers <ehs@pobox.com>
Date: Tue, 26 May 2009 15:09:17 -0400
Message-ID: <f032cc060905261209l64ed8a09ua111ae8116ea7123@mail.gmail.com>
To: Dan Brickley <danbri@danbri.org>
Cc: "public-lod@w3.org" <public-lod@w3.org>

On Tue, May 26, 2009 at 12:33 PM, Dan Brickley <danbri@danbri.org> wrote:
> BTW Ed, Chronicling America looks great. Nice work as ever :) Is there any
> people-describing data in there, or will there be?

Thanks Dan. It's funny you ask, because so far the predominant users
of Chronicling America have been genealogists looking for people :-)

At the moment (apart from metadata about newspaper titles and issues)
we only have OCR for the pages, which allows us to see what "words"
appear where on the page for indexing and highlighting. We definitely
want to do more mining of the text, looking for names, dates, places,
etc. embedded in the articles. Part of the hope is that by releasing
the data, interested people can mine it themselves and publish
assertions like:

dct:subject <http://dbpedia.org/resource/Abraham_Lincoln> .
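For anyone who wants to try this, here is a rough sketch of how such an
assertion could be written out as a line of N-Triples. It assumes that
dct: stands for the Dublin Core terms namespace
(http://purl.org/dc/terms/). The page URI and the subject_triple helper
are made up for illustration; they are not real Chronicling America
identifiers or part of the app.

```python
# Assumption: "dct:" is the Dublin Core terms namespace.
DCT_SUBJECT = "http://purl.org/dc/terms/subject"


def subject_triple(page_uri: str, topic_uri: str) -> str:
    """Return one N-Triples line saying page_uri has topic_uri as a subject."""
    for uri in (page_uri, topic_uri):
        # Reject characters that cannot appear inside <...> in N-Triples.
        if any(ch in uri for ch in '<>" \n'):
            raise ValueError(f"not a usable IRI: {uri!r}")
    return f"<{page_uri}> <{DCT_SUBJECT}> <{topic_uri}> ."


# Hypothetical page URI, used only as a placeholder.
page = "http://example.org/newspaper/page/1"
print(subject_triple(page, "http://dbpedia.org/resource/Abraham_Lincoln"))
```

The output is a single self-contained statement, so assertions produced
this way by different people can simply be concatenated into one file.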

Perhaps we'll eventually get around to providing functionality in the
app that makes it easy to do this sort of annotation. At the moment
there aren't any plans to do so, but we're already tossing around
ideas for a new user interface, so I am hopeful.

If you have any ideas, please feel free to send them here or to me privately.

Received on Tuesday, 26 May 2009 19:09:56 UTC
