- From: Shashi Kant <skant@sloan.mit.edu>
- Date: Thu, 21 Oct 2010 12:54:01 -0400
- To: Jie Bao <baojie@gmail.com>
- Cc: Semantic Web <semantic-web@w3.org>
Received on Thursday, 21 October 2010 16:55:03 UTC
Look up Apache Lucene/Solr. They have a sub-project Apache Tika for dealing with PDFs etc.

On Thu, Oct 21, 2010 at 12:31 PM, Jie Bao <baojie@gmail.com> wrote:
> Hi
>
> I have a few hundreds papers in pdfs [I can easily extract text from
> it]. I would like to run some tools to automatically discover tags or
> keywords from them. Do you have any recommendation?
>
> Thanks in advance
>
> Jie
>
> -----
> Jie Bao
> Tetherless World Constellation
> Rensselaer Polytechnic Institute
> baojie@cs.rpi.edu
> http://www.cs.rpi.edu/~baojie
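
For a rough idea of what keyword discovery over already-extracted text can look like, here is a minimal sketch using TF-IDF weighting in plain Python. It is not Lucene, Solr, or Tika code: the function names, the tiny stopword list, and the scoring formula are all illustrative assumptions, and a real setup would likely lean on an indexing library instead.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on",
             "with", "we", "this", "that", "are"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens longer than two
    characters that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOPWORDS and len(t) > 2]


def top_keywords(docs, k=5):
    """Return the k highest-scoring TF-IDF terms for each document.

    docs is a list of plain-text strings (e.g. text already extracted
    from the PDFs). Ties are broken alphabetically.
    """
    tokenized = [tokenize(d) for d in docs]

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    n = len(docs)
    result = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Relative term frequency times log inverse document frequency.
        scores = {t: (c / len(tokens)) * math.log(n / df[t])
                  for t, c in tf.items()}
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result.append(ranked[:k])
    return result
```

Calling top_keywords(list_of_paper_texts, k=10) would give ten candidate tags per paper. Terms that occur in every paper score zero, so they sink to the bottom of each list.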