- From: Paola Di Maio <paola.dimaio@gmail.com>
- Date: Thu, 4 Apr 2013 16:24:39 +0530
- To: John Erickson <olyerickson@gmail.com>
- Cc: Hatem Ben Yacoub <hatemben@gmail.com>, "eGov IG (Public)" <public-egov-ig@w3.org>
- Message-ID: <CAMXe=Spn9xM4i-h1X=3zPidkmg3DLk8DW-OUbwkDkRng7r3WOQ@mail.gmail.com>
Indeed, it looks like a good balance of simplicity and useful functionality.
Nice, and it reminds me of the 'tabulator' concept, a bit more trimmed.

I wonder why there is no conversion to RDF. Can we not also have a CSV to RDF
button? Would that not make sense?

PDM

On Thu, Apr 4, 2013 at 3:48 PM, John Erickson <olyerickson@gmail.com> wrote:
> Hatem, this is an extremely interesting tool! Note to everyone: even
> though Mozilla was one of the supporters, it works in all browsers.
> Or, at least also Chrome ;)
>
> A couple suggestions:
> 1. In addition to enabling the user to download and copy the selected
> table segment, please provide a way (or at least start thinking about
> a way) for there to be a permanent/re-usable/reliable URL to the
> selected content. The reason is, some of us have RDF conversion
> workflows that document the provenance, starting with the download URL
> of the source CSV.
> 2. I can understand how headers present a problem... but it would be
> extremely useful to have them working! Maybe you can extract them
> first, then associate them with selected table segments on a follow-up
> pass. But you'll need to have created a URL for the selected header
> cells ;) NOTE: One compromise is to only do COMPLETE tables if the
> headers are to be included.
> 3. Related to the above, you really need to encode provenance (see W3C
> PROV) for this to really be useful to people using extracted tabular
> data "in anger."
>
> Thanks again for this good work!
>
> John
>
> On Wed, Apr 3, 2013 at 4:25 PM, Hatem Ben Yacoub <hatemben@gmail.com>
> wrote:
> > Hi all,
> >
> > One of the problems that many Open Government data projects face is
> > the availability of tons of old documents in PDF format, which is not
> > an open and reusable format. Today, Mozilla announced Tabula, a new tool
> > to help liberate tables trapped in PDFs.
> >
> > The online demo is amazing: http://tabula.nerdpower.org/
> >
> > To use it, simply make a rectangular selection over tables on the PDF
> > pages. (Avoid headers)
> >
> > Sources: https://github.com/jazzido/tabula
> >
> > Official announcement:
> > http://source.mozillaopennews.org/en-US/articles/introducing-tabula/
> >
> >
> > Best,
> > --
> > Eng. Hatem Ben Yacoub
> > ICT & eGOV Consultant
> > http://hbyconsultancy.com
> >
> > http://twitter.com/hatem
> > http://facebook.com/hatemben
> >
>
> --
> John S. Erickson, Ph.D.
> Director, Web Science Operations
> Tetherless World Constellation (RPI)
> <http://tw.rpi.edu> <olyerickson@gmail.com>
> Twitter & Skype: olyerickson
Received on Thursday, 4 April 2013 10:55:10 UTC