- From: Alexander Garcia Castro <alexgarciac@gmail.com>
- Date: Fri, 3 May 2013 01:10:19 +0200
- To: Norman Gray <norman@astro.gla.ac.uk>
- Cc: Casey McLaughlin <casey.mclaughlin@cci.fsu.edu>, Linking Open Data <public-lod@w3.org>
Norman, I would love to be as optimistic as you seem to be. Unfortunately, given the evidence, I just can't. In any case, I would like to invite you to a hackathon we are currently organizing.

Join us in Montpellier for a one-day event to hack on scholarly PDFs! Currently, the bulk of peer-reviewed scientific knowledge is locked up in PDF documents, from which it is difficult to get information out. We want to change that. If you’re interested in hacking on PDFs and exploring ways to access scholarly data in modern ways, this hackathon is for you.

http://scholrev.org/hackathon/

On Fri, May 3, 2013 at 12:36 AM, Norman Gray <norman@astro.gla.ac.uk> wrote:
>
> Alexander, hello.
>
> On 2013 May 2, at 22:49, Alexander Garcia Castro <alexgarciac@gmail.com> wrote:
>
>> Hi Norman, I have heard the same from Adobe people: it's not the PDF, it
>> is YOU not being wise enough to know how to generate a PDF.
>> Unfortunately, I don't work with PDFs generated by me; I have to deal
>> with those coming from publishers. Probably they should attend a
>> training on generating PDFs.
>
> Hence my comment that journals could very usefully give more of a lead here.
>
> I think that some journal publishers _are_ trying to do things here, partly in order to back up their assertions that they add value to the publication process, but also to address their own production problems. It was Elsevier who sponsored an 'Executable PDF' challenge <http://www.executablepapers.com/>. Various other people are putting effort in as well, obviously, but as you point out, the publishers have to be involved. I have a couple of links at <https://pinboard.in/u:nxg/t:beyondpdf>
>
> Like I said: it's only fairly recently that the desire to put metadata into PDFs has spread beyond a few nuts. The area is still pretty immature.
>
>> It is great to hear ":libraries for destructuring and rummaging around
>> in PDFs are not very easy to use (no need for 'jailbreaking'". Please
>> point us to such libraries and tutorials for destructuring the PDF. So
>> far, for practical purposes the content is locked up and in deep need
>> of jailbreaking so that it can be effectively used. But, as you pointed
>> out, it may be just because we don't know how to generate PDFs. BTW, I
>> am cc'ing this to Casey; we work together and we are eager to hear
>> about those libraries.
>
> Well, there's pdflib <http://www.pdflib.com/>, which is expensive but clearly supported, libpdf <https://sourceforge.net/projects/libpdf/>, which is free but which I know nothing about, and PDFBox <http://pdfbox.apache.org/> which is also free, and which I've made light use of, in order to extract metadata from PDFs into an Atom feed (I can share this with you if you want, but it's not really polished).
>
> There are some libraries mentioned at <http://en.wikipedia.org/wiki/List_of_PDF_software>
>
> There's probably more, but that might be a start. Were you trying to grok PDF straight from the spec? Hardcore!
>
> All the best,
>
> Norman
>
>
> --
> Norman Gray : http://nxg.me.uk
> SUPA School of Physics and Astronomy, University of Glasgow, UK
>

--
Alexander Garcia
http://www.alexandergarcia.name/
http://www.usefilm.com/photographer/75943.html
http://www.linkedin.com/in/alexgarciac
Received on Thursday, 2 May 2013 23:11:14 UTC