- From: Martynas Jusevičius <martynas@graphity.org>
- Date: Mon, 19 Jan 2015 23:35:39 +0100
- To: Paul Houle <ontology2@gmail.com>
- Cc: Larry Masinter <masinter@adobe.com>, "public-lod@w3.org" <public-lod@w3.org>
PDFBox includes a metadata API, but does not mention RDF:
https://pdfbox.apache.org/1.8/cookbook/workingwithmetadata.html

On Mon, Jan 19, 2015 at 11:31 PM, Martynas Jusevičius <martynas@graphity.org> wrote:
> Hey all,
>
> I think APIs for common languages like Java and C# to extract XMP RDF
> from PDF Files/Streams would be much more helpful than standalone
> tools such as Paul mentions.
>
> I've looked at Adobe PDF Library SDK but none of the features mention metadata:
> http://www.adobe.com/devnet/pdf/library.html
>
>
> Martynas
> graphityhq.com
>
> On Mon, Jan 19, 2015 at 11:24 PM, Paul Houle <ontology2@gmail.com> wrote:
>> I just used Acrobat Pro to look at the XMP metadata for a standards document
>> (extra credit if you know which one) and saw something like this
>>
>> https://raw.githubusercontent.com/paulhoule/images/master/MetadataSample.PNG
>>
>> in this particular case this is fine RDF, just very little of it because
>> nobody made an effort to fill it in. The lack of a title is particularly
>> annoying when I am reading this document at the gym because it gets lost in
>> a maze of twisty filenames that all look the same,
>>
>> I looked at some financial statements and found that some were very well
>> annotated and some not at all. Acrobat Pro has a tool that outputs the data
>> in RDF/XML; I can't imagine it is hard to get this data out with third
>> party tools in most cases.
>>
>>
>> On Mon, Jan 19, 2015 at 2:36 PM, Larry Masinter <masinter@adobe.com> wrote:
>>>
>>> I just joined this list. I’m looking to help improve the story for Linked
>>> Open Data in PDF, to lift PDF (and other formats) from one-star to five,
>>> perhaps using XMP. I’ve found a few hints in the mailing list archive here.
>>> http://lists.w3.org/Archives/Public/public-lod/2014Oct/0169.html
>>> but I’m still looking. Any clues, problem statements, sample sites?
>>>
>>>
>>> Larry
>>> --
>>> http://larry.masinter.net
>>>
>>
>>
>>
>> --
>> Paul Houle
>> Expert on Freebase, DBpedia, Hadoop and RDF
>> (607) 539 6254 paul.houle on Skype ontology2@gmail.com
>> http://legalentityidentifier.info/lei/lookup
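
To illustrate the kind of extraction the thread asks for, here is a minimal sketch in Python using only the standard library. It is not the PDFBox or Adobe API. It assumes the XMP metadata stream is stored uncompressed and unencrypted, which is common but not guaranteed, so that the packet can be found by scanning the raw bytes for its `<?xpacket ...?>` wrapper. The function names `extract_xmp_packets` and `xmp_properties` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# An XMP packet is wrapped in "<?xpacket begin=...?>" and "<?xpacket end=...?>"
# processing instructions; capture what lies between them.
XMP_PACKET = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)


def extract_xmp_packets(pdf_bytes):
    """Return the body of every plain-text XMP packet found in the raw bytes.

    Assumption: the packet is not compressed or encrypted inside the file.
    """
    return [match.group(1) for match in XMP_PACKET.finditer(pdf_bytes)]


def xmp_properties(pdf_bytes):
    """Return (property name, value) pairs from every rdf:Description.

    Names are in ElementTree's "{namespace}local" form. Structured values
    such as rdf:Alt containers are flattened to their text content.
    """
    props = []
    for packet in extract_xmp_packets(pdf_bytes):
        root = ET.fromstring(packet.strip())
        for desc in root.iter("{%s}Description" % RDF_NS):
            # Simple properties may be serialized as XML attributes;
            # skip RDF's own attributes such as rdf:about.
            for name, value in desc.attrib.items():
                if not name.startswith("{%s}" % RDF_NS):
                    props.append((name, value))
            # Other properties are serialized as child elements.
            for child in desc:
                props.append((child.tag, "".join(child.itertext()).strip()))
    return props
```

Usage would look like `xmp_properties(open("document.pdf", "rb").read())`, where `document.pdf` is a placeholder filename. The captured packet body is RDF/XML, so it could also be handed to an RDF parser instead of being walked with ElementTree. A PDF-aware library is still needed for files whose metadata stream is filtered or encrypted.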
Received on Monday, 19 January 2015 22:36:07 UTC