Re: linked open data and PDF from Martynas Jusevičius on 2015-01-19 (public-lod@w3.org from January 2015)

From: Martynas Jusevičius <martynas@graphity.org>
Date: Mon, 19 Jan 2015 23:31:46 +0100
To: Paul Houle <ontology2@gmail.com>
Cc: Larry Masinter <masinter@adobe.com>, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <CAE35VmxRXZsw1ebROv1E+0sdtEZbK4LOvJ0_qN0r2gK1SoawOA@mail.gmail.com>

Hey all,

I think APIs for common languages like Java and C# that extract XMP RDF
from PDF files or streams would be much more helpful than standalone
tools such as the one Paul mentions.

I've looked at the Adobe PDF Library SDK, but none of its listed features mention metadata:
http://www.adobe.com/devnet/pdf/library.html
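
To make the kind of API I have in mind concrete, here is a minimal sketch
in Python (the same byte scan should port to Java or C#). It is not based
on any Adobe library. It assumes the XMP packet is stored uncompressed and
unencrypted in the file and is UTF-8 encoded, so packets inside compressed
or encrypted streams will be missed. The function name extract_xmp_packets
is made up for illustration.

```python
import re
from typing import BinaryIO, List

# An XMP packet is wrapped in <?xpacket begin=...?> and <?xpacket end=...?>
# processing instructions. This pattern captures the content between them.
_XPACKET = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>",
    re.DOTALL,
)


def extract_xmp_packets(stream: BinaryIO) -> List[str]:
    """Return the body of every plain-text XMP packet found in the stream.

    Hypothetical helper: it scans raw bytes and does not parse the PDF
    object structure, so it only sees packets that are not compressed
    or encrypted (an assumption, not a guarantee for every PDF).
    """
    data = stream.read()
    return [
        match.group(1).decode("utf-8", errors="replace").strip()
        for match in _XPACKET.finditer(data)
    ]
```

Each returned string should be the x:xmpmeta element wrapping rdf:RDF,
which could then be handed to an RDF/XML parser. A real API would read
the PDF's metadata stream properly instead of scanning bytes.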


Martynas
graphityhq.com

On Mon, Jan 19, 2015 at 11:24 PM, Paul Houle <ontology2@gmail.com> wrote:
> I just used Acrobat Pro to look at the XMP metadata for a standards document
> (extra credit if you know which one) and saw something like this:
>
> https://raw.githubusercontent.com/paulhoule/images/master/MetadataSample.PNG
>
> In this particular case this is fine RDF, just very little of it, because
> nobody made an effort to fill it in. The lack of a title is particularly
> annoying when I am reading this document at the gym, because it gets lost in
> a maze of twisty filenames that all look the same.
>
> I looked at some financial statements and found that some were very well
> annotated and some not at all. Acrobat Pro has a tool that outputs the data
> in RDF/XML; I can't imagine it is hard to get this data out with third-party
> tools in most cases.
>
>
> On Mon, Jan 19, 2015 at 2:36 PM, Larry Masinter <masinter@adobe.com> wrote:
>>
>> I just joined this list. I’m looking to help improve the story for Linked
>> Open Data in PDF, to lift PDF (and other formats) from one-star to five,
>> perhaps using XMP. I’ve found a few hints in the mailing list archive here.
>> http://lists.w3.org/Archives/Public/public-lod/2014Oct/0169.html
>> but I’m still looking. Any clues, problem statements, sample sites?
>>
>> Larry
>> --
>> http://larry.masinter.net
>>
>
>
>
> --
> Paul Houle
> Expert on Freebase, DBpedia, Hadoop and RDF
> (607) 539 6254    paul.houle on Skype   ontology2@gmail.com
> http://legalentityidentifier.info/lei/lookup

Received on Monday, 19 January 2015 22:32:14 UTC