Re: linked open data and PDF from Paul Houle on 2015-01-19 (public-lod@w3.org from January 2015)

From: Paul Houle <ontology2@gmail.com>
Date: Mon, 19 Jan 2015 17:24:19 -0500
To: Larry Masinter <masinter@adobe.com>
Cc: "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <CAE__kdSxy0uHHFm3BcBmNZO=n1c7-kENwHDbZM5PAXEO5mSY8w@mail.gmail.com>

I just used Acrobat Pro to look at the XMP metadata for a standards
document (extra credit if you know which one) and saw something like this:

https://raw.githubusercontent.com/paulhoule/images/master/MetadataSample.PNG

In this particular case this is fine RDF, just very little of it, because
nobody made an effort to fill it in.  The lack of a title is particularly
annoying when I am reading this document at the gym, because it gets lost in
a maze of twisty filenames that all look the same.

I looked at some financial statements and found that some were very well
annotated and some not at all.  Acrobat Pro has a tool that outputs the
data in RDF/XML; I can't imagine it is hard to get this data out with
third-party tools in most cases.
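
As a rough illustration of how little a third-party tool might need, here is
a minimal Python sketch.  It assumes the XMP packet is stored unfiltered
(uncompressed, unencrypted) in the PDF, so it can be found by scanning the
raw bytes for the <?xpacket ...?> wrapper; that is common but not guaranteed.
The function names extract_xmp_packets and dc_titles are made up for this
example and are not part of any Adobe or PDF library.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# An XMP packet is wrapped in <?xpacket begin=...?> ... <?xpacket end=...?>.
# Assumption: the metadata stream is not compressed or encrypted.
XPACKET_RE = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)

DC_TITLE = "{http://purl.org/dc/elements/1.1/}title"
RDF_LI = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}li"


def extract_xmp_packets(data: bytes) -> List[str]:
    """Return the RDF/XML body of every XMP packet found in the raw bytes."""
    return [
        m.group(1).decode("utf-8", errors="replace").strip()
        for m in XPACKET_RE.finditer(data)
    ]


def dc_titles(packet: str) -> List[str]:
    """Return the dc:title values in one packet (empty list if none is set)."""
    root = ET.fromstring(packet)
    return [
        li.text
        for title in root.iter(DC_TITLE)
        for li in title.iter(RDF_LI)
        if li.text
    ]
```

Usage would be along the lines of reading the file with
open(path, "rb").read(), passing the bytes to extract_xmp_packets, and then
calling dc_titles on each packet to see whether anyone filled in a title.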

On Mon, Jan 19, 2015 at 2:36 PM, Larry Masinter <masinter@adobe.com> wrote:

> I just joined this list. I’m looking to help improve the story for Linked
> Open Data in PDF, to lift PDF (and other formats) from one-star to five,
> perhaps using XMP. I’ve found a few hints in the mailing list archive here.
> http://lists.w3.org/Archives/Public/public-lod/2014Oct/0169.html
> but I’m still looking. Any clues, problem statements, sample sites?
>
> Larry
> --
> http://larry.masinter.net
>
>

-- 
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254    paul.houle on Skype   ontology2@gmail.com
http://legalentityidentifier.info/lei/lookup

Received on Monday, 19 January 2015 22:24:46 UTC