Re: linked open data and PDF from Sarven Capadisli on 2015-01-19 (public-lod@w3.org from January 2015)

From: Sarven Capadisli <info@csarven.ca>
Date: Mon, 19 Jan 2015 21:20:12 +0100
To: Larry Masinter <masinter@adobe.com>, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <54BD66FC.1010202@csarven.ca>
On 2015-01-19 20:36, Larry Masinter wrote:
> I just joined this list. I’m looking to help improve the story for Linked Open Data in PDF, to lift PDF (and other formats) from one-star to five, perhaps using XMP. I’ve found a few hints in the mailing list archive here.
> http://lists.w3.org/Archives/Public/public-lod/2014Oct/0169.html
> but I’m still looking. Any clues, problem statements, sample sites?
>
> Larry
> --
> http://larry.masinter.net
>

Hi Larry,

First off, I fully acknowledge your interest in improving the state of
things for PDF.

I'm happy to be proven wrong, but for the "big picture", I don't
believe that LaTeX/XMP/PDF is the way to go for LD-friendly documents -
perhaps those efforts are better invested elsewhere. There are a number
of issues and shortcomings with the PDF approach which in the end will
not play well with what the Web is intended to be, nor with how it
functions. Most importantly, it is not fault tolerant, it is not
machine-friendly (regardless of what can be stuffed into XMP), and it
will not scale. At the end of the day, PDF is a silo-document, its
rendering is a resource-hog on different devices, and it does not offer
a ubiquitous reading/interactive experience across devices.

For XMP/PDF to work, I presume you are going to end up dealing with
RDF/XML, and an appropriate interface for authors to mark up their
statements with. Keep in mind that this will most likely treat the data
as a separate island, dissociated from the context in which it appears.
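
As a rough illustration of the "separate island" point: an XMP packet is
an RDF/XML document wrapped in xpacket processing instructions and
embedded in the file's bytes, apart from the page content. The sketch
below pulls such a packet out of a byte string and parses it. The sample
bytes and the function name find_xmp_packet are made up for illustration,
and the byte scan is only a heuristic that assumes the metadata stream is
stored uncompressed; it is not a substitute for a real PDF parser.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# An XMP packet sits between <?xpacket begin=...?> and <?xpacket end=...?>.
_XPACKET = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)


def find_xmp_packet(pdf_bytes: bytes) -> Optional[str]:
    """Return the first XMP packet body found in the bytes, or None.

    Heuristic only: assumes the metadata stream is not compressed.
    """
    match = _XPACKET.search(pdf_bytes)
    if match is None:
        return None
    return match.group(1).decode("utf-8", errors="replace").strip()


# Hypothetical stand-in for a PDF file; only the metadata part is realistic.
sample = (
    b"%PDF-1.4\n...page content lives elsewhere in the file...\n"
    b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">\n'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">\n'
    b'<rdf:Description rdf:about=""'
    b' xmlns:dc="http://purl.org/dc/elements/1.1/">\n'
    b"<dc:format>application/pdf</dc:format>\n"
    b"</rdf:Description>\n</rdf:RDF>\n</x:xmpmeta>\n"
    b'<?xpacket end="w"?>\n%%EOF\n'
)

packet = find_xmp_packet(sample)
root = ET.fromstring(packet)  # the packet body is plain RDF/XML
print(root.tag)
```

Note that nothing in the packet points at a particular sentence or
section of the rendered pages; the statements describe the file as a
whole unless the author adds that linkage by other means.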


May I invite you to read:

http://csarven.ca/enabling-accessible-knowledge

It covers my position in sufficient depth - not intended to be overly 
technical, but rather covering the ground rules and ongoing work.

While you are at it, please do a quick print-view from your Web browser 
(preferably in Firefox) or print to PDF.

The RDF bits are visible here:

http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fcsarven.ca%2Fenabling-accessible-knowledge&rdfa_lite=false&vocab_expansion=false&embedded_rdf=true&validate=yes&space_preserve=true&vocab_cache_report=false&vocab_cache_bypass=false
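
For readers who want to point the same extractor at another page, here
is a minimal sketch of how the URL above is put together. The endpoint
and the parameter names and values are copied from that URL; the
function name pyrdfa_extract_url is just a label chosen here. The code
only builds the string and makes no request, so it assumes nothing about
whether the service is reachable.

```python
from urllib.parse import urlencode

# Endpoint and parameters taken from the extraction URL quoted above.
PYRDFA_EXTRACT = "http://www.w3.org/2012/pyRdfa/extract"


def pyrdfa_extract_url(document_uri: str) -> str:
    """Build a pyRdfa extraction URL for the given document URI."""
    params = [
        ("uri", document_uri),  # percent-encoded by urlencode
        ("rdfa_lite", "false"),
        ("vocab_expansion", "false"),
        ("embedded_rdf", "true"),
        ("validate", "yes"),
        ("space_preserve", "true"),
        ("vocab_cache_report", "false"),
        ("vocab_cache_bypass", "false"),
    ]
    return PYRDFA_EXTRACT + "?" + urlencode(params)


url = pyrdfa_extract_url("http://csarven.ca/enabling-accessible-knowledge")
print(url)
```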

I will spare you the details on what's going on there, unless you really
want to know, but to put it in a nutshell: it covers statements dealing
with sections, provenance, references/citations, and so on.

Here is another paper: http://linked-reseach.270a.info/ (which can just
as well be a PDF - after all, PDF is just a view), which in addition to
the above, includes more atomic things like hypotheses, variables,
workflows, and so on.


The work is based on Linked Research:

https://github.com/csarven/linked-research

If you are comfortable with your browser's developer toolbar, try 
changing the stylesheet lncs.css in <head> to acm.css.
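
The same swap can be sketched outside the browser. The snippet below
rewrites the stylesheet reference in an HTML string and leaves everything
else alone, which is the point of keeping presentation in its own layer.
The sample page, the css/ path, and the function name swap_stylesheet are
assumptions for illustration; only the file names lncs.css and acm.css
come from the message above.

```python
import re


def swap_stylesheet(html: str, old: str = "lncs.css", new: str = "acm.css") -> str:
    """Point <link href="...old"> at `new`; the rest of the markup is untouched."""
    pattern = re.compile(
        r'(<link\b[^>]*\bhref="[^"]*?)' + re.escape(old) + r'(")'
    )
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), html)


# Hypothetical page skeleton, not the actual template.
page = (
    "<html><head>"
    '<link rel="stylesheet" href="css/lncs.css" media="all"/>'
    "</head><body><article><h1>Title</h1><p>Same content.</p></article>"
    "</body></html>"
)

restyled = swap_stylesheet(page)
print(restyled)
```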

There is a whole behavioural/interactive layer which I'll skip over for
now, but you can take a look at it if you fancy JavaScript.

As you may have already noticed, the HTML template is flexible enough 
for "blog" posts and "papers" - again, this is about separating the 
structure/content from the other layers: presentation, and behaviour.


Any feedback or questions are always welcome!

-Sarven
http://csarven.ca/#i
Received on Monday, 19 January 2015 20:20:46 UTC