- From: RebholzSchuhmann <d.rebholz.schuhmann@gmail.com>
- Date: Tue, 07 May 2013 09:32:06 +0100
- To: beyond-the-pdf@googlegroups.com
- CC: Steve Pettifer <steve.pettifer@manchester.ac.uk>, Leonard Rosenthol <lrosenth@adobe.com>, Linking Open Data <public-lod@w3.org>, SW-forum <semantic-web@w3.org>
- Message-ID: <5188BC06.9070600@gmail.com>
Hi,

I have seen similar discussions before. I guess we are looking at two different use cases:

(1) PDF: layout-oriented, but could (and hopefully will) carry a lot more semantic information. Its key achievement is and will be optimal layout; on the other hand, the overhead for processing / exploitation / reuse goes up for everybody who is NOT PDF-savvy.

(2) The other open formats (HTML, XML, PDF): these allow straightforward exploitation, processing, and enrichment, and stand for the spirit of the open web and reuse of data.

Listening to publishers, layout certainly matters. I am not only talking about the big five or ten who would have the resources to go in a different direction; I am talking about the 1,000 smaller publishers who have to serve their communities. They would struggle more to comply with the other "standards" and still deliver an appealing product.

I guess some clever thinking and collaborative work is required to bring both together.

Hope this helps.

-drs-

On 07/05/2013 09:17, Steve Pettifer wrote:
>> I assume most authors don't actually format their documents by selecting a font size for every single heading and so on.
> This is a tempting assumption to make, especially if you come from computer science / maths / physics and related disciplines (as I do). But my experience in the life sciences is that authors do 'paint' their manuscripts by hand, painstakingly selecting the font and format for every bit of their document. Even using the 'semantic' features of word processors (such as 'Heading 1') is something that's not commonplace. So before we get too carried away with expecting people to write HTML / LaTeX or even markup, we'll need to take into account the working practices of the vast majority of academics outside of the more 'semantically aware' bits of science.
>
>> They work in a format that utilizes semantically meaningful information about the work: to identify a title, headings, math blocks, illustrations, plots, etc.
>
> No, they really don't. I wish they did. But, outside of a certain area of science, they don't.
>
> Steve
>

--
D. Rebholz-Schuhmann - mailto:d.rebholz.schuhmann@gmail.com
Received on Tuesday, 7 May 2013 08:32:52 UTC