Re: [dpub] use case(s) for scholarly publishing from Heather Flanagan (RFC Series Editor) on 2016-03-22 (public-digipub-ig@w3.org from March 2016)

From: Heather Flanagan (RFC Series Editor) <rse@rfc-editor.org>
Date: Tue, 22 Mar 2016 06:44:45 -0700
To: public-digipub-ig@w3.org
Message-ID: <56F14C4D.7050508@rfc-editor.org>
On 3/18/16 10:36 AM, Siegman, Tzviya - Hoboken wrote:
>
> Hello DPUB,
>
>  
>
> Ivan has put together a set of use cases from the perspective of
> scholarly publishing. The two of us had some back and forth about the
> issues and thought we’d bring the discussion to the list.
>
>  
>
> Some overarching questions for Romain and Heather:
>
>  
>
> 1. Would you prefer that we leave this as one long and complex
> use case or break it into the dozen or so simple use cases that are
> reflected in the requirements section at the end?
>
> 2. Would you prefer that we keep the focus on scholarly
> publishing, even when the examples extend far beyond scholarly so that
> we demonstrate the real-world need?
>
>  
>

Hi all,

If we can break it down, I think that would be more practical in the
end. Starting with the big, complex use case is useful, but to implement
the requirements, we need to make it a bit more consumable.

Re: scholarly publishing, that probably captures the primary use case
for the PWP, but I'd like folks to think about where there might be
something unique out there that fits in more popular web-based
publishing. I'll likely add things for standards publishing as well.

-Heather

>  
>
> [[[
>
>  
>
> # Scientific publication use case/requirements
>
>  
>
> By scientific publication we mean scholarly communications, or
> collections thereof (i.e., proceedings, journal volumes, etc.) on the
> Web. Although related, scientific/STEM books are a different category
> and are not a particular focus in what follows.
>
>  
>
> ## Important aspects
>
>  
>
> ### Identity
>
>  
>
> Having a stable, _unique_ identity is a non-negotiable necessity for a
> scientific publication. This identity _MUST_ be independent of the
> state (offline, online, etc.), but also of the format (printed, online
> in HTML, PDF, Word, whatever). Although the existing identifier
> schemes (e.g., DOI) typically have a mapping onto a locator (typically a
> Web page), their primary role is to serve as unique identification,
> and their role as locators on the Web is mostly incidental and indirect.
>
>  
>
>  
>
> ### Metadata
>
>  
>
> Just like identity, a number of metadata items are essential for
> scientific publications. It is essential for that metadata to abide by
> the requirements of specific vocabularies, the identity of authors
> (e.g., the possibility of using ORCID), publications, etc. The
> metadata _structure_ must be as open as possible, making it easy to
> connect to various external databases and vocabularies.
>
>  
>
> Metadata should also be _searchable_. Whether search engines can find
> the metadata, and include the information therein in their
> specialized databases, largely determines whether a scientific
> publication becomes known or not. In practice, this means that the
> metadata should not be hidden in an archive (which search engines
> rarely consume) but should be available on the open Web.
>
>  
>
> ### Dynamic content
>
> TS: note effect on archive format and definition of scientific record
>
>  
>
> IH: And it indeed touches on archival issues of, say, programs, but I
> am not sure that is relevant for our use cases. WDYT?
>
>  
>
>
>
>
> The time when a scientific publication was, essentially, equal to a
> printed paper article (or its reproduction in PDF) is coming to an
> end. Scientific publications may include a collection of resources
> in different formats, of which textual content may be only one. A
> scientific publication today may include, or indeed _be_, a data
> set, scientific software, video, audio, etc. The future is an
> interrelated collection of resources that, _together_, form a
> logical unit that _is_ the scientific publication itself.
>
>  
>
> Note that a scientific publication may also include very dynamic
> content, e.g., a program/script that produces an interactive
> visualization of data, or that can be run by the reader to show some
> algorithm working, etc.
>
>  
>
> ### Consuming scientific publication
>
>  
>
> In practice, readers, users, etc., may want to, say, read a scientific
> publication under very different circumstances. It is a very
> widespread usage pattern for a scientist to read such a publication
> while commuting, i.e., being offline, using different devices
> (different aspect ratios, screen sizes, etc.). Having some sort of
> bookmarking to ensure smooth movement among devices is important.
>
>  
>
> This also means that the scientific publication should, as much as
> possible, be adaptable to the reading/consuming environment in terms
> of text size, some fundamental rendering aspects (one column or two,
> color usage, etc.). Whether a specific piece of software or hardware
> can perform certain dynamic features (e.g., execute a complex program)
> should also be taken into account; it should be possible to state in
> the communication's metadata/manifest which content is essential for
> the faithful rendering of the content and which content can have
> fallbacks.
>
>  
>
> TS: This is really about offlinification and personalization
>
>  
>
> Yes. But it also shows that 'offlinification' may be more than
> simply having all the resources (videos, audio, etc.) around; it is a
> more complex issue. The same goes for personalization.
>
>
>  
>
> ### Annotation
>
>  
>
> Annotation is another essential feature for scientific publications.
> Annotations come in many forms, from simple highlighting of a sentence
> in the text, to complex (and possibly highly formatted) attached
> content, containing mathematics, drawings, etc. 
>
>  
>
> Annotations play a role at various points in the publication usage;
> the most typical are, on the one hand, the peer-review system, which
> plays an essential part in the publication process, and, on the other
> hand, annotations on the final, published communication. Annotations
> of the latter kind (and, increasingly, of the former, too) are often
> made public. This means that annotations created by a reader offline
> should not only migrate as part of the publication itself when back
> online but, increasingly, should also be stored automatically in
> public annotation servers.
>
>  
>
>  
>
> ### Hierarchy of communications
>
>  
>
> Proceedings, article collections, and journals have lots of
> similarities to "simple" communications (need for identity, metadata,
> etc.), but have the additional feature of being a collection of
> communications that each have their own identity, too.
>
>  
>
> ## Derived requirements (draft)
>
>  
>
> * Separation of identity from locators
>
> * Possibility of using metadata that may be external
>
> * Metadata/manifest structure should be easily extensible and adaptable
>
> * Smooth transfer between different states (offline, online, etc.)
>
> * Identification of a 'bookmark' in a state-independent way
>
> * Identification of what is 'essential' content and what is not
>
> * Inclusion of many different media
>
> * Publications that may contain, practically, no textual information
> (e.g., scientific software as scientific publication)
>
> * Possibility for flexible annotations, not necessarily _included_ in
> the communication itself
>
> * Easy self-adaptation to the reading environment, user styling
>
> * Hierarchical view of portable publications?
>
>  
>
>  
>
> ]]]
>
>  
>
>  
>
>  
>
> *Tzviya Siegman*
>
> Digital Book Standards & Capabilities Lead
>
> Wiley
>
> 201-748-6884
>
> tsiegman@wiley.com <mailto:tsiegman@wiley.com>
>
>  
>
>  
>
>  
>
Received on Tuesday, 22 March 2016 13:45:19 UTC