- From: Jodi Schneider <jodi.schneider@deri.org>
- Date: Sun, 17 Oct 2010 20:31:08 +0100
- To: Jim Pitman <pitman@stat.Berkeley.EDU>
- Cc: public-lld@w3.org, open-bibliography@lists.okfn.org
- Message-Id: <63957687-030A-4187-B748-D37A601B62E1@deri.org>
Thanks, Jim. Interesting use case. It's now on the LLD wiki at
http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Community_Information_Service

Best,
Jodi

On 17 Oct 2010, at 19:34, Jim Pitman wrote:

> Jodi,
>
> Here's another response to the call for use cases.
>
> many thanks for your assistance
>
> --Jim
> ----------------------------------------------
> Jim Pitman
> Director, Bibliographic Knowledge Network Project
> http://www.bibkn.org/
>
> Professor of Statistics and Mathematics
> University of California
> 367 Evans Hall # 3860
> Berkeley, CA 94720-3860
>
> ph: 510-642-9970 fax: 510-642-7892
> e-mail: pitman@stat.berkeley.edu
> URL: http://www.stat.berkeley.edu/users/pitman
> ----------------------------------------------
>
> === name ===
>
> Community Information Service
>
> === Owner ===
>
> Jim Pitman
> http://www.stat.berkeley.edu/~pitman/
>
> === Background and Current Practice ===
>
> Academic organizations of varying sizes (research groups, university departments,
> universities, university consortia, subject specific communities such as scholarly societies and special interest groups)
> have a strong interest in maintaining awareness and quality of information in their domain, and in openly publishing this
> information to the broader academic community and to the general public.
> A significant component of this information is bibliographic metadata available from library resources,
> especially information about books and articles published in a particular field, or associated with a particular
> institution.
> Current practice varies greatly. Many publishers and scholarly societies offer subscription-based A&I services which are paid
> for by libraries. Typical license agreements limit these services to "individual" use.
> This inhibits creative selection, remixing and republication of bibliographic metadata by interested individuals and organizations.
> Another service is provided by Google Scholar.
> But again, selective harvesting and reuse of the data is inhibited by terms of use.
>
> Most university departments and universities are unable to extract from their university library catalogs a list of all publications
> of their own faculty. Even if they could, they are typically not allowed to publish it without renegotiating license agreements with
> bibliographic metadata suppliers.
> A typical subject-specific interest group may be able to extract subject-specific bibliographic metadata from a variety of sources.
> But again, there is a high barrier to cross before the group can obtain clear rights to republish or remix such material.
> Essentially, the group has to acquire some legal identity, capable of making licensing agreements, before it can do so legally.
> Then the group has to find a business model capable of supporting some individual whose job it is to manage such agreements.
> This organizational overhead is unnecessary in a universe of linked data.
>
> === Goal ===
>
> Make library catalog and other publisher-generated bibliographic metadata freely available to community data curators so it is easily filtered by
> author/affiliation/subject/... to allow large numbers of small to medium sized academic communities to easily extract what data is of particular interest to them,
> with minimal technical and legal overhead, and to openly republish that data in ways they find worthwhile. For example, by selecting, ranking or
> classifying the data, and providing simple searches and faceted displays over bibliographic collections of special interest to the community.
>
> How to use linked data technology to achieve this goal: provide the data with an open license which allows its reuse for such purposes,
> and support the APIs, data standards and client software to lower the barrier to participation in information curation and sharing.
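The filtering by author/affiliation/subject named in the Goal could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Record` class, its field names, the `select` helper and the sample data are all hypothetical and are not taken from any system mentioned in this message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical bibliographic record; field names are illustrative only."""
    title: str
    authors: List[str]
    affiliations: List[str] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)


def select(records: List[Record],
           author: Optional[str] = None,
           affiliation: Optional[str] = None,
           subject: Optional[str] = None) -> List[Record]:
    """Keep records matching every criterion that was supplied (case-insensitive)."""
    def has(values: List[str], wanted: Optional[str]) -> bool:
        # A criterion left as None places no restriction on the record.
        return wanted is None or wanted.lower() in (v.lower() for v in values)

    return [r for r in records
            if has(r.authors, author)
            and has(r.affiliations, affiliation)
            and has(r.subjects, subject)]


# Made-up sample data standing in for a harvested catalog feed.
records = [
    Record("Example Book on Probability", ["A. Author"],
           ["Example University"], ["probability"]),
    Record("Example Article on Algebra", ["B. Writer"],
           ["Other Institute"], ["algebra"]),
]

probability_feed = select(records, subject="Probability")
print([r.title for r in probability_feed])
```

A curator would pass the selected records on to whatever ranking, classification or faceted display the community service provides.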
>
> === Target Audience ===
>
> Scholars as service providers: all those who edit, curate and arrange scholarly information for the purpose of making it openly
> accessible to a wide audience.
> Indirectly, the general public which may find subject-specific resources curated by scholars more informative
> than generic search services or Wikipedia.
> Computer programs, inasmuch as these may be used for tasks of filtering, deduplication, selection, ... to save the time of expert curators.
>
> === Use Case Scenario ===
>
> Curator of a community information service selects data from input sources to determine what books, articles, photographs, videos, ...
> were published recently which would be of interest to the community.
> Curator has input data available in such a way that they can easily control what is piped through to their information service.
>
> === Application of linked data for the given use case ===
>
> Make it easy for data providers (publishers, libraries, other aggregators) to provide linked data with suitable API and client software
> for community data curators to use.
> Curators should expect that bibliographic records come equipped with identifiers for all entities
> (editions, people, subjects, journals, publishers, ...) and that this information is easily loaded into some
> community managed CMS to allow remixing with whatever ranking/selection/faceting/... the community service may wish to provide.
>
> === Existing Work (optional) ===
>
> Most A&I services maintain some data ingest systems for these purposes. But they are usually proprietary, and not readily available for use by smaller agents with
> interests in biblio data curation. These mostly rely on converting raw publisher data into proprietary biblio formats for internal use, and licensing
> data to libraries in degraded formats for use by supplicant scholars. These services add no value to the universe of linked data, but rather compete with it.
> Some examples of software systems for open display of community curated bibliographic collections are
> BibSonomy, BibServer, BibApp, Open Scholar. All of these systems would benefit from easy
> availability of comprehensive linked library and publisher data via API.
> An example of a typical community website which would benefit greatly from integration with linked data is the Probability Web.
> See especially the lists of Books, People, and the link to the Probability Abstract Service, all of which could be
> recreated to both import and export linked data.
> There are more advanced services in other fields, especially RePEc (laudably open, but with large amounts of data whose license status is indeterminate)
> and SSRN (free but not open to reuse). Such large community services are typically built with an architecture that is difficult to replicate.
> What is needed is a simple and easily replicable architecture for community data curation services of various sizes to develop and interoperate.
> BKNpeople and VIVO are starts in this direction at the level of identifying people and their interests. Integration of
> such systems with the ORCID initiative will be important. See also the BKN Project.
>
> === Related Vocabularies ===
>
> BIBO, CiTO, ...
>
> === Problems and Limitations ===
>
> Reasons why this scenario is or may be difficult to achieve:
>
> Social/Economic/Legal
> -- vested interests in A&I services
> -- lack of suitably licensed metadata
> -- commercial publishers, universities and conservative scholarly societies refusing to release their metadata with an open license
>
> Technical obstacles:
> Lack of convergence towards a simple widely adopted standard for exchange of bibliographic metadata suitable for the community
> information service use case.
> The necessary data fields are little more than traditional bibtex fields, plus some conventions for handling entity identifiers and links.
> BibJSON is an attempt at an adequate lightweight data exchange standard, compatible with linked data principles,
> and influenced by the success of BibTeX and RePEc's Academic Metadata Format.
> This standard is easily managed and understood by typical community data service managers, even without advanced software tools.
> Providing and managing/adapting/maintaining good UIs for non-technical curators to manage BibJSON or similar record structures is the biggest technical challenge.
> Also, supporting the necessary CMS over which these UIs can operate.
> Needlebase shows promise of providing an adequate UI over a graphical datastore.
> This is proprietary software, but it should be configurable to import and export linked data. Such systems for managing simple editorial
> workflows over linked data are greatly needed.
>
> === Related Use Cases and Unanticipated Uses ===
>
> If simple and easily affordable editorial systems are developed for managing collections of biblio data, it is hard to anticipate
> which agents will emerge to provide the best services on various scales. Communities nest and overlap with each other. They
> compete for the attention of their members. If communities export their enhancements as linked data, this data may be consumed again by larger aggregators,
> especially Google and other big players, in ways which should greatly improve current means of search and discovery of academic information.
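The record structure described under Technical obstacles (BibTeX-style fields plus conventions for entity identifiers and links) might be sketched as follows. This is an assumption-laden illustration, not the BibJSON specification: the key names, the example.org URIs and the `entity_ids` helper are all invented for the example.

```python
import json

# Hypothetical record: BibTeX-style fields plus identifier links.
# Key names such as "id" are illustrative and NOT taken from the BibJSON spec;
# the example.org URIs are placeholders.
record = {
    "type": "article",
    "title": "An Example Article",
    "year": "2010",
    "author": [
        {"name": "A. Author", "id": "http://example.org/people/a-author"},
    ],
    "journal": {"name": "Example Journal", "id": "http://example.org/journals/ej"},
}


def entity_ids(value):
    """Recursively collect every "id" link found in a nested record."""
    if isinstance(value, dict):
        found = [value["id"]] if "id" in value else []
        for key, child in value.items():
            if key != "id":
                found.extend(entity_ids(child))
        return found
    if isinstance(value, list):
        return [link for child in value for link in entity_ids(child)]
    return []  # plain strings and numbers carry no identifiers


# The record serializes to plain JSON that a curator can read and edit by hand.
print(json.dumps(record, indent=2))
# The identifier links are what allow remixing with other linked data sources.
print(entity_ids(record))
```

Keeping identifiers as ordinary fields next to the familiar BibTeX-style ones is one way to stay readable without special tooling while still exposing links to people and journals.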
>
> === References ===
>
> Academic Metadata Format http://amf.openlib.org/doc/ebisu.html
> arXiv http://arxiv.org/
> BibServer http://bibserver.berkeley.edu/cgi-bin/bibs7?source=http://www.stat.berkeley.edu/users/pitman/bibserver.bib
> BibApp http://www.bibapp.org/
> BibJSON http://www.bibkn.org/bibjson/index.html
> BibTeX http://en.wikipedia.org/wiki/BibTeX
> BibSonomy http://www.bibsonomy.org/
> BIBO http://bibliontology.com/
> BKNpeople http://people.bibkn.org/
> BKN Project http://www.bibkn.org/
> CiTO, the Citation Typing Ontology, by David Shotton. http://dx.doi.org/10.1186/2041-1480-1-S1-S6
> Google Scholar http://scholar.google.com/
> Needlebase http://www.needlebase.com/
> Open Scholar http://scholar.harvard.edu/
> ORCID http://www.orcid.org/
> Probability Abstract Service http://pas.imstat.org/
> RePEc http://repec.org/
> SSRN http://www.ssrn.com/
> The Probability Web http://www.mathcs.carleton.edu/probweb/probweb.html
> VIVO http://www.vivoweb.org/
>
Received on Sunday, 17 October 2010 19:31:49 UTC