RE: [open-bibliography] Call for Use Cases: Library Linked Data

Here's another response to the call for use cases.

> -----Original Message-----
> From: open-bibliography-bounces@lists.okfn.org
> [mailto:open-bibliography-bounces@lists.okfn.org] On Behalf Of William Waites
> Sent: 07 September 2010 21:51
> To: List for Working Group on Open Bibliographic Data
> Subject: [open-bibliography] Call for Use Cases: Library Linked Data
>
> W3C Library Linked Data Incubator Group -
> http://www.w3.org/2005/Incubator/lld/
>
> Call for Use Cases: Library Linked Data
>
>
> ================================================================
>
> === Name ===
>
> A short name by which we can refer to the use case in discussions.

Talis Prism 3

> === Owner ===
>
> The contact person for this use case.

Phil John <phil.john@talis.com>

> === Background and Current Practice ===
>
> Where this use case takes place in a specific domain, and so requires
> some prior information to understand, this section is used to describe
> that domain. As far as possible, please put explanation of the domain
> in here, to keep the scenario as short as possible. If this scenario is
> best illustrated by showing how applying technology could replace
> current existing practice, then this section can be used to describe
> the current practice. Often, the key to why a use case is important
> also lies in what problem would occur if it was not achieved, or what
> problem means it is hard to achieve.

Talis Prism 3 is a next-generation OPAC/search and discovery interface. We need to offer a rich interface to surface the large volume of content available in libraries. Browsing by entities such as author, subject and series is important, as is the reliable extraction of data from MARC 21 into a linked data model.

In some existing systems, the data is handled as literal text, with no notion of concrete entities such as a particular author, their relation to other authors (e.g. pseudonyms), or their actual contribution (e.g. editor or illustrator).
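
To make the contrast concrete, here is a minimal sketch in Python. It is not how Prism 3 or any particular system is implemented; the record layout, the `person/...` identifiers and the function names are all assumptions made up for illustration.

```python
# Illustrative data only: record layout and identifiers are hypothetical.

# 1. Literal model: the author is just a string on each record.
literal_records = [
    {"title": "The Running Man", "author": "Bachman, Richard"},
    {"title": "The Shining", "author": "King, Stephen, 1947-"},
    {"title": "Carrie", "author": "King, Stephen"},
]

def by_author_literal(records, name):
    """Exact string match: variant spellings and pseudonyms are missed."""
    return [r["title"] for r in records if r["author"] == name]

# 2. Entity model: each person has an identifier, relations between
#    people are explicit, and each contribution carries a role.
people = {
    "person/1": {"name": "Stephen King", "pseudonym_of": None},
    "person/2": {"name": "Richard Bachman", "pseudonym_of": "person/1"},
}
contributions = [
    ("The Running Man", "person/2", "author"),
    ("The Shining", "person/1", "author"),
    ("Carrie", "person/1", "author"),
]

def by_author_entity(person_id):
    """Titles by a person, including those published under pseudonyms."""
    ids = {person_id} | {
        pid for pid, p in people.items() if p["pseudonym_of"] == person_id
    }
    return sorted(title for title, pid, _role in contributions if pid in ids)
```

With the literal model, `by_author_literal(literal_records, "King, Stephen")` finds only one of the three titles; with the entity model, `by_author_entity("person/1")` finds all three.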

> === Goal ===
>
> Two short statements stating (1) what is achieved in the scenario
> without reference to linked data, and (2) how we use linked data
> technology to achieve this goal.

1. A rich search and discovery interface for libraries
2. Prism 3 is powered by the Talis Platform, a hosted linked data service which offers both SPARQL querying and powerful full-text search capabilities. We’ve made certain parts of the bibliographic data first-class entities (currently author, with title and subject to follow); this will allow us to offer multiple entry points into the data, either by search and faceting, or by browsing through author and subject indexes. By making title and author first-class entities we can also identify alternate versions of resources and allow that information to be discovered, e.g. finding all editions of a book or all versions of a work (book, film, theatrical adaptation etc.). We are also looking to link to data from external resources such as DBpedia for author/person biographies and MusicBrainz for track listings/artist information.
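
As a rough sketch of the "all versions of a work" idea, the Python below stores a few triples and matches them against a pattern, in the spirit of a SPARQL basic graph pattern. The `ex:` identifiers and property names (`ex:manifestationOf`, `ex:format`) are hypothetical, not the vocabulary actually used in Prism 3, and a real deployment would query a triple store rather than a Python set.

```python
# Hypothetical triples: (subject, predicate, object). All ex: names are
# invented for this sketch.
TRIPLES = {
    ("ex:work/hamlet", "ex:title", "Hamlet"),
    ("ex:book/1", "ex:manifestationOf", "ex:work/hamlet"),
    ("ex:book/1", "ex:format", "book"),
    ("ex:book/2", "ex:manifestationOf", "ex:work/hamlet"),
    ("ex:book/2", "ex:format", "book"),
    ("ex:film/1", "ex:manifestationOf", "ex:work/hamlet"),
    ("ex:film/1", "ex:format", "film"),
}

def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def versions_of(work, fmt=None):
    """All items that manifest a work, optionally limited to one format."""
    items = [s for s, _p, _o in match(p="ex:manifestationOf", o=work)]
    if fmt is not None:
        items = [i for i in items if match(s=i, p="ex:format", o=fmt)]
    return sorted(items)
```

Because the work is an entity in its own right, the same lookup serves both "all editions of the book" (`fmt="book"`) and "all versions in any medium" (no format filter).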

> === Target Audience ===
>
> The main audience of your case. For example scholars, the general
> public, service providers, archivists, computer programs...

Public and academic libraries and their users.

> === Use Case Scenario ===
>
> The use case scenario itself, described as a story in which actors
> interact with systems. This section should focus on the user needs in
> this scenario. Do not mention technical aspects and/or the use of
> linked data.

Users can:

• Search for books/resources by simple keyword, targeted index (author, ISBN, title etc.) or a combination of both; Boolean operators and groupings are also supported
• Refine their results using facets
• View item level details
• Expand their search by browsing to subjects/authors contained in those records, e.g. find all other books written or contributed to by this author, find all books that are about this author, find all books about World War 2 etc.
• Enjoy serendipitous discovery of other items they may not have been searching for, but would find interesting nonetheless
• Receive intelligent suggestions for broad searches, e.g. searching for Shakespeare should offer a link to a biographical page about him, as well as returning lists of books that satisfy the query.

We also want to enable developers to build applications and mashups on top of the bibliographic data.

> === Application of linked data for the given use case ===
>
> This section describes how linked data technology could be used to
> support the use case above. Try to focus on linked data on an abstract
> level, without mentioning concrete applications and/or vocabularies.
> Hint: Nothing library domain specific.

By promoting as many parts of the record as possible to first-class entities, opening up a browsable interface becomes much easier. Describing all resources attached to a given resource is very powerful and provides far better results than full-text searching alone. It also allows richer integration with other data sources.
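
One way to picture "describing all resources attached to a given resource" is a function that collects every statement in which that resource appears, in either direction. The sketch below assumes a small in-memory list of triples; the `ex:` identifiers and property names are hypothetical.

```python
from collections import defaultdict

# Hypothetical triples; identifiers and property names are invented.
TRIPLES = [
    ("ex:book/1", "ex:creator", "ex:person/1"),
    ("ex:book/2", "ex:creator", "ex:person/1"),
    ("ex:book/3", "ex:about", "ex:person/1"),
    ("ex:person/1", "ex:name", "Jane Example"),
]

def describe(resource):
    """Group the statements about a resource by property.

    Returns (outgoing, incoming): outgoing maps each property to the
    values the resource points at, incoming maps each property to the
    resources that point at it.
    """
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for s, p, o in TRIPLES:
        if s == resource:
            outgoing[p].append(o)
        if o == resource:
            incoming[p].append(s)
    return dict(outgoing), dict(incoming)
```

A browse page for `ex:person/1` can then be built directly from the incoming links: the resources under `ex:creator` are works by the person, and those under `ex:about` are works about them.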

> === Existing Work (optional) ===
>
> This section is used to refer to existing technologies or approaches
> which achieve the use case (Hint: Specific approaches in the library
> domain). It may especially refer to running prototypes or applications.

Some other OPACs appear to use full-text search features to power a browsing interface, e.g. synonym searching for lists of related items/terms. Many traditional systems also use relational databases as their underlying data storage layer.

> === Related Vocabularies (optional) ===
>
> Here you can list and clarify the use of vocabularies (element sets and
> value vocabularies) which can be helpful and applied within this
> context.

FOAF - basic person (author/character) information
Bibliontology - bulk of the bibliographic record
Dublin Core - more general bibliographic elements (title, creator etc.) as well as medium/format
BIO - biographical information about people (birth/death date)
Music Ontology - audio/musical catalogue items
Organizational Ontology - corporate authorship
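
For illustration only, here is how a few of these vocabularies might be combined to describe one book and its author. The choice of properties, the `http://example.org/` base and the sample values are assumptions for this sketch and are not taken from the Prism 3 data model; the Python simply expands prefixed names and prints N-Triples-style lines.

```python
# Namespace URIs for the vocabularies used in this sketch.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dct": "http://purl.org/dc/terms/",
    "bibo": "http://purl.org/ontology/bibo/",
    "bio": "http://purl.org/vocab/bio/0.1/",
}
BASE = "http://example.org/"  # hypothetical base for local resources

# Hypothetical description: the values and the modelling are illustrative.
description = [
    ("book/1", "rdf:type", "bibo:Book"),
    ("book/1", "dct:title", '"An Example Title"'),
    ("book/1", "dct:creator", "person/1"),
    ("person/1", "rdf:type", "foaf:Person"),
    ("person/1", "foaf:name", '"Jane Example"'),
    ("person/1", "bio:event", "person/1/birth"),
    ("person/1/birth", "rdf:type", "bio:Birth"),
    ("person/1/birth", "bio:date", '"1900"'),
]

def term(value):
    """Render a literal, a prefixed name or a local path as an RDF term."""
    if value.startswith('"'):
        return value                      # already a quoted literal
    if ":" in value:
        prefix, local = value.split(":", 1)
        return "<" + PREFIXES[prefix] + local + ">"
    return "<" + BASE + value + ">"

def to_ntriples(triples):
    """Serialise the triples as N-Triples-style lines."""
    return [f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples]

if __name__ == "__main__":
    print("\n".join(to_ntriples(description)))
```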

> === Problems and Limitations (optional) ===
>
> This section lists reasons why this scenario is or may be difficult to
> achieve, including pre-requisites which may not be met, technological
> obstacles etc. Please explicitly list here the technical challenges
> made apparent by this use case. This will aid in creating a roadmap to
> overcome those challenges.

The linked data aspect of the project allows us great flexibility. There have been a few cases where finding the correct ontology, class or property to use has been problematic, but this can be overcome either by working with the ontology creators to broaden an existing ontology, or by defining our own.

The biggest technical challenge has been extracting information from MARC 21 records, as much of the content consists of literals. Promoting authors to first-class objects requires the use of external data sources such as authority files (which we’ve also modelled as linked data). The same hurdles will need to be overcome when we start dealing with subjects as entities.
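
The sketch below gives a feel for the problem; it is not our actual extraction pipeline. It assumes a MARC 21 personal name field (100 or 700) has already been parsed into a list of (subfield code, value) pairs, and it looks the heading up in a toy authority table. The table contents, the URI and the default role are assumptions made for illustration.

```python
# Toy authority file: (normalised name, normalised dates) -> entity URI.
# The entry and URI are hypothetical.
AUTHORITY = {
    ("king, stephen", "1947-"): "http://example.org/person/1",
}

def normalise(text):
    """Strip trailing MARC punctuation and fold case for matching."""
    return text.strip().rstrip(",.").strip().lower()

def promote_author(field):
    """Turn a parsed 100/700 field into an entity reference.

    Uses subfield a (personal name), d (dates associated with the name)
    and e (relator term). Returns the matched URI, or None if the
    heading is not in the authority table and must stay a literal.
    """
    raw = {}
    for code, value in field["subfields"]:
        raw.setdefault(code, value)       # keep the first occurrence
    key = (normalise(raw.get("a", "")), normalise(raw.get("d", "")))
    return {
        "uri": AUTHORITY.get(key),
        "literal": raw.get("a", "").strip(),
        # Assumption: with no relator term, treat the person as author.
        "role": normalise(raw.get("e", "author")),
    }
```

A field such as `{"tag": "100", "subfields": [("a", "King, Stephen,"), ("d", "1947-")]}` resolves to the entity, while a heading that is absent from the authority table comes back with `uri` set to `None` and has to remain a literal until a match can be found.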

There is also the problem of augmenting the data we already have: one example is birth/death information for authors, which isn't always catalogued in a MARC 21 record; this necessitates the use of other resources that can fill in missing details. With current systems, keeping this external data alongside the catalogue data in a single store is important for quick responses to queries and robust continuation of service, so we have to deal with the issue of keeping this data current - if an author dies we need to track this event and update our store, preferably in an automated fashion.
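
A minimal sketch of such an automated refresh, assuming the local store and the external source can both be read as dictionaries keyed by author URI (the URIs, property names and function name are hypothetical):

```python
def refresh(local, external):
    """Copy new or changed birth/death values from an external source.

    Both arguments map an author URI to a dict of properties. The local
    store is updated in place and the list of changes is returned as
    (uri, property, old_value, new_value) tuples.
    """
    changes = []
    for uri, fresh in external.items():
        known = local.setdefault(uri, {})
        for prop in ("birth", "death"):
            value = fresh.get(prop)
            if value is not None and known.get(prop) != value:
                changes.append((uri, prop, known.get(prop), value))
                known[prop] = value
    return changes
```

Running this on a schedule would pick up, say, a newly recorded death date the next time the external source is polled; a second run with unchanged data returns an empty list, so the change log doubles as a cheap "did anything happen?" check.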

> === Related Use Cases and Unanticipated Uses (optional) ===
>
> The scenario above describes a particular case of using linked data.
> However, by allowing this scenario to take place, the likely solution
> allows for other use cases. This section captures unanticipated uses of
> the same system apparent in the use case scenario.

We have made a conscious decision to include a Linked Data API in Talis Prism 3, scheduled for release early next year. By surfacing the linked data in a machine-readable way through this interface we’re hoping to enable unanticipated uses as developers harness the data in other applications.

Some example use cases may include:

• Surfacing library data in a mobile application
• Allowing aggregation of many institutions' holdings into a shared union catalogue
• Enabling resource/reading list systems to automatically harvest information to populate their interfaces (one example would be Talis Aspire, a linked data resource list management system)
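
As one hedged illustration of the union catalogue idea: if each institution exposes its records with a shared identifier for the item, merging holdings becomes a simple grouping step. The record layout, the `id` key and the institution names below are assumptions for this sketch, not part of any Talis API.

```python
from collections import defaultdict

def union_catalogue(catalogues):
    """Merge per-institution records into one entry per shared identifier.

    `catalogues` maps an institution name to a list of records, each a
    dict with hypothetical "id" and "title" keys.
    """
    union = defaultdict(lambda: {"title": None, "held_by": []})
    for institution, records in sorted(catalogues.items()):
        for record in records:
            entry = union[record["id"]]
            entry["title"] = entry["title"] or record["title"]
            entry["held_by"].append(institution)
    return dict(union)

# Example input with made-up institutions and identifiers.
catalogues = {
    "Library A": [{"id": "ex:work/1", "title": "An Example Title"}],
    "Library B": [
        {"id": "ex:work/1", "title": "An Example Title"},
        {"id": "ex:work/2", "title": "Another Title"},
    ],
}
```

The grouping only works this simply when institutions agree on identifiers, which is exactly what promoting titles and authors to shared entities is meant to provide.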

> === References (optional) ===
>
> This section is used to refer to cited literature and quoted websites.

Talis Prism 3: http://www.talis.com/prism/
Talis Platform: http://www.talis.com/platform/
Talis Aspire: http://www.talis.com/aspire/
MARC 21: http://www.loc.gov/marc/


Received on Friday, 15 October 2010 14:25:43 UTC