
RE: Best citation format for accessibility

From: Bill Kasdorf <bkasdorf@apexcovantage.com>
Date: Tue, 22 Sep 2015 15:50:19 +0000
To: Bill McCoy <bmccoy@idpf.org>, Robin Berjon <robin@berjon.com>
CC: W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <CY1PR0601MB14229E31F68166CD711F573BDF450@CY1PR0601MB1422.namprd06.prod.outlook.com>
Virtually all citations in scholarly journals (we are talking about millions of them) originate in an XML format known as JATS (Journal Article Tag Suite, a NISO standard, previously and often still referred to generically as "NLM XML"). JATS has a very granular citation structure (two alternative approaches, one stricter than the other) that semantically delineates the components of a citation. This is essential to how CrossRef works; CrossRef is the citation resolution service used pretty much universally in scholarly publishing. A CrossRef DOI is registered for each paper; CrossRef provides a service that returns the proper DOI for a properly tagged citation, based on the metadata registered for the cited paper and the tagging in the citing paper; and the DOI (which CrossRef specifies should be expressed as an actionable link, a URI) goes in the citation when it is published so that it links to the cited paper.

I went into that detail because this modeling and infrastructure is almost universally used in scholarly publishing, which means that almost all citations are already semantically tagged at a fairly (and in some cases extremely) granular level, but in an XML model that is not HTML. My point is that, from an accessibility point of view, the components of a citation are generally already known; the challenge is getting them from JATS to HTML.
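To make the JATS-to-HTML step concrete, here is a minimal sketch in Python, not a definitive mapping. It assumes the stricter JATS model (element-citation) and that every field used below is present. The sample reference, its DOI, the HTML class names, and the helper names span and citation_to_html are all made up for illustration, and the https://doi.org/ prefix is one common way of writing a DOI as a link.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical JATS reference; the cited paper and its DOI are invented.
JATS = """
<ref id="r1">
  <element-citation publication-type="journal">
    <person-group person-group-type="author">
      <name><surname>Doe</surname><given-names>J</given-names></name>
      <name><surname>Roe</surname><given-names>R</given-names></name>
    </person-group>
    <article-title>An example article</article-title>
    <source>Journal of Examples</source>
    <year>2015</year>
    <volume>12</volume>
    <fpage>34</fpage>
    <lpage>56</lpage>
    <pub-id pub-id-type="doi">10.1234/example.5678</pub-id>
  </element-citation>
</ref>
"""


def span(cls, text):
    """Wrap one citation component in a span whose class names its role."""
    return f'<span class="{cls}">{escape(text)}</span>'


def citation_to_html(ref_xml):
    """Convert one JATS <ref> into an HTML list item (illustrative mapping)."""
    ref = ET.fromstring(ref_xml.strip())
    cit = ref.find("element-citation")

    # Each author keeps its own element so the components stay addressable.
    authors = ", ".join(
        span("author", f"{n.findtext('surname')} {n.findtext('given-names')}")
        for n in cit.iter("name")
    )
    pieces = [
        authors + ".",
        span("article-title", cit.findtext("article-title")) + ".",
        f"<cite>{escape(cit.findtext('source'))}</cite>.",
        span("year", cit.findtext("year")) + ";",
        span("volume", cit.findtext("volume")) + ":",
        span("pages", f"{cit.findtext('fpage')}-{cit.findtext('lpage')}") + ".",
    ]

    # Express the DOI, if tagged, as an actionable link.
    doi = next(
        (p.text for p in cit.iter("pub-id") if p.get("pub-id-type") == "doi"),
        None,
    )
    if doi:
        url = "https://doi.org/" + doi
        pieces.append(f'<a class="doi" href="{escape(url)}">{escape(url)}</a>')

    return f'<li id="{escape(ref.get("id"))}">' + " ".join(pieces) + "</li>"


if __name__ == "__main__":
    print(citation_to_html(JATS))
```

The point of the sketch is only that the granular tagging survives the conversion: each component ends up in its own element rather than in one undifferentiated text string. Which HTML elements and attributes are best for accessibility is exactly the open question in this thread.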

--Bill Kasdorf

From: Bill McCoy [mailto:bmccoy@idpf.org]
Sent: Tuesday, September 22, 2015 11:02 AM
To: Robin Berjon
Cc: W3C Digital Publishing IG
Subject: Re: Best citation format for accessibility

The EDUPUB profile of EPUB 3 defines a number of semantic superstructures, many of which are adapted from the DocBook XML schema, including "biblioentry". http://www.idpf.org/epub/profiles/edu/structure/ .

Whether the mapping of DocBook semantics to HTML5 via the EDUPUB profile is the "best" approach for citations is surely debatable, but I believe it will support the semantic-based content reformatting you posit and as such will be good for a11y (which has been a primary concern in the EDUPUB initiative).


On Tue, Sep 22, 2015 at 7:41 AM, Robin Berjon <robin@berjon.com> wrote:

Citations in scholarly publishing have a long history of at times
acrimonious disagreement over the exact format one should set them in.
There can be long arguments about the how and why of some specific
detail, but these are all about visual presentation. I have yet to hear
someone discuss the best format to use for the *content*, when in
digital form, such that it is most accessible.

By applying some technology, we can reformat a citation for visual
rendering. We can even make citation formatting follow readers'
preferences rather than publishers'. But when doing so the HTML-level
encoding of the citations should be optimised for semantic, non-visual

So my question is: has anyone given thought to what the best order of
content and best markup practices would be for optimally accessible citations?


Robin Berjon - http://berjon.com/ - @robinberjon


Bill McCoy
Executive Director
International Digital Publishing Forum (IDPF)
email: bmccoy@idpf.org
mobile: +1 206 353 0233

Received on Tuesday, 22 September 2015 15:50:54 UTC
