In defence of doc-*ref roles from Silvio Peroni on 2015-11-17 (public-dpub-aria@w3.org from November 2015)

From: Silvio Peroni <silvio.peroni@unibo.it>
Date: Tue, 17 Nov 2015 12:12:39 +0100
To: <public-dpub-aria@w3.org>
Message-ID: <EC13E94A-B78D-45E4-8461-23B7A3EBDCB5@unibo.it>
Hi all,

What follows are just my personal thoughts on the subject, which have matured as a consequence of my past work on markup languages for (Italian) publishers and my ongoing work on RASH (http://github.com/essepuntato/rash).

I've just finished reading the new editor's draft of

[DPub] Digital Publishing WAI-ARIA Module 1.0, W3C Editor's Draft 17 November 2015, https://rawgit.com/w3c/aria/master/aria/dpub.html

and I've found several open issues related to the use of the doc-*ref roles (e.g., doc-biblioref and doc-noteref). As far as I understand from previous communications, one option would be to remove these roles in favour of plain links. While I can see the reason behind this choice (for instance, reducing the number of roles in [DPub] to the minimum needed), I'm not sure that removing them would be a good approach, considering that the aim of [DPub] is to provide an ontology of roles for use within the digital publishing industry, and that cross-references are part of any published document (at least in the scholarly domain).

To me, the actual "intention" of someone who creates a (cross-)reference to a typical sub-component (e.g., a bibliographic entry, a section, a footnote, a figure) of a digital publishing document is radically different from the intention behind creating a link, even an internal one.

From a pure markup perspective, a "cross-reference" is just a pointer to a certain internal document sub-component, and the way it is rendered within the document is usually a matter for the presentation layer rather than the content layer. For instance, in LaTeX this is done by means of the command "\ref{}" (and "\cite{}" for references to bibliographic entries), while in DocBook the element "xref" is used. Of course, they are a sort of link, but they convey a specific semantics: that of cross-referencing sub-components of a document.

For common links, instead, such markup languages usually take a different approach ("\url{}" in LaTeX and the element "link" in DocBook), so as to clearly separate them from cross-references. This is because the author's intention in creating a link is *totally* different from the intention in creating a cross-reference. In addition, common links usually surround the text that should be highlighted (they contain something), while cross-references are just pointers (such as empty elements) to other sub-components, and the actual text of such cross-references is usually generated automatically according to the particular presentation one wants to give the document.

Of course, in the context of HTML documents, it is possible to create links to particular sub-components, such as figures. Still, that does not amount to creating a cross-reference to them; it is just a link. To clarify, please consider the following excerpts expressed using the RASH syntax:

1. <p>In this paragraph I would refer to <a href="#bibref-1" role="doc-biblioref"></a> for citing purposes.</p>

2. <p>In this paragraph I would create a <a href="#bibref-1">link to a particular object, i.e. a bibliographic entry, within this document</a>.</p>

The author's intention in 1) is to refer to a particular bibliographic entry in her document. How this reference is rendered doesn't matter at the content layer, and she doesn't have to care about it. What matters here is the presence of such a pointer rather than how it is rendered, which could be "[1]", "(Smith et al, 2015)", or anything else. In 2) the author wants to create a link containing particular words that she uses to capture the reader's interest. The fact that it refers to an internal and relevant object doesn't make it a cross-reference. In fact, it is not one at all.
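
To make the distinction concrete, here is a minimal Python sketch of how a tool might tell the two excerpts apart. It is only an illustration under my own assumptions: the names AnchorClassifier and classify_anchors are hypothetical (they are not part of RASH or [DPub]), and the rule "a role that starts with doc- and ends with ref marks a cross-reference" is a simplification chosen for this example.

```python
from html.parser import HTMLParser


class AnchorClassifier(HTMLParser):
    """Collect every <a> element with its href, a classification, and its text."""

    def __init__(self):
        super().__init__()
        self.results = []      # list of (href, classification, text)
        self._current = None   # [href, role, text] of the <a> being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = dict(attrs)
            self._current = [attrs.get("href"), attrs.get("role") or "", ""]

    def handle_data(self, data):
        # Accumulate the text found inside the currently open <a>.
        if self._current is not None:
            self._current[2] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            href, role, text = self._current
            # Assumption: doc-*ref roles (doc-biblioref, doc-noteref, ...) mark
            # cross-references; any other anchor is treated as a common link.
            is_xref = role.startswith("doc-") and role.endswith("ref")
            kind = "cross-reference" if is_xref else "link"
            self.results.append((href, kind, text.strip()))
            self._current = None


def classify_anchors(html):
    """Return (href, classification, text) for each anchor in the HTML string."""
    parser = AnchorClassifier()
    parser.feed(html)
    return parser.results
```

Applied to the two excerpts above, the first anchor would be reported as a cross-reference with no text of its own, and the second as a link carrying the author's words. Both point to the same target, so it is the role, not the href, that tells them apart.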

I understand that a proliferation of doc-*ref roles (one for each sub-component one would like to cross-reference) can be a problem. However, this problem could be bypassed by simply making available just one role for cross-referencing, let's say a generic "doc-xref", that allows one to point to any document sub-component, leaving the actual semantics of the cross-reference to be conveyed by the pointed sub-component itself. In this case, if I have

<a href="#bibref-1" role="doc-xref"></a>
...
<li id="bibref-1" role="doc-biblioentry">

it would be a cross-reference to a bibliographic entry (e.g., rendered as "[1]"), if I have

<a href="#note-1" role="doc-xref"></a>
...
<section id="note-1" role="doc-footnote">

it would be a cross-reference to a footnote (e.g., rendered as "^1"), if I have

<a href="#section-1" role="doc-xref"></a>
...
<section id="section-1">

it would be a cross-reference to a section (e.g., rendered as "Section 1" or with the title of the section), and so on.

This approach would also be very convenient, since it would allow one to refer to all the objects of a document (even figures, formulas, etc.) without adding a specific "ref" role for each kind of reference.
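
As a rough sketch of how a processor could resolve such a generic role, the Python below looks up the element each "doc-xref" anchor points to and derives the rendered text from the role of that target (or from its tag name when it has no role). Everything here is an assumption made for illustration: "doc-xref" is only the role proposed in this email, the function names are hypothetical, and the label formats ("[1]", "^1", "Section 1") are just the presentation choices mentioned above.

```python
from html.parser import HTMLParser


class TargetCollector(HTMLParser):
    """First pass: record the kind and ordinal of every element that has an id."""

    def __init__(self):
        super().__init__()
        self.targets = {}   # id -> (kind, position among targets of that kind)
        self._counts = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "id" in attrs:
            # The kind is the target's role if present, else its tag name.
            kind = attrs.get("role") or tag
            self._counts[kind] = self._counts.get(kind, 0) + 1
            self.targets[attrs["id"]] = (kind, self._counts[kind])


class XrefCollector(HTMLParser):
    """Second pass: collect the href of every <a role="doc-xref">."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("role") == "doc-xref":
            self.hrefs.append(attrs.get("href") or "")


def label_for(kind, n):
    """Hypothetical presentation rules; a real stylesheet would decide these."""
    if kind == "doc-biblioentry":
        return f"[{n}]"
    if kind == "doc-footnote":
        return f"^{n}"
    if kind == "section":
        return f"Section {n}"
    return f"{kind} {n}"


def resolve_xrefs(html):
    """Return the generated label for each doc-xref anchor, in document order."""
    targets = TargetCollector()
    targets.feed(html)
    xrefs = XrefCollector()
    xrefs.feed(html)
    labels = []
    for href in xrefs.hrefs:
        entry = targets.targets.get(href.lstrip("#"))
        labels.append(label_for(*entry) if entry else "??")  # "??" marks a dangling pointer
    return labels


SAMPLE = """
<p>See <a href="#bibref-1" role="doc-xref"></a>, <a href="#note-1" role="doc-xref"></a>
and <a href="#section-1" role="doc-xref"></a>.</p>
<section id="section-1"><h1>Introduction</h1></section>
<section id="note-1" role="doc-footnote"><p>A note.</p></section>
<ul><li id="bibref-1" role="doc-biblioentry">A bibliographic entry.</li></ul>
"""
```

Calling resolve_xrefs(SAMPLE) gives the labels "[1]", "^1" and "Section 1" for the three anchors, although all of them carry the same role. The semantics come entirely from the pointed sub-component, which is the point of the proposal.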

I apologise for the long email, and hope that my personal point of view on this topic will be helpful in the current discussion.

Have a nice day :-)

S.


----------------------------------------------------------------------------
Silvio Peroni, Ph.D.
Department of Computer Science and Engineering
University of Bologna, Bologna (Italy)
Tel: +39 051 2094871
E-mail: silvio.peroni@unibo.it
Web: http://www.essepuntato.it
Twitter: essepuntato
Received on Tuesday, 17 November 2015 11:13:33 UTC