RE: Prioritisation from Bill Kasdorf on 2015-08-05 (public-digipub@w3.org from August 2015)

From: Bill Kasdorf <bkasdorf@apexcovantage.com>
Date: Wed, 5 Aug 2015 14:00:23 +0000
To: Ivan Herman <ivan@w3.org>, Kaveh Bazargan <kaveh@rivervalleytechnologies.com>
CC: Leonard Rosenthol <lrosenth@adobe.com>, Johannes Wilm <johanneswilm@vivliostyle.com>, Dave Cramer <dauwhe@gmail.com>, "W3C Digital Publishing Discussion list" <public-digipub@w3.org>, Matthew Hardy <mahardy@adobe.com>
Message-ID: <CO2PR06MB57259DD4142596E869E4482DF750@CO2PR06MB572.namprd06.prod.outlook.com>

To be clear, since I've long been an advocate of page markers: I view them as necessary _at the present time_ but a blunt and arbitrary instrument.

I would even go so far (perhaps surprising, coming from me) as to call them brain-dead. The reason is that they have absolutely no relationship to the inherent structure and content of the document. They just mark where pages happened to break, in a particular rendering, which is based on a designer having decided on a certain complement of specifications (page size, margins, font size, leading, spacing, etc.) that resulted in a certain amount of content fitting on a given page, having started at an arbitrary point based on what happened to fit on the previous page. That's all they are. But we need the dumb things.

Nobody, certainly not me, would argue that this is the best way to embed reference points in a document. In fact the activity that I am nominally in charge of on the W3C DPUB IG (nominally because I completely depend on Ivan for the heavy lifting brain work and knowledge base) involves looking at how to identify fragments within a document, _without_ having to embed reference points at all. (Sidenote: this fragment identification actually requires two locations, either explicit or implied. Mostly implied, because if you can reference the actual structure of the document, then what you are really doing is pointing at the starting location of the fragment, the ending location of which is typically provided by markup, e.g. the end of a <section> or <p> or <span>. But you also have to be able to define fragments that aren't neatly aligned with the document structure--either because they're smaller than the markup delineates or because they overlap the structural components of the document. That's where the fun starts.)
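
As an illustration of the sidenote above, here is a minimal Python sketch of a fragment that has a start location and an optional end location. The names Location, Fragment, path, and offset are hypothetical, chosen for illustration only; they do not come from any W3C specification or DPUB IG draft.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Location:
    """A point in the document (hypothetical model, for illustration)."""
    path: Tuple[int, ...]  # child indexes from the root, e.g. (2, 0)
    offset: int = 0        # character offset inside that element's text


@dataclass(frozen=True)
class Fragment:
    """A fragment is a start plus an end that is explicit or implied."""
    start: Location
    end: Optional[Location] = None  # None: end implied by the element's close

    def is_structural(self) -> bool:
        # Aligned with the markup: starts at an element's beginning and
        # ends wherever that element (<section>, <p>, <span>) ends.
        return self.end is None and self.start.offset == 0

    def crosses_elements(self) -> bool:
        # Explicit end located in a different element than the start,
        # i.e. the fragment overlaps the document's structural components.
        return self.end is not None and self.end.path != self.start.path


# A whole paragraph: only the start is stated, the end is implied.
whole_paragraph = Fragment(Location((2, 0)))

# A fragment running from mid-paragraph into the next paragraph.
overlapping = Fragment(Location((2, 0), 15), Location((2, 1), 40))
```

The first case needs only one stated location; the second is the "fun" case, where both locations have to be explicit because no single element delimits the fragment.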

But I digress. The point is that OF COURSE it is better for the reference points in the document to be based, at least as a foundation, on the document structure, down to the paragraph level and any phrase-level markup that paragraphs might contain. Duh!

The reality of this particular time in history, however, is that there are still too many situations where the reference to the authoritative paginated version is relied upon (back-of-the-book indexes, scholarly citations, cross references, teachers saying "turn to page 53," etc.). So we are stuck with our brain-dead page markers. Would we like something better? You bet! We're working on that. But in the meantime we still have to have those damn page break markers. It's called the real world.

--Bill K

-----Original Message-----
From: Ivan Herman [mailto:ivan@w3.org] 
Sent: Wednesday, August 05, 2015 5:58 AM
To: Kaveh Bazargan
Cc: Leonard Rosenthol; Johannes Wilm; Bill Kasdorf; Dave Cramer; W3C Digital Publishing Discussion list; Matthew Hardy
Subject: Re: Prioritisation


> On 05 Aug 2015, at 11:45 , Kaveh Bazargan <kaveh@rivervalleytechnologies.com> wrote:
> 
> 
> 
> On 5 August 2015 at 10:36, Ivan Herman <ivan@w3.org> wrote:
> Leonard,
> 
> > On 04 Aug 2015, at 21:38 , Leonard Rosenthol <lrosenth@adobe.com> wrote:
> >
> > With the focus here on terminology, I think that we also need to be careful about what the definition of a “page” is in this context.
> >
> > In reading over the various messages here, I see (at least) three different definitions.
> >
> > 1 – The content that fits on the device’s screen/output without requiring any scrolling.
> > 2 – The content that maps to a semantic concept in the publication (eg. Index, chapter, article, etc.) and may require scrolling
> > 3 – The content that maps to the printed or fixed layout representation.
> >
> 
> I like this differentiation, and I would think that #2 is indeed very important but we may want to, eventually, completely dissociate it from the concept of paging.
> 
> My understanding is that publishers, these days, put some sort of a page mark into the digital output (in the form of an invisible element, of metadata, etc.). The purpose of this is to be able to *link* (either conceptually or through real hyperlinks) into the document. It is obviously important for various use cases that came up in this thread already (and others) like academic reference or classroom usage. But, just as you say, handling these may require scrolling, and that is because the concept of these anchors is, actually, orthogonal to display, i.e., pages in terms of #1 and #3.
> 
> I think there is an interesting discussion to have on where anchors should be put, what is the granularity of those, can (in future) some sort of a robust anchoring approach take over the need for these anchors, etc. It is largely a usability issue, but I think it is better if we separate it from the concept of pagination…
> 
> [...]
> 
> I think we would make progress so much faster if we could refer to chunks of information (e.g. paragraphs, as lawyers do now) rather than the 100s of years old physical page model, which is really too big a target anyway. But the publishing industry is not exactly forward looking, so I guess we'll be stuck for a few more decades with having physical pages as the "version of record". :-(

Let us be optimistic!:-)

Whether paragraphs are the right chunks, or sections, or something else: I really do not know. Actually… there may not be a universal answer: what is necessary for legal documents (and which would probably work well for, say, scholarly publishing, e.g., in humanities, where page references are ubiquitous) may be overkill for novels. (Think of the number of references you would have in "War and Peace" :-)

But yes, putting markers (and/or providing means to link) into chunks of information is the right abstraction.

Ivan


> 
> 
> --
> Kaveh Bazargan
> Director
> River Valley Technologies
> @kaveh1000
> +44 7771 824 111
> www.rivervalleytechnologies.com
> www.bazargan.org


----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/

mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Wednesday, 5 August 2015 14:00:54 UTC