From: Bill Kasdorf <bkasdorf@apexcovantage.com>
Date: Wed, 5 Aug 2015 15:21:05 +0000
All excellent points, Nick—thanks!

I particularly like the "scrolling" vs. "flipping" distinction. I will steal that!


Sorry for the late comments; I struggled to find time to sit down and read through everyone's great discussion.  Since there are a lot of gems here, I mainly want to summarize, but also offer some of my own commentary.

Some of the ways the "page" concept has been approached treat it as a simulacrum of the physical book - but I think the abstraction goes a whole step farther.  Keep in mind that in the trade world, you can have a paperback and a hardcover of the SAME content with different "page" breakdowns.  Long before CSS, publishers were in effect making use of media queries to structure content.  The page is the way in which content is optimally laid out for the medium on which it is being displayed.

We could even call the "page" the Display-style of structured content for a given medium.  From this perspective, we account for some content being "scrolling" and some being "flipped" (what some call paginated - but I'm trying to avoid using the overloaded word).  Not only are device SIZES taken into account, but also device capabilities.  Scrolling on a non-touch non-mouse device might be unreasonable.  And - in the future when there are new input devices, it would be useful to have a spec that took that into account.

Content Placement Markers
There is a clear need - at the moment - for markers denoting physical pages, for the sake of accessibility - but as noted in the wonderful comments here, what we really need is just a way to point to a specific point in the text.  As long as we have a good location identifier of sorts, we can solve this problem, and it becomes a matter of authors/authoring tools adding these links.
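
To make the "location identifier" idea concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the Location type, the PAGE_MARKERS table, and the text_at helper are hypothetical names, and a real identifier scheme would need to survive edits to the text. The point is only that a print page number can be reduced to a link to a point in the content.

```python
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical location identifier: a paragraph plus a character offset."""
    paragraph: int  # zero-based index into the list of paragraphs
    offset: int     # zero-based character offset within that paragraph


PARAGRAPHS = [
    "First paragraph of the chapter.",
    "Second paragraph, where print page 53 begins midway.",
]

# Print page labels become ordinary links to locations in the text.
# The label "53" and its target are made up for this example.
PAGE_MARKERS = {"53": Location(paragraph=1, offset=18)}


def text_at(paragraphs, location):
    """Return the text of the paragraph starting at the given location."""
    return paragraphs[location.paragraph][location.offset:]


print(text_at(PARAGRAPHS, PAGE_MARKERS["53"]))
```

With this shape, "turn to page 53" and "go to this point in the text" are the same operation, and the page label is just one of many possible names for a location.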

Fixed Layout
Given the above definition of the Page/Pagination - fixed layout becomes just another render option.  Take a cookbook.  A publisher may decide to have a "recipe" template that shows a picture on the left, ingredients on the right, and a description below - all absolutely positioned on the canvas.  They could define this layout for devices that are 1024x768 pixels or more.  But they could also define a reflowable layout for small devices: show an image, ingredients below, then instructions.  To have an entirely different package or content form for fixed layout is completely absurd (and I should know, as I've created an authoring tool solely FOR fixed-layout content).
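
As a rough sketch of the cookbook example, the choice between the two layouts can be written as a lookup over the same content. The Layout type, the layout names, and choose_layout are hypothetical, invented for this illustration; only the 1024x768 threshold comes from the example above. In practice this selection would more likely be expressed with CSS media queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    min_width: int   # smallest viewport width (px) the layout is meant for
    min_height: int  # smallest viewport height (px) the layout is meant for
    fixed: bool      # True for absolutely positioned, False for reflowable


# Ordered from most to least demanding; the last entry is the fallback.
LAYOUTS = [
    Layout("recipe-fixed", 1024, 768, fixed=True),
    Layout("recipe-reflow", 0, 0, fixed=False),
]


def choose_layout(width, height, layouts=LAYOUTS):
    """Return the first layout whose minimum size fits the viewport."""
    for layout in layouts:
        if width >= layout.min_width and height >= layout.min_height:
            return layout
    raise ValueError("no layout fits this viewport")


print(choose_layout(1280, 800).name)
print(choose_layout(375, 667).name)
```

The content package is the same in both cases; only the render option differs.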

Thinking abstractly about structured content
The ideal - and I do believe many publishers think this way - is to think about things in terms of content.  The display of the content is part of productization, but it is different for different mediums (hardcover, paperback, ebook, webook, etc.).  I'm not sure publishers are masters-of-the-web and know exactly what is best for them, and that is where we have the opportunity - given our knowledge of the web and CSS - to provide a guideline for how content should be structured to maximize layout on multiple devices while not losing accessibility.  I believe this CAN be achieved.

I hope this makes sense - I was up late last night....


Good point! Thanks for the reminder; that's an aspect I often forget to mention.

One related comment: the motivation behind those author-driven page breaks is usually relationship/association/containment-driven: "keep this with that," "the reader/user/student needs to see this, that, and this other thing at the same time," etc.

JATS/BITS, the dominant XML model in the scholarly publishing world, has a pair of milestone elements for precisely that purpose: <milestone-start/> and <milestone-end/>. Each has a @rationale attribute for saying what it's for. They are empty marker elements that get around the well-formedness barrier often encountered in this context, where the starting and ending points are at arbitrary locations that don't necessarily align with the logical structure of the document. Very useful.
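
A minimal sketch of how such markers can be consumed, using Python's standard xml.etree.ElementTree. The sample markup is made up and is not a complete JATS document; pairing the end marker to the start marker through id/rid, and the "keep-together" rationale value, are assumptions for illustration. The function collects the text between the two empty markers even though the span crosses a paragraph boundary.

```python
import xml.etree.ElementTree as ET

# Made-up fragment: the marked span starts inside one <p> and ends inside the next.
SAMPLE = """<body>
<p>Intro text. <milestone-start id="m1" rationale="keep-together"/>Figure 1 shows the setup.</p>
<p>The table below lists the results.<milestone-end rid="m1"/> Unrelated closing text.</p>
</body>"""


def text_between_milestones(root, milestone_id):
    """Collect text, in document order, between a milestone-start and its milestone-end."""
    collected = []
    inside = False

    def walk(elem):
        nonlocal inside
        if elem.tag == "milestone-start" and elem.get("id") == milestone_id:
            inside = True
        elif elem.tag == "milestone-end" and elem.get("rid") == milestone_id:
            inside = False
        if inside and elem.text:
            collected.append(elem.text)
        for child in elem:
            walk(child)
            # Tail text follows the child's end tag, so check it after the child.
            if inside and child.tail:
                collected.append(child.tail)

    walk(root)
    # Normalize whitespace left over from line breaks between elements.
    return " ".join(" ".join(collected).split())


span = text_between_milestones(ET.fromstring(SAMPLE), "m1")
print(span)
```

Because the markers are empty elements, the document stays well-formed even though the marked span overlaps two paragraphs.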

Thanks again for mentioning the author-driven breaks!

--Bill K

Bill, as at various times in this thread, I completely agree with you concerning references into content.

However, I think there is a part of this that is being overlooked and is quite important: author-forced page breaks, due (usually but not always) to changes in some semantic element.  In a book, this would be chapter breaks (so a new chapter starts at the top of a new page), but in a magazine it could be an article, or before tables in a scientific paper or …   Fortunately, there is work being done in this area in CSS.  Unfortunately, there is zero support for it amongst the various UAs.


>To be clear, since I've long been an advocate of page markers: I view them as necessary _at the present time_ but a blunt and arbitrary instrument.
>I would even go so far (perhaps surprising, coming from me) as to call them brain-dead. The reason is that they have absolutely no relationship to the inherent structure and content of the document. They just mark where pages happened to break, in a particular rendering, which is based on a designer having decided on a certain complement of specifications (page size, margins, font size, leading, spacing, etc.) that resulted in a certain amount of content fitting on a given page, having started at an arbitrary point based on what happened to fit on the previous page. That's all they are. But we need the dumb things.
>Nobody, certainly not me, would argue that this is the best way to embed reference points in a document. In fact the activity that I am nominally in charge of on the W3C DPUB IG (nominally because I completely depend on Ivan for the heavy lifting brain work and knowledge base) involves looking at how to identify fragments within a document, _without_ having to embed reference points at all. (Sidenote: this fragment identification actually requires two locations, either explicit or implied. Mostly implied, because if you can reference the actual structure of the document, then what you are really doing is pointing at the starting location of the fragment, the ending location of which is typically provided by markup, e.g. the end of a <section> or <p> or <span>. But you also have to be able to define fragments that aren't neatly aligned with the document structure--either because they're smaller than the markup delineates or because they overlap the structural components of the document. That's where the fun starts.)
>But I digress. The point is that OF COURSE it is better for the reference points in the document to be based, at least as a foundation, on the document structure, down to the paragraph level and any phrase-level markup that paragraphs might contain. Duh!
>The reality of this particular time in history, however, is that there are still too many situations where the reference to the authoritative paginated version is still relied upon (back-of-the-book indexes, scholarly citations, cross references, teachers saying "turn to page 53," etc.). So we are stuck with our brain-dead page markers. Would we like something better? You bet! We're working on that. But in the meantime we still have to have those damn page break markers. It's called the real world.
>--Bill K
>> I like this differentiation, and I would think that #2 is indeed very important but we may want to, eventually, completely dissociate it from the concept of paging.
>> My understanding is that publishers, these days, put some sort of page mark into the digital output (in the form of an invisible element, of metadata, etc.). The purpose of this is to be able to *link* (either conceptually or through real hyperlinks) into the document. It is obviously important for various use cases that came up in this thread already (and others) like academic reference or classroom usage. But, just as you say, handling these may require scrolling, and that is because the concept of these anchors is, actually, orthogonal to display, ie, pages in terms of #1 and #3.
>> I think there is an interesting discussion to have on where anchors should be put, what their granularity should be, whether (in future) some sort of robust anchoring approach can take over the need for these anchors, etc. It is largely a usability issue, but I think it is better if we separate it from the concept of pagination…
>> [...]
>> I think we would make progress so much faster if we could refer to chunks of information (e.g. paragraphs, as lawyers do now) rather than the 100s of years old physical page model, which is really too big a target anyway. But the publishing industry is not exactly forward looking, so I guess we'll be stuck for a few more decades with having physical pages as the "version of record". :-(
>Let us be optimistic!:-)
>Whether paragraphs are the right chunks, or sections, or something else: I really do not know. Actually… there may not be a universal answer: what is necessary for legal documents (and which would probably work well for, say, scholarly publishing, eg, in humanities, where page references are ubiquitous) may be overkill for novels. (Think of the number of references you would have in "War and Peace" :-)
>But yes, putting markers (and/or providing means to link) into chunks of information is the right abstraction.
- Nick Ruffilo


