Re: [Minutes] EPUB Virtual Locator TF, 2021-06-02

Administrative hat on exclusively here.

The Locator meetings take place every week at alternating times, 11AM ET and 8PM ET (Wednesdays). We did this hoping to accommodate as many time zones as possible.

I don’t think we can negotiate the 8PM timeslot as that is pretty fixed, but if there’s an earlier time on the alternating weeks that works for Europe, I’ll make the change. I can’t change the day of the week due to conflicts with other WG meetings. I’m sorry there seems to be an abundance of taskforces, but it seems to be the best way to tackle some issues the working group can’t focus on exclusively. We have switched FXL Accessibility to bi-weekly due to the cadence of work, and we’ll adjust other meetings to reduce the load when we can. Believe me, I know a lot is happening and we’re doing our best to balance it all.

-Wendy


From: Hadrien Gardeur <hadrien@demarque.com>
Date: Friday, June 4, 2021 at 5:40 AM
To: Dan Lazin <dlazin@google.com>
Cc: Laurent Le Meur <laurent.lemeur@edrlab.org>, Reid, Wendy <wendy.reid@rakuten.com>, W3C EPUB 3 Working Group <public-epub-wg@w3.org>
Subject: Re: [Minutes] EPUB Virtual Locator TF, 2021-06-02

Hello Dan,

Regarding the vocabulary, I completely agree that "pages" shouldn't be used outside the context of print pages, as defined in the page list.

In the Readium community, we've been discussing the right terminology for over three years now. Although we've been using "positions" for what this group refers to as a "virtual locator", there's regular pushback from external developers. I believe the main issue is that end users are not familiar with any term other than "page".

Regarding the counting algorithm, we've also had back and forth discussions over this for years within Readium. Since a lot of implementers in our community previously relied on RMSDK for EPUB rendering, we've mostly aligned with what Adobe has been using since 2007: 1024 bytes per position.
Even with such a straightforward approach, it still raises a number of questions when we implement it:

  *   Should you calculate this based on the compressed or uncompressed size of the resource?
  *   What about encrypted resources?
  *   From the context of a webview, how can you calculate the equivalent of this progression in bytes?
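To make the first question concrete, here is a minimal sketch that counts positions for each HTML resource in a zip container, once from the compressed size and once from the uncompressed size. The 1024 figure is the one quoted above; rounding up with a minimum of one position per resource is an assumption, as are the function names and the sample content. Encrypted resources are not handled.

```python
import io
import math
import zipfile

BYTES_PER_POSITION = 1024  # figure quoted in this thread


def position_counts(epub_bytes: bytes) -> dict[str, tuple[int, int]]:
    """Map each HTML resource to (positions by compressed size, positions by uncompressed size)."""
    counts = {}
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        for info in archive.infolist():
            if not info.filename.endswith((".xhtml", ".html")):
                continue
            # Assumption: round up, and give every resource at least one position.
            counts[info.filename] = (
                max(1, math.ceil(info.compress_size / BYTES_PER_POSITION)),
                max(1, math.ceil(info.file_size / BYTES_PER_POSITION)),
            )
    return counts


def build_sample() -> bytes:
    """Build a tiny in-memory zip standing in for an EPUB container (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("chapter1.xhtml", "<p>" + "lorem ipsum " * 1000 + "</p>")
    return buffer.getvalue()


if __name__ == "__main__":
    for name, (compressed, uncompressed) in position_counts(build_sample()).items():
        print(name, compressed, uncompressed)
```

The sample chapter is 12,007 bytes uncompressed, which gives 12 positions, while its highly repetitive text compresses to far fewer. Two reading systems that differ only on this choice would disagree about where "position 5" is.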

> Might I suggest that it sounds like you would enjoy a few task force meetings? :)

I would also be interested in participating in these calls but this has been very difficult to achieve:

  *   a number of these TF calls are in the middle of the night for people living in Europe
  *   there are too many different TF calls in a given week
I know that Laurent and other key members of the Readium community (Daniel for example) are in the same situation.

We've been trying to keep up with this group mostly through the meeting notes, but we'd love to figure out an easier way to interact.

Among other things, we've been worried to see so many mentions of CFI. I personally think that this ship has sailed and that we should instead align with the Web, for example with the work done on text fragments (other Web-based solutions, such as Hypothesis, have followed that approach).

While CFI/XPath and other tree-based approaches have the benefit of pointing very precisely in a document, they're also extremely fragile and can be expensive to compute in many scenarios.

We've favored instead:

  *   URLs (instead of an index in the spine)
  *   text (including surrounding text)
  *   and media-specific fragments
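As an illustration of the first two items, here is a minimal sketch of a locator that identifies a resource by URL and a passage by its text plus surrounding text, serialized with the text fragment directive syntax (`#:~:text=prefix-,start,-suffix`). The `TextLocator` class and its field names are hypothetical and are not taken from any Readium model; media-specific fragments are left out.

```python
from dataclasses import dataclass
from urllib.parse import quote


def _encode(term: str) -> str:
    # Percent-encode everything, then also encode "-", which quote() leaves
    # alone but which acts as a delimiter inside a text directive.
    return quote(term, safe="").replace("-", "%2D")


@dataclass
class TextLocator:
    href: str          # URL of the resource, rather than an index in the spine
    highlight: str     # the text being pointed at
    before: str = ""   # surrounding text that precedes the highlight
    after: str = ""    # surrounding text that follows the highlight

    def to_url(self) -> str:
        parts = []
        if self.before:
            parts.append(_encode(self.before) + "-")
        parts.append(_encode(self.highlight))
        if self.after:
            parts.append("-" + _encode(self.after))
        return f"{self.href}#:~:text={','.join(parts)}"


if __name__ == "__main__":
    locator = TextLocator("chapter1.xhtml", "virtual locator", before="called a", after="by")
    print(locator.to_url())
```

Because the locator carries text rather than a path through the DOM, it can still resolve after markup changes that leave the wording intact, which is the robustness argument made above.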

Best,
Hadrien


On Thu, Jun 3, 2021, at 11:51 PM, Dan Lazin <dlazin@google.com> wrote:
+Wendy for visibility

We haven't gotten that far yet, but my impression of the direction we're heading in is that (perhaps) reading systems would continue to use their existing counting algorithms for the time being, but we might suggest that the results be renamed — for example, "screens" instead of "pages." As an example, Apple Books already distinguishes between pages and screens; in a book that has a page-list, you can tap to switch between screen counts (which recalculates upon reflow) and page counts (which doesn't).

Might I suggest that it sounds like you would enjoy a few task force meetings? :)




On Jun 3, 2021, at 12:50 PM, Laurent Le Meur <laurent.lemeur@edrlab.org> wrote:

OK, then we (Readium developers) can help. The next question is: do we agree that reading systems which are well known on the market but will not change their algorithm (because they are legacy, or because it would be a breaking change for them ...) may not support this new standard, but ... well, that's life?

L


On Jun 3, 2021, at 5:37 PM, Dan Lazin <dlazin@google.com> wrote:

Hey, Laurent. We are indeed talking (talking) about standardizing the algorithm. The short version is "use page-list if present, and if not do something dead-simple like divide by 1000."

We're still pretty far from writing a spec, but we are talking about standardization here.
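The "short version" quoted above could be sketched as follows. This is only an illustration of the idea, not a proposed algorithm: the thread does not say what gets divided by 1000, so counting characters is an assumption, as are the function name and the rounding.

```python
import math

CHARS_PER_LOCATION = 1000  # "divide by 1000"; the unit (characters) is an assumption


def location_labels(page_list: list[str], text_length: int) -> list[str]:
    """Return page-list labels if the publication has any, else numbered fallback locations."""
    if page_list:
        # Publisher-supplied labels win and are used unchanged.
        return list(page_list)
    # Fallback: one location per 1000 characters, rounded up, at least one.
    count = max(1, math.ceil(text_length / CHARS_PER_LOCATION))
    return [str(n) for n in range(1, count + 1)]


if __name__ == "__main__":
    print(location_labels(["i", "ii", "1"], 50_000))
    print(location_labels([], 2_500))
```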



On Jun 3, 2021, at 9:55 AM, Laurent Le Meur <laurent.lemeur@edrlab.org> wrote:

Hi everybody,

Sorry for not having been able to participate in the call.

About use case line 2 ("A teacher wants to ask students to go to a certain location in an EPUB which contains no explicit page-list. The students are using different types of reading systems, nevertheless all reach the same page.")

We are currently working in this area in the Readium developers' community. I don't want to be pessimistic, but I believe this will not happen. If a page list is present, fine: the mechanism is documented, and it is not about virtual locators but about UX and the ability to jump to a location identified by an HTML fragment ID. But if no page list is present, each reading system has its own recipe for calculating "positions" (as we call them at Readium), aka virtual page numbers. Positions are calculated per resource first, then aggregated to form a sequence. For instance, the count for a resource may be the size of the compressed file (in the zip) divided by 1024, or the size of the (decompressed) HTML content divided by 2500. Either this group wants to standardize the algorithm, or the use case is IMHO void.
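A minimal sketch of that "per resource first, then aggregated" scheme shows why two recipes send the students to different places. The resource sizes are made up, the same sizes are fed to both divisors for simplicity, and rounding up with at least one position per resource is an assumption.

```python
import math
from bisect import bisect_right
from itertools import accumulate


def positions_per_resource(sizes: list[int], divisor: int) -> list[int]:
    """Number of positions in each resource (rounded up, at least one)."""
    return [max(1, math.ceil(size / divisor)) for size in sizes]


def locate(position: int, counts: list[int]) -> tuple[int, int]:
    """Map a 1-based global position to (resource index, 1-based position in that resource)."""
    starts = [0, *accumulate(counts)]  # starts[i] = positions before resource i
    if not 1 <= position <= starts[-1]:
        raise ValueError("position out of range")
    index = bisect_right(starts, position - 1) - 1
    return index, position - starts[index]


if __name__ == "__main__":
    sizes = [3000, 500, 10240]  # hypothetical resource sizes in bytes
    for divisor in (1024, 2500):
        counts = positions_per_resource(sizes, divisor)
        print(divisor, counts, locate(3, counts))
```

With these sizes, the 1024 divisor yields 3, 1 and 10 positions, so global position 3 is the last position of the first resource. The 2500 divisor yields 2, 1 and 5, so the same number lands at the start of the second resource.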

Best regards
Laurent



On Jun 3, 2021, at 3:23 PM, Ivan Herman <ivan@w3.org> wrote:

Minutes are here:

https://www.w3.org/publishing/groups/epub-wg/Meetings/Minutes/2021-06-02-epub-locators

Ivan

----
Ivan Herman, W3C
Home: http://www.w3.org/People/Ivan/
mobile: +33 6 52 46 00 43
ORCID ID: https://orcid.org/0000-0003-0782-2704

Received on Friday, 4 June 2021 16:17:56 UTC