FW: Announcing the ebook augmentation system

Hello Publishing groups,

I am sending this to the Publishing, business, community, and working groups;
sorry for the cross-posting.


-----Original Message-----
From: technical-developments@lyris.dundee.net
<technical-developments@lyris.dundee.net> On Behalf Of Willem van der Walt
Sent: Wednesday, May 5, 2021
To: Technical Developments Discussion List
Subject: Announcing the ebook augmentation system

Good day,
Please feel free to redistribute to individuals or organizations you think
might be interested.
My apologies if this is somewhat off topic for this list.
Kind regards, Willem

Announcing the ebook augmentation system

The Voice Computing Research Group at the Council for Scientific and
Industrial Research in South Africa has developed a system which automates
the addition of either human-narrated or synthesized speech to a standard
EPUB 3 publication.
The system automates the alignment of the speech to the text of the book at
paragraph, sentence and word level, allowing the eventual user to switch
among these levels of granularity on the fly.
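
As a rough illustration of what multi-level alignment can look like as data,
here is a minimal Python sketch. It is not the system's actual format: the
Span class, its fields, the spans_at helper and the timings are all
hypothetical, chosen only to show how one nested alignment can serve
paragraph, sentence and word granularity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """A piece of text aligned to an audio clip (times in seconds)."""
    text: str
    begin: float
    end: float
    children: List["Span"] = field(default_factory=list)


# Hypothetical level names mapped to nesting depth.
LEVELS = {"paragraph": 0, "sentence": 1, "word": 2}


def spans_at(paragraphs: List[Span], level: str) -> List[Span]:
    """Return the spans a reader would highlight at the chosen granularity."""
    current = paragraphs
    for _ in range(LEVELS[level]):
        # Descend one level; a span without children stands in for itself.
        current = [c for span in current for c in (span.children or [span])]
    return current


# Invented alignment for a one-sentence paragraph.
words = [Span("Good", 0.0, 0.4), Span("day", 0.4, 0.9)]
sentence = Span("Good day", 0.0, 0.9, words)
paragraph = Span("Good day", 0.0, 0.9, [sentence])

for level in LEVELS:
    print(level, [(s.text, s.begin, s.end) for s in spans_at([paragraph], level)])
```

Because the finer levels are nested inside the coarser ones, switching
granularity is only a matter of choosing how deep to descend; the audio
itself does not change.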

In short, the user of the system pushes in the standard EPUB 3 and,
optionally, the pre-existing human-narrated audio files to get out an EPUB 3
with the synchronized audio included.
When no human-narrated audio is available, the text of the book is
synthesized, and the synthesized audio is added instead.
The system also allows for a combination of human-narrated and synthesized
audio.
It is operated through a user-friendly web-based interface.

Multilingual support
As the main language of standard EPUB 3 files is often
incorrectly specified in practice, the user selects the main language before
the processing starts. When passages in one or more foreign languages exist
in the book, these must be marked up in the input EPUB 3 file with the
xml:lang attribute.
This is required to ensure proper alignment of the audio with the text and,
in the case where speech has to be synthesized, to enable the automatic
selection of a TTS voice in the correct language.
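
To make the markup requirement concrete, the following Python sketch walks
an XHTML content document and reports which language applies to each run of
text. It is only an illustration under stated assumptions: the SAMPLE
document and the language_runs helper are invented here, and the system's
own processing is not described in this announcement. The user-selected
main language is passed in as the default, and xml:lang on an element
overrides it for that element and its descendants.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample content; foreign passages carry xml:lang.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>She greeted him with <span xml:lang="af">Goeie more</span>.</p>
    <p xml:lang="fr">Bonjour tout le monde.</p>
  </body>
</html>"""


def language_runs(xhtml: str, main_language: str):
    """Return (language, text) pairs, inheriting xml:lang from ancestors."""

    def walk(element, inherited):
        lang = element.get(XML_LANG, inherited)
        if element.text and element.text.strip():
            yield lang, element.text.strip()
        for child in element:
            yield from walk(child, lang)
            # Tail text follows the child but belongs to this element.
            if child.tail and child.tail.strip():
                yield lang, child.tail.strip()

    return list(walk(ET.fromstring(xhtml), main_language))


for lang, text in language_runs(SAMPLE, "en"):
    print(lang, repr(text))
```

Text without its own xml:lang falls back to the main language, which is why
an unmarked foreign passage would be treated as main-language text.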

By default, the Qfrency Text-to-Speech (TTS) synthesizer (also developed by
the same research group) is used, with fallback to the open-source Espeak
synthesizer when a language not supported by Qfrency TTS is encountered.
With some customization, it is possible to use other TTS engines as well. As
a working example of this, the open-source RHVoice TTS engine was
implemented.
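
The default-with-fallback behaviour described above can be sketched in a few
lines of Python. This is a hypothetical illustration, not the system's code:
the engine labels, the language codes and the pick_engine function are
placeholders and say nothing about which languages any real synthesizer
supports.

```python
from typing import Dict, List, Set

# Placeholder data: engine labels and language codes are invented for this
# sketch and do not describe the coverage of any real synthesizer.
ENGINE_ORDER: List[str] = ["primary", "fallback"]
SUPPORTED: Dict[str, Set[str]] = {
    "primary": {"aa", "bb"},
    "fallback": {"aa", "bb", "cc", "dd"},
}


def pick_engine(
    language: str,
    order: List[str] = ENGINE_ORDER,
    supported: Dict[str, Set[str]] = SUPPORTED,
) -> str:
    """Return the first engine, in preference order, that supports the language."""
    for engine in order:
        if language in supported.get(engine, set()):
            return engine
    raise LookupError(f"no configured engine supports {language!r}")


for lang in ("aa", "cc"):
    print(lang, "->", pick_engine(lang))
```

Keeping the preference order as a plain list means an additional engine can
be slotted in by adding one entry to each table.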

Input and output formats
Once a book is augmented with audio, the following output formats are
available for download:
1. An EPUB 3 file with the audio added and synchronized to the text.
2. In the case of synthesized audio, a ZIP file containing a set of MP3
files with just the audio.
3. A ZIP file containing a Portable Embosser Format (PEF) file for Braille.
This is obtained by running the DAISY Pipeline2 product in the background.
4. A ZIP file, also produced by the Pipeline2 product, with a DAISY 2.02
version of the book.
At the time of writing, the latter still has some issues which will
hopefully be resolved soon.

As a convenience to the user, a simple web-based interface is provided for
conversion of other input formats into EPUB 3.
Currently, DOCX and PDF can be converted.

Reading the resulting EPUB 3 books
The books can be read using any EPUB 3 reader that supports media overlays.
With the exception of Readium, however, most of the available readers do not
support on-the-fly changing of granularity. We have our own EPUB 3 reader,
which is in beta.
It supports the multi-level granularity (paragraph, sentence and word) and
has some additional features like search, repeat etc. It currently runs on
Android and Windows, with iOS in the pipeline.

Invitation to pilot the system
We want to extend an invitation to interested providers of accessible
reading material and publishers worldwide to contact us to participate in
piloting the system.

Each participating organization will receive one or more accounts on the
system through which it will be able to upload its books for augmentation
securely. The books uploaded and processed by each account are only visible
to that account holder.
During the pilot phase, the system will run on our servers with the
aforementioned TTS engines as options. Other implementation options will be
available when the system goes into production. Our EPUB 3 reader will also
be provided to the participants.

To participate in the pilot, please email a request to: Ilana Wilken
<iwilken@csir.co.za> with the subject:
"Ebook augmentation system: international pilot". In the body of the email,
please provide the names and email addresses of the individuals in the
organization who will require accounts. Optionally, indicate the language(s)
which you would like to process with the system. For technical enquiries
about the system itself, please send an email to:
Willem van der Walt <wvdwalt@csir.co.za>.

We would like to achieve the following objectives through the pilot:
1. More real-world books through the system, with feedback from real-world
users on both the system and on the resulting output books.
2. Suggestions on the preferred business model, e.g. a once-off license or
annual subscription license, maintenance and support.
3. Whether you would prefer to run the system externally over the internet
through a web interface (like in the pilot) or internally on your own
servers.
4. Feedback on the usability of our EPUB 3 reader.
5. Any other suggestions or comments that you think are relevant.

EPUB 3 with media overlays has a lot of potential, in particular in the
education setting. Producing such books, however, is a complex process. We
believe that our system reduces this complexity to a level where many more
organizations will find it feasible to produce such books. Therefore, we
hope for a positive response from the community.

Received on Wednesday, 5 May 2021 14:41:37 UTC